如何在ggplot直方图中添加均值和模式?

Bor*_*042 8 r mode histogram mean ggplot2

我需要添加一个平均线和模式的值,例如这种情节:

我用这个来计算垃圾箱的数量:

bw <- diff(range(cars$lenght)) / (2 * IQR(cars$lenght) / length(cars$lenght)^(1/3))
Run Code Online (Sandbox Code Playgroud)

情节:

ggplot(data=cars, aes(cars$lenght)) + 
  geom_histogram(aes(y =..density..), 
                 col="red",
                 binwidth = bw,
                 fill="green", 
                 alpha=1) + 
  geom_density(col=4) + 
  labs(title='Lenght Plot', x='Lenght', y='Times')

cars$lenght
Run Code Online (Sandbox Code Playgroud)

168.8 168.8 171.2 176.6 176.6 177.3 192.7 192.7 192.7 178.2 176.8 176.8 176.8 176.8 189.0 189.0 193.8 197.0 141.1 155.9 158.8 157.3 157.3 157.3 157.3 157.3 157.3 157.3 174.6 173.2

提前致谢.

dul*_*aux 14

我不确定如何复制您的数据,所以我用cars$speed它来代替它.

geom_vline将垂直线放在您想要的位置,您可以动态计算原始数据的平均值和模式.但是,如果您希望模式为具有最高频率的直方图bin,则可以从ggplot对象中提取该模式.

我不太确定你想要如何定义模式,所以我绘制了一堆不同的方法.

# function to calculate mode
fun.mode<-function(x){as.numeric(names(sort(-table(x)))[1])}

bw <- diff(range(cars$length)) / (2 * IQR(cars$speed) / length(cars$speed)^(1/3))
p<-ggplot(data=cars, aes(cars$speed)) + 
  geom_histogram(aes(y =..density..), 
                 col="red",
                 binwidth = bw,
                 fill="green", 
                 alpha=1) + 
  geom_density(col=4) + 
  labs(title='Lenght Plot', x='Lenght', y='Times')

# Extract data for the histogram and density peaks
data<-ggplot_build(p)$data
hist_peak<-data[[1]]%>%filter(y==max(y))%>%.$x
dens_peak<-data[[2]]%>%filter(y==max(y))%>%.$x

# plot mean, mode, histogram peak and density peak
p%+%
  geom_vline(aes(xintercept = mean(speed)),col='red',size=2)+
  geom_vline(aes(xintercept = fun.mode(speed)),col='blue',size=2)+
  geom_vline(aes(xintercept = hist_peak),col='orange',size=2)+
  geom_vline(aes(xintercept = dens_peak),col='purple',size=2)+
  geom_text(aes(label=round(hist_peak,1),y=0,x=hist_peak),
            vjust=-1,col='orange',size=5)
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述