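I am trying to add sample sizes to boxplots that are grouped by two levels (preferably at the top or bottom of each boxplot). I used facet_grid() to produce the panel plot. I then tried annotate() to add the sample sizes, but that does not work because it repeats the values in the second panel. Is there a simple way to do this?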

head(FeatherData, n=10)
##    Location   Status   FeatherD               Species        ID
## 1        TX Resident  -27.41495         Carolina wren CARW (32)
## 2        TX Resident  -29.17626         Carolina wren CARW (32)
## 3        TX Resident  -31.08070         Carolina wren CARW (32)
## 4        TX  Migrant -169.19579 Yellow-rumped warbler YRWA (28)
## 5        TX  Migrant -170.42079 Yellow-rumped warbler YRWA (28)
## 6        TX  Migrant -158.66925 Yellow-rumped warbler YRWA (28)
## 7        TX  Migrant -165.55278 Yellow-rumped warbler YRWA (28)
## 8        TX  Migrant -170.43374 Yellow-rumped warbler YRWA (28)
## 9        TX  Migrant -170.21801 Yellow-rumped warbler YRWA (28)
## 10       TX  Migrant -184.45871 Yellow-rumped warbler YRWA (28)

ggplot(FeatherData, aes(x = Location, y = FeatherD)) +
  geom_boxplot(alpha = 0.7, fill = '#A4A4A4') +
  scale_y_continuous() +
  scale_x_discrete(name = "Location") +
  theme_bw() +
  theme(plot.title = element_text(size = 20, family = "Times", face = "bold"),
        text = element_text(size = 20, family = "Times"),
        axis.title = element_text(face = "bold"),
        axis.text.x = element_text(size = 15)) +
  ylab(expression(Feather~delta^2~H["f"]~"‰")) +
  facet_grid(. ~ Status)

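There are several ways to do this kind of task. The most flexible is to compute the statistic outside the plotting call as a separate data frame and use it as its own layer: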
library(dplyr)
library(ggplot2)
cw_summary <- ChickWeight %>%
group_by(Diet) %>%
tally()
cw_summary
# A tibble: 4 x 2
    Diet     n
  <fctr> <int>
1      1   220
2      2   120
3      3   120
4      4   118
ggplot(ChickWeight, aes(Diet, weight)) +
geom_boxplot() +
facet_grid(~Diet) +
geom_text(data = cw_summary,
aes(Diet, Inf, label = n), vjust = 1)
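For the two-level case in the question, the same idea carries over by grouping the summary on both variables. The following is only a sketch: `feather_demo` is a made-up stand-in for FeatherData (the "LA" location, the group sizes, and the values are invented for illustration), and `n_summary` is a hypothetical name. Because the summary data frame contains the faceting column Status, each count is drawn only in its own panel instead of being repeated across panels.

```r
library(dplyr)
library(ggplot2)

# Synthetic stand-in for FeatherData; all values here are invented
set.seed(1)
feather_demo <- data.frame(
  Location = rep(c("TX", "LA"), times = c(12, 9)),
  Status   = c(rep(c("Resident", "Migrant"), times = c(5, 7)),
               rep(c("Resident", "Migrant"), times = c(4, 5))),
  FeatherD = rnorm(21, mean = -100, sd = 40)
)

# One row per Location x Status combination, with its sample size in n
n_summary <- feather_demo %>%
  group_by(Location, Status) %>%
  tally()

p <- ggplot(feather_demo, aes(Location, FeatherD)) +
  geom_boxplot() +
  facet_grid(. ~ Status) +
  # y = Inf pins the label to the top of the panel; vjust nudges it inside
  geom_text(data = n_summary,
            aes(Location, Inf, label = paste0("n = ", n)), vjust = 1.5)
p
```

Using `-Inf` with a negative `vjust` should place the labels at the bottom of each panel instead.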
Another approach is to use the built-in summary functions, but that can be fiddly. Here is an example:
ggplot(ChickWeight, aes(Diet, weight)) +
geom_boxplot() +
stat_summary(fun.y = median, fun.ymax = length,
geom = "text", aes(label = ..ymax..), vjust = -1) +
facet_grid(~Diet)
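Here I use fun.y to position the summary at the median of the y values, and fun.ymax to compute an internal variable called ..ymax.. using the length function (which simply counts the number of observations).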
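As far as I know, newer ggplot2 releases (3.3.0 and later) deprecate the `fun.y` and `fun.ymax` arguments in favour of `fun` and `fun.max`, and the `..ymax..` notation in favour of `after_stat(ymax)`. Assuming such a version is installed, the same plot could be sketched as:

```r
library(ggplot2)

p <- ggplot(ChickWeight, aes(Diet, weight)) +
  geom_boxplot() +
  # fun places the label at the median; fun.max stores the count in ymax
  stat_summary(fun = median, fun.max = length,
               geom = "text", aes(label = after_stat(ymax)), vjust = -1) +
  facet_grid(~Diet)
p
```

The older spelling shown above should still run on these versions, but with deprecation warnings.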