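I want to create a histogram with a vertical line marking the mean, and a label attached to that line giving the exact value of the mean.
I can easily create a basic histogram with the vertical line.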
# needed library
library(ggplot2)

# mean to be used later
x_mean <- mean(x = iris$Sepal.Length, na.rm = TRUE)

# creating basic plot with line for mean
(
  plot <- ggplot(data = iris,
                 mapping = aes(x = Sepal.Length)) +
    stat_bin(
      col = "black",
      alpha = 0.7,
      na.rm = TRUE,
      mapping = aes(y = ..count..,
                    fill = ..count..)
    ) +
    geom_vline(
      xintercept = x_mean,
      linetype = "dashed",
      color = "red",
      na.rm = TRUE
    ) +
    scale_fill_gradient(name = "count",
                        low = "white",
                        high = "white") +
    # "none" replaces the deprecated guides(fill = FALSE)
    guides(fill = "none")
)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Now I can add a label to this line with the following code:
# adding label to the line
plot +
  geom_label(mapping = aes(
    label = list(bquote("mean" == ~ .(
      format(round(x_mean, 2), nsmall = 2)
    ))),
    x = x_mean,
    y = 5 # how to automate this value choice?
  ),
  parse = TRUE)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

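The problem is that I am hard-coding the y value for geom_label (y = 5). This is not ideal, because if I change the data, the variable, or the binwidth, y = 5 will no longer be the (approximate) middle of the y-axis. I tried setting y = max(..count..)/2, but that results in the following error:
Error in FUN(X[[i]], ...) : object 'count' not found
To summarize: how can I automatically choose the y value for geom_label in this case, so that the label always sits in the middle of the y-axis regardless of the range of the counts?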
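You can get the current y-axis range of plot by replacing the hard-coded y = 5 in your code with y = mean(layer_scales(plot)$y$range$range).
That way, if the parameters change, the resulting change in the scale is taken into account.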
# layer_scales(plot) gives the scale information for plot
> layer_scales(plot)$y
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
<ScaleContinuousPosition>
Range: 0 -- 12
Limits: 0 -- 12
# this is the actual vector for y-axis scale range
> layer_scales(plot)$y$range$range
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
[1] 0 12
# this is the y-axis midpoint value
> mean(layer_scales(plot)$y$range$range)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
[1] 6
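Putting the pieces together, the full approach could be sketched roughly as follows. This is only an illustration under a few assumptions: it uses the iris data and 30 bins as in the question, the names y_mid and labelled are made up for the example, and annotate() with a sprintf() label string is swapped in for the question's geom_label(aes(...)) and bquote() call (annotate() draws a single label rather than one per data row).

```r
library(ggplot2)

# mean of the plotted variable, as in the question
x_mean <- mean(iris$Sepal.Length, na.rm = TRUE)

# base histogram with a dashed line at the mean
plot <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(bins = 30, colour = "black", fill = "white", alpha = 0.7) +
  geom_vline(xintercept = x_mean, linetype = "dashed", colour = "red")

# midpoint of the trained y scale (illustrative name, not from the answer)
y_mid <- mean(layer_scales(plot)$y$range$range)

# a single label at the midpoint; the quoted number keeps both decimals
# when the string is parsed as a plotmath expression
labelled <- plot +
  annotate(
    "label",
    x = x_mean,
    y = y_mid,
    label = sprintf("mean == '%.2f'", x_mean),
    parse = TRUE
  )

print(labelled)
```

Because y_mid is computed from the plot object itself, it has to be recomputed whenever the data, the variable, or the binning of plot changes.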