Setting the y value of `geom_label` in ggplot2 based on the `..count..` variable

Ind*_*til 2 label r histogram ggplot2 tidyverse

I want to create a histogram with a vertical line marking the mean, and a label attached to that line showing the exact value of the mean.

I can easily create a basic histogram with the vertical line:

# needed library
library(ggplot2)

# mean to be used later
x_mean <- mean(x = iris$Sepal.Length, na.rm = TRUE)

# creating basic plot with line for mean
(
  plot <- ggplot(data = iris,
                 mapping = aes(x = Sepal.Length)) +
    stat_bin(
      col = "black",
      alpha = 0.7,
      na.rm = TRUE,
      mapping = aes(y = ..count..,
                    fill = ..count..)
    )  +
    geom_vline(
      xintercept = x_mean,
      linetype = "dashed",
      color = "red",
      na.rm = TRUE
    ) +
    scale_fill_gradient(name = "count",
                        low = "white",
                        high = "white") +
    guides(fill = "none")  # `fill = FALSE` is deprecated in recent ggplot2 releases
)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Now I can add a label to this line with the following code:

# adding label to the line
plot +
  geom_label(mapping = aes(
    label = list(bquote("mean" == ~ .(
      format(round(x_mean, 2), nsmall = 2)
    ))),
    x = x_mean,
    y = 5  # how to automate this value choice?
  ),
  parse = TRUE)
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

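The problem is that I am hard-coding the y value in `geom_label` (`y = 5`). This is not ideal, because if I change the data, the variable, or the binwidth, `y = 5` will no longer be the (approximate) middle of the y-axis. I tried setting `y = max(..count..)/2`, but this results in the following error: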

Error in FUN(X[[i]], ...) : object 'count' not found

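To sum up: how can I choose the y value for `geom_label` automatically in this case, so that the label always sits in the middle of the y-axis regardless of the range of counts?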

Z.L*_*Lin 5

You can get the current y-axis range of `plot` by replacing the hard-coded `y = 5` in your code with `y = mean(layer_scales(plot)$y$range$range)`.

This way, any change in the scale is taken into account when the parameters change.

# layer_scales(plot) gives the scale information for plot
> layer_scales(plot)$y
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
<ScaleContinuousPosition>
 Range:     0 --   12
 Limits:    0 --   12

# this is the actual vector for y-axis scale range
> layer_scales(plot)$y$range$range
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
[1]  0 12

# this is the y-axis midpoint value
> mean(layer_scales(plot)$y$range$range)
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
[1] 6
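Putting the pieces together, here is a minimal sketch of how the midpoint could be plugged into the label. It makes a few assumptions that are not in the code above: the histogram is simplified to `geom_histogram()` with a plain white fill instead of the `stat_bin()` plus gradient setup, `y_mid` is just an illustrative variable name, and the label is added with `annotate()` rather than `geom_label()` so that it is drawn once instead of once per data row.

```r
library(ggplot2)

# mean to mark on the histogram
x_mean <- mean(iris$Sepal.Length, na.rm = TRUE)

# simplified version of the base plot (assumption: plain white bars)
plot <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(bins = 30, col = "black", fill = "white", alpha = 0.7) +
  geom_vline(xintercept = x_mean, linetype = "dashed", color = "red")

# midpoint of the trained y scale; recomputed whenever the data or bins change
y_mid <- mean(layer_scales(plot)$y$range$range)

# the value is quoted inside the plotmath expression so trailing zeros survive parsing
plot_labelled <- plot +
  annotate(
    "label",
    x = x_mean,
    y = y_mid,
    label = paste0("mean == '", format(round(x_mean, 2), nsmall = 2), "'"),
    parse = TRUE
  )

plot_labelled
```

Because `y_mid` is computed from the built plot, it has to be recalculated after any change to the data, the variable, or the binning; wrapping these steps in a small helper function would be one way to keep that automatic.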