Plotting the median on a density distribution

Question

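I am trying to use the ggplot2 R library to plot the median of some data on a density distribution. I would like to print the median value as text on top of the density plot.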

Here is an example of what I mean (using the built-in `diamonds` data frame):

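I am drawing three things: the density plot itself, a vertical line showing the median price for each cut, and a text label with that value. However, as you can see, the median price labels overlap along the y axis (the y aesthetic is required by geom_text()).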

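Is there a way to dynamically assign a y value to each median price so that the labels are printed at different heights? For example, at the maximum density value of each cut.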

This is what I have so far:

# input data frame (diamonds ships with ggplot2)
library(ggplot2)
dia <- diamonds

# calculate the median of each numeric variable, per cut:
library(plyr)
dia_me <- ddply(dia, .(cut), numcolwise(median))

ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut), legend=TRUE) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  geom_text(data = dia_me, aes(label = price, y=1, x=price))


(I assigned a constant value to the y aesthetic in geom_text() because it is mandatory.)

Answer 1

Her*_*oka 6

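This may only be a start (and because of the colors it is not very readable). My idea is to create a y position in the data used to draw the median lines. It is somewhat arbitrary, but I wanted the y positions to lie between 0.2 and 1 so that they fit nicely in the plot. I did this with the seq command. I then tried to order the positions by median price (which did not work well). The choice is arbitrary.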

# scatter y positions over the plot
dia_me$y_pos <- seq(0.2, 1, length.out = nrow(dia_me))[order(dia_me$price, decreasing = TRUE)]


ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut), legend=TRUE) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  geom_text(data = dia_me, aes(label = price, y=y_pos, x=price))

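One possible reason the ordering step did not work well is that order() returns the permutation that sorts the medians, not each row's position in the sorted sequence. The sketch below is not part of the original answer. It uses rank() instead, and it assumes that the label heights should increase with the median price. The `heights` variable is a hypothetical helper, and the medians are computed with base R aggregate() rather than plyr.

```r
library(ggplot2)  # only needed here for the diamonds data set

# Median price per cut, using base R
dia_me <- aggregate(price ~ cut, data = diamonds, FUN = median)

# Candidate label heights, evenly spread between 0.2 and 1
heights <- seq(0.2, 1, length.out = nrow(dia_me))

# rank() gives each row's position in the sorted medians, so the
# smallest median gets the lowest height and the largest the highest
dia_me$y_pos <- heights[rank(dia_me$price, ties.method = "first")]
```

The resulting `dia_me` can be passed to geom_text() exactly as in the plot above.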

You can also use the maximum of the density, with the following code: `dia_me$y_pos <- aggregate(log10(price) ~ cut, dia, function(x) max(density(x)$y))[, 2]` (2 upvotes)
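The suggestion in this comment can be turned into a complete example. The following is a minimal sketch, not the commenter's exact code. It merges the peak densities by `cut` rather than relying on row order, leaves out the explicit y mapping and the line size so that it runs on both older and newer ggplot2 versions, and adds `vjust` and `show.legend` purely for looks. The `peak` data frame is a hypothetical helper. Because geom_density() evaluates the density on its own grid, the heights computed with density() may differ slightly from the peaks of the drawn curves.

```r
library(ggplot2)

dia <- diamonds

# Median price per cut
dia_me <- aggregate(price ~ cut, data = dia, FUN = median)

# Peak of the kernel density of log10(price) per cut; the log10 matches
# the transformation that scale_x_log10() applies before the density
# is estimated
peak <- aggregate(log10(price) ~ cut, data = dia,
                  FUN = function(x) max(density(x)$y))
names(peak)[2] <- "y_pos"

# Join on cut so each median gets the peak of its own group
dia_me <- merge(dia_me, peak, by = "cut")

p <- ggplot(dia, aes(x = price, color = cut, fill = cut)) +
  labs(title = "diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data = dia_me, aes(xintercept = price, colour = cut),
             linetype = "dashed") +
  scale_x_log10() +
  geom_text(data = dia_me, aes(label = price, y = y_pos, x = price),
            vjust = -0.5, show.legend = FALSE)

print(p)
```

Each label then sits roughly at the height of its own curve's peak, at the x position of the median, so labels overlap only where two cuts have both similar medians and similar peak densities.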

Asked:	9 years, 12 months ago
Viewed:	6286 times
Last active:	6 years, 3 months ago