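How can I ignore outliers in a ggplot2 boxplot? I don't simply want them to disappear (i.e. outlier.size = 0); I want them to be ignored so that the y axis is scaled to show the 1st and 3rd quartiles. My outliers make the "box" shrink so much that it is practically a line. Are there any techniques for dealing with this?
Edit: here is an example: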
y = c(.01, .02, .03, .04, .05, .06, .07, .08, .09, .5, -.6)
qplot(1, y, geom="boxplot")
Ric*_*ton 179
Use geom_boxplot(outlier.shape = NA) to hide the outliers and scale_y_continuous(limits = c(lower, upper)) to change the axis limits.
An example:
library(ggplot2)
n <- 1e4L
dfr <- data.frame(
  y = exp(rlnorm(n)),  # really right-skewed variable
  f = gl(2, n / 2)
)
p <- ggplot(dfr, aes(f, y)) +
  geom_boxplot()
p  # big outliers make the boxes look too slim
p2 <- ggplot(dfr, aes(f, y)) +
  geom_boxplot(outlier.shape = NA) +
  scale_y_continuous(limits = quantile(dfr$y, c(0.1, 0.9)))
p2  # no outliers plotted, range shifted
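Actually, as Ramnath showed in his answer (and Andrie in the comments), it makes more sense to crop the axis after the statistics have been calculated, using coord_cartesian. Limits set on the scale discard out-of-range values before the boxplot statistics are computed, whereas coord_cartesian only zooms the view.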
coord_cartesian(ylim = quantile(dfr$y, c(0.1, 0.9)))
(You will probably still need scale_y_continuous to fix the axis breaks.)
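To see why the order matters, here is a minimal base R sketch. It assumes the default scale behaviour, where values outside the limits are turned into NA and dropped before the statistics are computed, and mimics that by filtering by hand. The small vector is made up for illustration.

```r
# Made-up data: nine ordinary values and one extreme value.
y <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1000)

# Limits chosen the same way as in the example above.
lims <- quantile(y, c(0.1, 0.9))

# What coord_cartesian(ylim = lims) works with: statistics from ALL the data.
full_box <- boxplot.stats(y)$stats[c(2, 4)]

# Roughly what scale_y_continuous(limits = lims) works with:
# out-of-range values are removed first, then the statistics are computed.
kept <- y[y >= lims[1] & y <= lims[2]]
trimmed_box <- boxplot.stats(kept)$stats[c(2, 4)]

print(full_box)     # hinges of the box computed from the full data
print(trimmed_box)  # hinges of the box computed from the trimmed data
```

The two boxes differ, so with scale limits you are drawing a boxplot of a different data set, while coord_cartesian keeps the original box and only changes what is visible.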
Ram*_*ath 133
Here is a solution using boxplot.stats:
# create a dummy data frame with outliers
df = data.frame(y = c(-100, rnorm(100), 100))
# create boxplot that includes outliers
p0 = ggplot(df, aes(y = y)) + geom_boxplot(aes(x = factor(1)))
# compute lower and upper whiskers
ylim1 = boxplot.stats(df$y)$stats[c(1, 5)]
# scale y limits based on ylim1
p1 = p0 + coord_cartesian(ylim = ylim1*1.05)
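One caveat with ylim1*1.05: multiplying only widens the range when the lower limit is negative and the upper limit is positive, as in the data above. If both whiskers are positive, the lower limit moves up and clips the whisker. A possible workaround, sketched here with a hypothetical helper pad_limits (not part of ggplot2), is to pad by a fraction of the span instead:

```r
# Hypothetical helper: widen a c(lower, upper) pair by a fraction of its span,
# so the padding behaves the same whatever the sign of the limits.
pad_limits <- function(lims, frac = 0.05) {
  span <- diff(lims)
  c(lims[1] - frac * span, lims[2] + frac * span)
}

# Made-up, all-positive data with one extreme value.
y <- c(10, 11, 12, 13, 14, 15, 16, 200)
whiskers <- boxplot.stats(y)$stats[c(1, 5)]

scaled <- whiskers * 1.05      # lower limit ends up above the lower whisker
padded <- pad_limits(whiskers) # lower limit moves down, upper limit moves up

print(whiskers)
print(scaled)
print(padded)
```

With the plot above this would be used as p0 + coord_cartesian(ylim = pad_limits(ylim1)).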
Mat*_*unz 14
I had the same problem and precomputed the values for Q1, Q3, the median, ymin and ymax using boxplot.stats:
# Load package and generate data
library(ggplot2)
data <- rnorm(100)
# Compute boxplot statistics
stats <- boxplot.stats(data)$stats
df <- data.frame(x="label1", ymin=stats[1], lower=stats[2], middle=stats[3],
upper=stats[4], ymax=stats[5])
# Create plot
p <- ggplot(df, aes(x=x, lower=lower, upper=upper, middle=middle, ymin=ymin,
ymax=ymax)) +
geom_boxplot(stat="identity")
p
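If you have several groups, the same idea can be extended by computing the statistics per group first. This is only a sketch: the data frame dat and its column names are made up, and the plotting step is skipped when ggplot2 is not installed.

```r
# Made-up data: two groups, each with one extreme value.
dat <- data.frame(
  group = rep(c("a", "b"), each = 6),
  value = c(1, 2, 3, 4, 5, 50, 10, 11, 12, 13, 14, -40)
)

# One row of boxplot statistics per group (rows are named after the groups).
groups <- split(dat$value, dat$group)
stats_mat <- t(sapply(groups, function(v) boxplot.stats(v)$stats))

stats_df <- data.frame(
  x      = rownames(stats_mat),
  ymin   = stats_mat[, 1],
  lower  = stats_mat[, 2],
  middle = stats_mat[, 3],
  upper  = stats_mat[, 4],
  ymax   = stats_mat[, 5]
)
print(stats_df)

# Draw the precomputed boxes; no outliers are plotted because none are passed in.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(stats_df, aes(x = x, ymin = ymin, lower = lower, middle = middle,
                            upper = upper, ymax = ymax)) +
    geom_boxplot(stat = "identity")
}
```

Because the y axis is trained only on the precomputed values, it is scaled to the whiskers rather than to the extreme points.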
gg.layers::geom_boxplot2 is just what you want.
# remotes::install_github('rpkgs/gg.layers')
library(gg.layers)
library(ggplot2)
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot2(width = 0.8, width.errorbar = 0.5)
https://rpkgs.github.io/gg.layers/reference/geom_boxplot2.html
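If you want to force the whiskers to extend to the maximum and minimum values, you can adjust the coef argument. The default value of coef is 1.5 (i.e. by default the whiskers extend at most 1.5 times the IQR beyond the box).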
# Load package and create a dummy data frame with outliers
#(using example from Ramnath's answer above)
library(ggplot2)
df = data.frame(y = c(-100, rnorm(100), 100))
# create boxplot that includes outliers
p0 = ggplot(df, aes(y = y)) + geom_boxplot(aes(x = factor(1)))
# create boxplot where whiskers extend to max and min values
p1 = ggplot(df, aes(y = y)) + geom_boxplot(aes(x = factor(1)), coef = 500)
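To illustrate what coef does, here is a small sketch using base R's boxplot.stats, which takes a coef argument with the same default of 1.5. The vector is made up, and ggplot2 computes the hinges slightly differently, so treat the numbers as an illustration of the rule rather than of ggplot2's exact output.

```r
# Made-up data with one extreme value at each end.
y <- c(-100, 1:10, 100)

# Default rule: whiskers stop at the most extreme points that lie
# within 1.5 * IQR of the box; anything further out is an outlier.
default_stats <- boxplot.stats(y, coef = 1.5)

# A very large coef puts every point inside the fences,
# so the whiskers reach the minimum and maximum.
wide_stats <- boxplot.stats(y, coef = 500)

print(default_stats$stats[c(1, 5)])  # whisker ends under the default rule
print(default_stats$out)             # points flagged as outliers
print(wide_stats$stats[c(1, 5)])     # whisker ends with coef = 500
print(wide_stats$out)                # no points flagged
```

Note that this makes the whiskers cover the outliers rather than ignoring them, so the box will still look squashed if the extremes are far away.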