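I want to do something very simple: create a boxplot for an entire data frame. However, searching for "combined boxplot" and related terms did not turn up anything useful. If I am overlooking an obvious way, please let me know.
I have the following data: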
> theData
X20.7 X21.7 X22.7 X23.7 X24.7 X25.7 X26.7 X27.7 X28.7 X29.7 X30.7 X31.7 X32.7 X33.7 X34.7 X35.7
1 99.64920 99.49319 99.49319 99.49319 99.49319 99.49319 99.80837 99.29348 99.29348 99.29348 99.29348 99.29348 99.29348 99.46376 99.46376 99.51554
2 98.76469 98.60867 98.60867 98.60867 98.60867 98.60867 99.41553 98.40896 98.40896 98.40896 98.40896 98.40896 98.40896 98.74975 98.74975 98.54527
3 98.37824 98.22222 98.22222 98.22222 98.22222 98.22222 98.70900 98.13767 98.13767 98.13767 98.13767 98.13767 98.13767 98.47846 98.47846 98.01791
4 98.11356 97.95754 97.95754 97.95754 97.95754 97.95754 …

I need to create a boxplot from a data.frame with three numeric columns, using the split argument to separate the boxes by paint. I have a large data.frame, but what I need is shown in the example below:
paint<-c("blue", "black", "red", "blue", "black", "red", "blue", "black", "red")
car1<-c(100, 138, 123, 143, 112, 144, 343, 112, 334)
car2<-c(111, 238, 323, 541, 328, 363, 411, 238, 313)
car3<-c(432, 123, 322, 342, 323, 522, 334, 311, 452)
data<-data.frame(paint, car1, car2, car3)
> data
paint car1 car2 car3
1 blue 100 111 432
2 black 138 238 123
3 red 123 323 322
4 blue 143 541 342
5 black 112 328 323
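6 red 144 …

I am trying to identify outliers from a boxplot in MATLAB. The function's default whisker value is 1.5, which gives ±2.7σ, or 99.3% coverage. However, I want 99.7%, or ±3σ, coverage. What should the whisker value be in that case? I do not want to guess arbitrarily, so I need your help. Thanks.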
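For normally distributed data, the link between the whisker factor and sigma coverage can be worked out directly: the upper quartile sits at about 0.6745σ and the IQR is about 1.349σ, so a whisker factor w reaches 0.6745σ + w × 1.349σ. The sketch below is in Python rather than MATLAB and uses only the standard library. The function names are illustrative, and the numbers hold only under the normality assumption.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def whisker_for_sigma(k):
    """Whisker factor w such that Q3 + w * IQR equals k standard deviations.

    ASSUMPTION: the data follow a normal distribution.
    """
    q3 = STD_NORMAL.inv_cdf(0.75)   # about 0.6745 sigma
    iqr = 2 * q3                    # about 1.349 sigma, by symmetry
    return (k - q3) / iqr


def coverage_for_whisker(w):
    """Fraction of a normal population lying between the two whisker fences."""
    q3 = STD_NORMAL.inv_cdf(0.75)
    bound = q3 + w * 2 * q3         # fence position in units of sigma
    return 2 * STD_NORMAL.cdf(bound) - 1


w3 = whisker_for_sigma(3)               # roughly 1.72
default_cov = coverage_for_whisker(1.5)  # roughly 0.993
print(f"whisker for 3 sigma: {w3:.4f}")
print(f"coverage at whisker 1.5: {default_cov:.4f}")
```

Under that assumption, a whisker factor of roughly 1.72 corresponds to ±3σ. In MATLAB the value would be passed through the 'Whisker' name-value argument of boxplot.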
I have a pandas data frame that looks like this:
[('1975801_m', 1 0.203244
10 -0.159756
16 -0.172756
19 -0.089756
20 -0.033756
23 -0.011756
24 0.177244
32 0.138244
35 -0.104756
36 0.157244
40 0.108244
41 0.032244
42 0.063244
45 0.362244
59 -0.093756
62 -0.070756
65 -0.030756
66 -0.100756
73 -0.140756
77 -0.110756
81 -0.100756
84 -0.090756
86 -0.180756
87 0.119244
88 0.709244
102 -0.030756
105 -0.000756
107 -0.010756
109 0.039244
111 0.059244
Name: RTdiff), ('3878418_m', 1637 0.13811
1638 -0.21489
1644 -0.15989
1657 -0.11189
1662 -0.03289
1666 -0.09489
1669 0.03411
1675 …

A box-and-whisker plot shows the following information: max, min, mean, 75th percentile, and 25th percentile. If I have this information, can I draw the corresponding box-and-whisker plot?
I have this data frame named TP.df:
pb1 ag1 pb2 ag2 pb3 ag3
Nb 498 498 85 85 68 68
Min 0 0 0 0 0 0
Max 1.72 461 2.641 260.8 0.3 144
Mean 0.06 19.2 0.15 35.35 0.02 9.11
75_p 0.06 20 0.08 33 0.02 8
25_p 0.01 10 0 14 0.01 4
File:
,pb1,ag1,pb2,ag2,pb3,ag3
Nb,498,498,85,85,68,68
Min,0,0,0,0,0,0
Max,1.72,461,2.641,260.8,0.3,144
Mean,0.06,19.2,0.15,35.35,0.02,9.11
75_p,0.06,20,0.08,33,0.02,8
25_p,0.01,10,0,14,0.01,4
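One possible way to draw boxes from a summary table like the one above, sketched in Python rather than R: matplotlib's `Axes.bxp` accepts precomputed statistics instead of raw data. The helper names `summary_to_bxp_stats` and `draw_boxes` are illustrative. Note one loud assumption: the table has no median, so the sketch puts the mean where the median line normally goes. That is only an approximation, and for skewed columns such as ag2 the line can fall outside the box.

```python
import csv
import io

# Summary table copied from the file above (rows are statistics, columns are variables).
SUMMARY_CSV = """,pb1,ag1,pb2,ag2,pb3,ag3
Nb,498,498,85,85,68,68
Min,0,0,0,0,0,0
Max,1.72,461,2.641,260.8,0.3,144
Mean,0.06,19.2,0.15,35.35,0.02,9.11
75_p,0.06,20,0.08,33,0.02,8
25_p,0.01,10,0,14,0.01,4
"""


def summary_to_bxp_stats(csv_text):
    """Turn the summary table into the list of dicts that Axes.bxp expects."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    names = rows[0][1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in rows[1:]}
    stats = []
    for i, name in enumerate(names):
        stats.append({
            "label": name,
            "whislo": table["Min"][i],   # whiskers drawn at min and max
            "q1": table["25_p"][i],
            "med": table["Mean"][i],     # ASSUMPTION: mean stands in for the missing median
            "q3": table["75_p"][i],
            "whishi": table["Max"][i],
            "fliers": [],                # no individual outliers are known
        })
    return stats


def draw_boxes(stats):
    """Draw one box per variable; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.bxp(stats, showfliers=False)
    ax.set_ylim(bottom=0)
    return fig


stats = summary_to_bxp_stats(SUMMARY_CSV)
# draw_boxes(stats).savefig("tp_boxes.png")  # uncomment to render the figure
```

Because the pb and ag columns differ by orders of magnitude, plotting them on separate axes (or a log scale) may be more readable than a single shared axis.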
How can I get the corresponding box-and-whisker plot, with pb1, ag1, pb2, ag2, pb3, ag3 along one axis and values from 0 to max(TP.df[Max,]) along the other?

I can do this without any problems:
boxplot(coef ~ habitat, data = res)
abline(h = 0, col = "green")
But when I use lattice, the horizontal line is misplaced:
bwplot(coef ~ habitat, data = res)
abline(h = 0, col = "green")
I tried using panel.abline instead, but that puts the green line at the top of the picture.

I have a data frame that looks like this:
> head(DOData)
Date Site1 Site2 Site3 Site4 Site5 Months
1 1/1/2012 1.07 3.32 11.35 6.26 5.39 January
2 1/2/2012 1.24 3.08 10.69 6.57 6.59 January
3 1/3/2012 1.94 2.69 11.86 6.23 6.23 January
4 1/4/2012 0.81 3.50 11.47 4.67 5.94 January
5 1/5/2012 1.41 3.11 10.38 7.44 5.40 January
6 1/6/2012 2.73 3.28 11.11 6.15 6.22 January
.
.
.
361 12/26/2012 3.54 3.86 12.67 5.44 6.03 December
362 12/27/2012 2.05 3.42 10.27 6.05 7.10 December
363 12/28/2012 3.59 …

I am trying to compute the whisker and box coordinates of a boxplot in matplotlib. I do not understand my mistake, or why I do not get the same values.
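import numpy as np
import matplotlib.pyplot as plt

# Quartiles and the 1.5 x IQR fences of the data in becher (listed further down)
Q1, median, Q3 = np.percentile(becher, [25, 50, 75])
IQR = Q3 - Q1
Qs = [Q1, median, Q3, Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]
Qname = ["Q1", "median", "Q3", "Q1-1.5xIQR", "Q3+1.5xIQR"]
# Draw a labelled horizontal line at each computed value
for Q, name in zip(Qs, Qname):
    plt.axhline(Q, color="k")
    plt.text(1.52, Q, name)
plt.boxplot(becher)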
As the figure shows, Q1, Q3, and the median are all fine, but the whiskers are wrong.
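A likely explanation, assuming plt.boxplot is called with its default whis=1.5: matplotlib does not draw the whiskers at the fences Q1 - 1.5 × IQR and Q3 + 1.5 × IQR themselves, but at the most extreme data points that still lie inside those fences. The sketch below reproduces that rule with the standard library only. The helper name `whisker_ends` and the small sample are illustrative and not part of the original post; `statistics.quantiles` with method="inclusive" uses the same linear interpolation as the default of np.percentile.

```python
from statistics import quantiles


def whisker_ends(data, whis=1.5):
    """Return the box, fence, and whisker positions for a boxplot of data.

    The fences are Q1 - whis*IQR and Q3 + whis*IQR; the whiskers stop at the
    lowest and highest data points that lie within the fences.
    """
    values = sorted(data)
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence = q1 - whis * iqr
    hi_fence = q3 + whis * iqr
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    fliers = [v for v in values if v < lo_fence or v > hi_fence]
    return {
        "q1": q1, "med": med, "q3": q3,
        "lo_fence": lo_fence, "hi_fence": hi_fence,
        "whislo": min(inside), "whishi": max(inside),
        "fliers": fliers,
    }


# Small made-up sample with one obvious outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]
result = whisker_ends(sample)
print(result)
```

For this sample the upper fence is at 13, but the upper whisker stops at 8, the largest value below the fence, and 30 is reported as a flier. The same gap between fence and whisker would explain lines drawn at the fences not matching the plotted whiskers.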
Here is my data:
becher = [9.1495,
9.9479,
9.7933,
9.8002,
8.47,
9.14,
9.06,
9.6933,
9.7871,
10.5676,
9.7441,
10.4874,
7.9584,
7.9598,
8.3483,
7.2536,
9.0823,
10.8343,
10.4104,
7.2004,
9.6297,
9.96,
9.761,
9.684,
8.6062,
10.2098,
8.9002,
8.4511, …

I am working on a boxplot with forecasts and observations. The dataset is very long, so I am providing a sample of its format here.
forecasts <- data.frame(f_type = c(rep("A", 9), rep("B", 9)),
                        Date = c(rep(as.Date("2007-01-31"), 3), rep(as.Date("2007-02-28"), 3), rep(as.Date("2007-03-31"), 3), rep(as.Date("2007-01-31"), 3), rep(as.Date("2007-02-28"), 3), rep(as.Date("2007-03-31"), 3)),
                        value = c(10, 50, 60, 05, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15, 49, 27))

observations <- data.frame(Dt = c(as.Date("2007-01-31"), as.Date("2007-02-28"), as.Date("2007-03-31")),
                           obs = c(30, 49, 57))
So far I have:
ggplot() +
geom_boxplot(data = forecasts,
aes(x = as.factor(Date), y = value,
group = interaction(Date, f_type), fill = f_type)) +
geom_line(data = observations,
aes(x = …

I have a tibble data frame named my_data, which looks like this:
> my_data
# A tibble: 60 x 4
SPECIES simulation_id psi_hat p_hat
<chr> <int> <dbl> <dbl>
1 Grey squirrel 74 0.527 0.306
2 Grey squirrel 102 0.526 0.316
3 Grey squirrel 142 0.527 0.309
4 Grey squirrel 121 0.527 0.309
5 Grey squirrel 25 0.526 0.317
6 Grey squirrel 50 0.527 0.309
7 Grey squirrel 67 0.491 0.326
8 Grey squirrel 19 0.527 0.306
9 Grey squirrel 174 0.527 0.302
10 Grey squirrel 46 0.527 …