nev*_*int 0 statistics r statistical-test
我有以下数据列表,每个都有10个样本.该值表示特定分子的结合强度.
我想要表示的是'x'在统计上与'y','z'和'w'不同.如果你看X它会有更多的值比其他值大于零(2.8,1.00,5.4等).
我尝试了t检验,但所有这些都显示出与高P值无显着差异.
什么是适当的测试?
以下是我的代码:
#!/usr/bin/Rscript
x <-c(2.852672123,0.076840264,1.009542943,0.430716968,5.4016,0.084281843,0.065654548,0.971907344,3.325405405,0.606504718)
y <- c(0.122615039,0.844203734,0.002128992,0.628740077,0.87752229,0.888600425,0.728667099,0.000375047,0.911153571,0.553786408);
z <- c(0.766445916,0.726801899,0.389718652,0.978733927,0.405585807,0.408554832,0.799010791,0.737676439,0.433279599,0.947906524)
w <- c(0.000124984,1.486637663,0.979713013,0.917105894,0.660855127,0.338574774,0.211689885,0.434050179,0.955522972,0.014195184)
t.test(x,y)
t.test(x,z)
Run Code Online (Sandbox Code Playgroud)
您没有指定样品的预期方式不同.通常假设您的意思是不同样本的平均值不同.在这种情况下,t检验是合适的.虽然x有一些较高的值,但它也有一些较低的值可以拉平均值.看起来你认为是一个显着的差异(视觉上)实际上是一个更大的方差.
如果您的问题是关于方差,那么您需要进行F检验.
这类数据的经典测试是方差分析.方差分析告诉您所有四个类别的均值是否可能相同(未能拒绝零假设)或者至少有一个均值可能与其他类别不同(拒绝零假设).
如果anova很重要,您通常会想要执行Tukey HSD事后测试,以确定哪个类别与其他类别不同.Tukey HSD产生的p值已经针对多重比较进行了调整.
library(ggplot2)
library(reshape2)
x <- c(2.852672123,0.076840264,1.009542943,0.430716968,5.4016,0.084281843,
0.065654548,0.971907344,3.325405405,0.606504718)
y <- c(0.122615039,0.844203734,0.002128992,0.628740077,0.87752229,
0.888600425,0.728667099,0.000375047,0.911153571,0.553786408);
z <- c(0.766445916,0.726801899,0.389718652,0.978733927,0.405585807,
0.408554832,0.799010791,0.737676439,0.433279599,0.947906524)
w <- c(0.000124984,1.486637663,0.979713013,0.917105894,0.660855127,
0.338574774,0.211689885,0.434050179,0.955522972,0.014195184)
dat = data.frame(x, y, z, w)
mdat = melt(dat)
anova_results = aov(value ~ variable, data=mdat)
summary(anova_results)
# Df Sum Sq Mean Sq F value Pr(>F)
# variable 3 5.83 1.9431 2.134 0.113
# Residuals 36 32.78 0.9105
Run Code Online (Sandbox Code Playgroud)
anova p值为0.113,"x"类别的Tukey测试p值在相似范围内.这是你的直觉的量化,"x"与其他人不同.大多数研究人员会发现p = 0.11具有提示性,但仍然具有过高的假阳性风险.请注意,平均值(差异列)与下面的箱线图的较大差异可能比p值更具说服力.
TukeyHSD(anova_results)
# Tukey multiple comparisons of means
# 95% family-wise confidence level
#
# Fit: aov(formula = value ~ variable, data = mdat)
#
# $variable
# diff lwr upr p adj
# y-x -0.92673335 -2.076048 0.2225815 0.1506271
# z-x -0.82314118 -1.972456 0.3261737 0.2342515
# w-x -0.88266565 -2.031981 0.2666492 0.1828672
# z-y 0.10359217 -1.045723 1.2529071 0.9948795
# w-y 0.04406770 -1.105247 1.1933826 0.9995981
# w-z -0.05952447 -1.208839 1.0897904 0.9990129
plot_1 = ggplot(mdat, aes(x=variable, y=value, colour=variable)) +
geom_boxplot() +
geom_point(size=5, shape=1)
ggsave("plot_1.png", plot_1, height=3.5, width=7, units="in")
Run Code Online (Sandbox Code Playgroud)

| 归档时间: |
|
| 查看次数: |
978 次 |
| 最近记录: |