Statistical tests between matching columns

Question

I have two data.frames that look like this:

DF1      
  Col1     Col2     Col3    Col4    
 0.1854   0.1660   0.1997   0.4632
 0.1760   0.1336   0.1985   0.4496
 0.1737   0.1316   0.1943   0.4446    
 0.1660   0.1300   0.1896   0.4439


DF2       
  Col1     Col2     Col3    Col4    
 0.2456    0.2107   0.2688  0.5079
 0.2399    0.1952   0.2356  0.1143
 0.2375    0.1947   0.2187  0.0846    
 0.2368    0.1922   0.2087  0.1247

I want to run wilcox.test between the two data.frames, specifically between matching columns, so that:

test1: between Col1 of DF1 and Col1 of DF2     
test2: between Col2 of DF1 and Col2 of DF2

and so on.

I used the following script:

for (i in 1:length(DF2)){ 
    test <- apply(DF1, 2, function(x) wilcox.test(x, as.numeric(DF2[[i]]), correct=TRUE))
}

Unfortunately, the output of this script differs from the output of the same tests run with the following script:

test1 = wilcox.test(DF1[,1], DF2[,1],  correct=FALSE)     
test2 = wilcox.test(DF1[,2], DF2[,2],  correct=FALSE)

In the real data.frames I have about 100 columns and 200 rows (both have the same dimensions), so I cannot run the tests one column at a time by hand.

Output of dput(DF1):

structure(list(Col1 = c(0.1854, 0.1760, 0.1737, 0.1660,....),  class = "data.frame", row.names = c(NA, -100L)))

The same applies to DF2.

Answer 1

csg*_*pie 6

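This is a typical use case for mapply, which is essentially a multivariate version of sapply. We use mapply to step through the columns of both data frames in parallel. First, create some data: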

df1 = data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))
df2 = data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))

Then use mapply:

l = mapply(wilcox.test, df1, df2, SIMPLIFY=FALSE, correct=FALSE)
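
As a rough illustration of what this call does, the sketch below uses two toy data frames (a and b, with made-up values that are not from the question) to show that mapply pairs its arguments by position. This is in contrast to the loop in the question, which compares every column of DF1 against a single column of DF2 on each iteration and overwrites test each time, and which also uses correct=TRUE rather than correct=FALSE.

```r
# Toy data frames (hypothetical values, only to show how mapply pairs columns)
a <- data.frame(c1 = 1:3, c2 = 4:6)
b <- data.frame(c1 = 10:12, c2 = 20:22)

# mapply calls the function once per position: (a$c1, b$c1), then (a$c2, b$c2)
sums <- mapply(function(x, y) sum(x) + sum(y), a, b)
print(sums)

# A length-one named argument (like correct = FALSE in the call above) is
# recycled to every call; SIMPLIFY = FALSE keeps the result as a list
res <- mapply(function(x, y, scale) (sum(x) + sum(y)) * scale,
              a, b, scale = 2, SIMPLIFY = FALSE)
str(res)
```

The result takes its names from the columns of the first data frame, so each test can later be looked up by column name.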

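The variable l returned by mapply is a list with one test result per column pair, so each direct call below gives the same result as the list element that follows it: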

wilcox.test(df1[,1], df2[,1],  correct=FALSE) 
l[[1]]
wilcox.test(df1[,2], df2[,2],  correct=FALSE) 
l[[2]]
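
With around 100 columns, printing each list element is impractical. One possible way to summarise the results, sketched here under the assumption that only the p-values are needed, is to extract them into a named vector with vapply. The name p_values and the fixed seed are illustrative choices and not part of the original answer.

```r
set.seed(1)  # fixed seed so the random example data are reproducible
df1 <- data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))
df2 <- data.frame(c1 = runif(10), c2 = runif(10), c3 = runif(10), c4 = runif(10))

# One test per pair of matching columns, as in the answer above
l <- mapply(wilcox.test, df1, df2, SIMPLIFY = FALSE, correct = FALSE)

# Pull the p-value out of each test result; vapply checks that each is one number
p_values <- vapply(l, function(res) res$p.value, numeric(1))
print(p_values)

# Each entry should match the corresponding column-by-column call
direct <- wilcox.test(df1[, 1], df2[, 1], correct = FALSE)$p.value
stopifnot(isTRUE(all.equal(p_values[["c1"]], direct)))
```

The same pattern works for other components of the test result, such as statistic.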
