A matrix version of cor.test()

Question

Att*_*s29 25 r correlation

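cor.test() takes vectors x and y as arguments, but I have a whole matrix of data that I want to test pairwise. cor() accepts this matrix as a single argument just fine, and I am hoping to find a way to do the same with cor.test().

The common advice from other people seems to be to use cor.prob():

https://stat.ethz.ch/pipermail/r-help/2001-November/016201.html

But these p-values are not the same as the ones generated by cor.test()! cor.test() also seems better equipped than cor.prob() to handle pairwise deletion (I have quite a bit of missing data in my data set).

Does anyone have an alternative to cor.prob()? If the solution involves nested for loops, so be it (I am new enough to R that even that is problematic for me).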

Answer 1

Sac*_*amp 39

corr.test in the psych package is designed to do this:

library("psych")
data(sat.act)
corr.test(sat.act)


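As noted in the comments, to replicate the p-values from the base cor.test() function across the whole matrix, you need to turn off the adjustment of the p-values for multiple comparisons (the default is Holm's adjustment method):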

 corr.test(sat.act, adjust = "none")


[But be careful when interpreting those results!]
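To illustrate what the Holm adjustment does to a set of p-values, here is a minimal base-R sketch. The four p-values are made-up numbers standing in for unadjusted pairwise tests, and `p.adjust()` is used here only to show the effect of the method; it is not a claim about how corr.test is implemented internally.

```r
# Made-up p-values standing in for four unadjusted pairwise tests.
p_raw <- c(0.001, 0.012, 0.030, 0.200)

# Holm's step-down adjustment from base R (stats package).
p_holm <- p.adjust(p_raw, method = "holm")

# Adjusted p-values are never smaller than the raw ones, which is why
# adjusted results differ from what cor.test() reports for a single pair.
print(cbind(raw = p_raw, holm = p_holm))
```

With `method = "none"`, `p.adjust()` returns its input unchanged, which corresponds to the unadjusted case.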

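If you want the results to match the statistics from `cor.test`, note that you should use `corr.test(mtcars, adjust = "none")` (6 upvotes)
Beautiful, why reinvent the wheel. +1 (2 upvotes)
If you have a large matrix, this will be very, very slow! To speed it up, set the argument `ci = F`: it then takes roughly twice as long as running cor(), whereas with `ci = T` (the default) it can take 100 times as long. (2 upvotes)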

Answer 2

Tyl*_*ker 13

If you are strictly after the p-values from cor.test in matrix format, here is a solution shamelessly stolen from Vincent (LINK):

cor.test.p <- function(x){
    FUN <- function(x, y) cor.test(x, y)[["p.value"]]
    z <- outer(
      colnames(x), 
      colnames(x), 
      Vectorize(function(i,j) FUN(x[,i], x[,j]))
    )
    dimnames(z) <- list(colnames(x), colnames(x))
    z
}

cor.test.p(mtcars)


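Note: Tommy also provides a faster solution, though it is less easy to implement. Oh, and no for loops :)

Edit: I have a function, v_outer, in my qdapTools package that makes this task pretty easy: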

library(qdapTools)
(out <- v_outer(mtcars, function(x, y) cor.test(x, y)[["p.value"]]))
print(out, digits=4)  # for more digits
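For readers who would rather see the explicit nested loop mentioned in the question, the following is a minimal base-R sketch. The function name `pairwise_cor_p` is hypothetical and not part of any package. Unlike the outer() version, it tests each pair of columns only once and mirrors the result, and it leaves the diagonal as NA rather than testing a column against itself.

```r
# Sketch: matrix of cor.test() p-values, testing each column pair once.
# `pairwise_cor_p` is a hypothetical helper name used for illustration.
pairwise_cor_p <- function(x) {
  x <- as.matrix(x)
  k <- ncol(x)
  p <- matrix(NA_real_, k, k, dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      # cor.test() drops incomplete pairs itself, so NAs are handled pairwise.
      p[i, j] <- p[j, i] <- cor.test(x[, i], x[, j])$p.value
    }
  }
  p
}

p_mat <- pairwise_cor_p(mtcars[, 1:4])
print(signif(p_mat, 3))
```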


Answer 3

Del*_*eet 5

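Probably the simplest way is to use rcorr() from Hmisc. It only takes a matrix, so use rcorr(as.matrix(x)) if your data is in a data.frame. It returns a list containing: 1) a matrix of pairwise r values, 2) a matrix of pairwise n, 3) a matrix of p-values for the r values. It automatically ignores missing data.

Ideally, a function of this kind should also accept data.frames and output confidence intervals, in line with the "New Statistics".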
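As a rough idea of what such output could look like, here is a base-R sketch that collects the confidence limits reported by cor.test() for every pair of columns. The name `pairwise_cor_ci` is hypothetical, this is not how rcorr() works, and cor.test() only reports a confidence interval for the Pearson method when a pair has at least 4 complete observations.

```r
# Sketch: matrices of lower and upper confidence limits for pairwise Pearson r.
# `pairwise_cor_ci` is a hypothetical helper name used for illustration.
pairwise_cor_ci <- function(x, level = 0.95) {
  x <- as.matrix(x)  # accepts data.frames as well as matrices
  k <- ncol(x)
  nm <- list(colnames(x), colnames(x))
  lower <- matrix(NA_real_, k, k, dimnames = nm)
  upper <- matrix(NA_real_, k, k, dimnames = nm)
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      ci <- cor.test(x[, i], x[, j], conf.level = level)$conf.int
      lower[i, j] <- lower[j, i] <- ci[1]
      upper[i, j] <- upper[j, i] <- ci[2]
    }
  }
  list(lower = lower, upper = upper)
}

ci <- pairwise_cor_ci(mtcars[, 1:4])
print(round(ci$lower, 2))
print(round(ci$upper, 2))
```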

Answer 4

小智 5

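The accepted solution (the corr.test function in the psych package) works, but it is extremely slow for large matrices. I was correlating a gene expression matrix (~20,000 x ~1,000) with a drug sensitivity matrix (~1,000 x ~500), and I had to stop it because it was taking forever.

I took some code from the psych package and used the cor() function directly instead, and got much better results: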

# find (pairwise complete) correlation matrix between two matrices x and y
# compare to corr.test(x, y, adjust = "none")
n <- t(!is.na(x)) %*% (!is.na(y)) # same as count.pairwise(x,y) from psych package
r <- cor(x, y, use = "pairwise.complete.obs") # MUCH MUCH faster than corr.test()
cor2pvalue <- function(r, n) {
  # t statistic for H0: rho = 0, with n - 2 degrees of freedom
  t <- (r*sqrt(n-2))/sqrt(1-r^2)
  # two-sided p-value; pt(-abs(t)) avoids the precision loss of 1 - pt(abs(t))
  p <- 2*pt(-abs(t), n-2)
  se <- sqrt((1-r*r)/(n-2))
  list(r = r, n = n, t = t, p = p, se = se)
}
# get a list with matrices of correlation, pvalues, standard error, etc.
result <- cor2pvalue(r, n)


Even with two 100 x 200 matrices, the difference is staggering: a second or two versus 45 seconds.

> system.time(test_func(x,y))
   user  system elapsed 
  0.308   2.452   0.130 
> system.time(corr.test(x,y, adjust = "none"))
   user  system elapsed 
 45.004   3.276  45.814
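A quick sanity check of this shortcut: the p-values come from the statistic t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, where n is the number of complete pairs, so they should agree with cor.test() pair by pair. The sketch below uses small simulated matrices with a few injected NAs; the data, seed, and variable names are made up for illustration.

```r
# Simulated data with some missing values (illustrative only).
set.seed(1)
x <- matrix(rnorm(60), nrow = 20, ncol = 3)
y <- matrix(rnorm(40), nrow = 20, ncol = 2)
x[c(2, 7), 1] <- NA
y[c(7, 15), 2] <- NA

# Pairwise-complete counts and correlations, as in the answer above.
n <- t(!is.na(x)) %*% (!is.na(y))
r <- cor(x, y, use = "pairwise.complete.obs")

# Two-sided p-values from the t distribution with n - 2 degrees of freedom.
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_fast <- 2 * pt(-abs(t_stat), df = n - 2)

# Compare one cell against cor.test(), which also drops incomplete pairs.
p_ref <- cor.test(x[, 1], y[, 2])$p.value
print(all.equal(p_fast[1, 2], p_ref))
```

The same comparison can be repeated for any other cell by changing the two column indices.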


Archived:	13 years, 3 months ago
Views:	32735
Last activity:	7 years, 5 months ago