Extracting and formatting cor.test results across multiple pairs of columns

Bio*_*ram 1 statistics r correlation

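I am trying to produce tabular output from a set of correlations. Specifically, I use a for loop to compute the correlation between each of columns 4:40 and a single reference column (column 41 of the input in the code below). The resulting table looks fine, but it does not identify what is being compared with what. Checking the components of the cor.test result, I found that data.name is given as "x[[1]] and y[[i]]", which is not enough to trace which columns are being compared. Here is my code: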

input <- read.delim(file="InputData.txt", header=TRUE)
x<-input[,41, drop=FALSE]
y=input[,4:40]
corr.values <- vector("list", 37)
for (i in 1:length(y) ){
  corr.values[[i]] <- cor.test(x[[1]], y[[i]], method="pearson")
}
lres <- sapply(corr.values, `[`, c("statistic","p.value","estimate","method", "data.name"))
lres<-t(lres)
write.table(lres, file="output.xls", sep="\t",row.names=TRUE)

The output file looks like this:

       statistic        p.value     estimate                                  method            data.name   
1   -2.030111981    0.042938137 -0.095687495    Pearson's product-moment correlation    x[[1]] and y[[i]]
2   -2.795786248    0.005400938 -0.131239287    Pearson's product-moment correlation    x[[1]] and y[[i]]
3   -2.099114632    0.036368337 -0.098908573    Pearson's product-moment correlation    x[[1]] and y[[i]]
4   -1.920649487    0.055413178 -0.090571599    Pearson's product-moment correlation    x[[1]] and y[[i]]
5   -1.981326962    0.048168291 -0.093408365    Pearson's product-moment correlation    x[[1]] and y[[i]]
6   -2.80390736      0.00526909 -0.131613912    Pearson's product-moment correlation    x[[1]] and y[[i]]
7   -1.265138839    0.206482153 -0.059798855    Pearson's product-moment correlation    x[[1]] and y[[i]]
8   -2.861448156    0.004415411 -0.134266636    Pearson's product-moment correlation    x[[1]] and y[[i]]
9   -2.103403363    0.035990039 -0.099108672    Pearson's product-moment correlation    x[[1]] and y[[i]]
10  -3.610094985    0.000340807 -0.168498786    Pearson's product-moment correlation    x[[1]] and y[[i]]

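Clearly this is not ideal: the rows are only numbered, so there is no way to tell which correlation is which. Is there a way to fix this? I have tried many solutions, but none of them worked. I know the trick must lie in editing the data.name component, but I can't figure out how to do that.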

eip*_*i10 5

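Here is a way to return a data frame containing all of the cor.test results, including the names of the variables each correlation was computed for. We create a function that extracts the relevant results from cor.test, then use mapply to apply that function to each pair of variables we want a correlation for. mapply returns a list, so we use do.call(rbind, ...) to turn it into a data frame.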

# Function to extract correlation coefficient and p-values
corrFunc <- function(var1, var2, data) {
  result = cor.test(data[,var1], data[,var2])
  data.frame(var1, var2, result[c("estimate","p.value","statistic","method")], 
             stringsAsFactors=FALSE)
}

## Pairs of variables for which we want correlations.
## stringsAsFactors=FALSE matters on R < 4.0.0: factor values would index
## columns by their integer level codes instead of by name.
vars = data.frame(v1=names(mtcars)[1], v2=names(mtcars)[-1], stringsAsFactors=FALSE)

# Apply corrFunc to all rows of vars
corrs = do.call(rbind, mapply(corrFunc, vars[,1], vars[,2], MoreArgs=list(data=mtcars), 
                              SIMPLIFY=FALSE))
rownames(corrs) = NULL  # plain row numbers instead of generated row names

# corrs, with numeric values rounded to four significant digits for display
   var1 var2 estimate   p.value statistic                               method
1   mpg  cyl  -0.8522 6.113e-10    -8.920 Pearson's product-moment correlation
2   mpg disp  -0.8476 9.380e-10    -8.747 Pearson's product-moment correlation
3   mpg   hp  -0.7762 1.788e-07    -6.742 Pearson's product-moment correlation
4   mpg drat   0.6812 1.776e-05     5.096 Pearson's product-moment correlation
5   mpg   wt  -0.8677 1.294e-10    -9.559 Pearson's product-moment correlation
6   mpg qsec   0.4187 1.708e-02     2.525 Pearson's product-moment correlation
7   mpg   vs   0.6640 3.416e-05     4.864 Pearson's product-moment correlation
8   mpg   am   0.5998 2.850e-04     4.106 Pearson's product-moment correlation
9   mpg gear   0.4803 5.401e-03     2.999 Pearson's product-moment correlation
10  mpg carb  -0.5509 1.084e-03    -3.616 Pearson's product-moment correlation
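
The same idea can be applied more directly to the layout in the question. What follows is only a sketch under a few assumptions: cor_against and its arguments are hypothetical names introduced here, mtcars stands in for InputData.txt, and the commented lines at the end guess at how the question's column positions would map onto it. The first lines also illustrate why data.name does not help inside a loop: cor.test builds it from the text of the expressions passed in, not from the column names.

```r
# data.name is built from the expressions passed to cor.test,
# so inside a loop it is the same string on every iteration.
y <- mtcars[, 2:4]
i <- 1
cor.test(mtcars[[1]], y[[i]])$data.name   # "mtcars[[1]] and y[[i]]"

# Hypothetical helper: correlate one reference column against several
# others and carry the column names along explicitly.
cor_against <- function(data, ref, others, method = "pearson") {
  rows <- lapply(others, function(v) {
    res <- cor.test(data[[ref]], data[[v]], method = method)
    data.frame(var1 = ref, var2 = v,
               estimate  = unname(res$estimate),
               p.value   = res$p.value,
               statistic = unname(res$statistic),
               method    = res$method,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

corrs2 <- cor_against(mtcars, ref = "mpg", others = names(mtcars)[-1])
head(corrs2, 3)

# For the question's data, column positions can be turned into names first
# (assumed layout: reference in column 41, comparisons in columns 4:40):
# input <- read.delim("InputData.txt", header = TRUE)
# out <- cor_against(input, ref = names(input)[41], others = names(input)[4:40])
# write.table(out, file = "output.txt", sep = "\t", row.names = FALSE, quote = FALSE)
```

Selecting columns by name with [[ ]] keeps the variable names available as ordinary strings, so there is no need to edit data.name after the fact.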