How to get the correlation of two vectors in Python

Luk*_*akk 65 python numpy

In MATLAB I use

a=[1,4,6]
b=[1,2,3]
corr(a,b)

which returns .9934. I tried numpy.correlate, but it returns something completely different. What is the simplest way to get the correlation of two vectors?

Hoo*_*ked 141

The documentation indicates that numpy.correlate is not what you want:

numpy.correlate(a, v, mode='valid', old_behavior=False)[source]
  Cross-correlation of two 1-dimensional sequences.
  This function computes the correlation as generally defined in signal processing texts:
     z[k] = sum_n a[n] * conj(v[n+k])
  with a and v sequences being zero-padded where necessary and conj being the conjugate.
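
To make the difference concrete, here is a minimal sketch of what numpy.correlate computes for the vectors from the question. With the default mode and equal-length inputs it reduces to a single unnormalized dot product, which is why the result looks nothing like a correlation coefficient. The variable name `raw` is only illustrative.

```python
import numpy as np

a = [1, 4, 6]
b = [1, 2, 3]

# Default mode is 'valid': for equal-length inputs this is one sliding
# dot product, with no mean-centering and no normalization.
raw = np.correlate(a, b)

print(raw)           # [27]  (1*1 + 4*2 + 6*3)
print(np.dot(a, b))  # 27
```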

Instead, as other comments suggest, you are looking for the Pearson correlation coefficient. To compute it with scipy, try:

from scipy.stats import pearsonr
a = [1, 4, 6]
b = [1, 2, 3]
print(pearsonr(a, b))

This gives

(0.99339926779878274, 0.073186395040328034)
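
For reference, the first value can be reproduced by hand from the usual definition of the Pearson coefficient: center both vectors, then divide their dot product by the product of their norms. This is a sketch of that definition in plain numpy, not a claim about how scipy implements it internally; the names `da` and `db` are illustrative.

```python
import numpy as np

a = np.array([1, 4, 6], dtype=float)
b = np.array([1, 2, 3], dtype=float)

# Subtract each vector's mean so both are centered at zero.
da = a - a.mean()
db = b - b.mean()

# Pearson r: dot product of the centered vectors over the product of their norms.
r = np.dot(da, db) / (np.linalg.norm(da) * np.linalg.norm(db))

print(r)  # approximately 0.9934
```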

You can also use numpy.corrcoef:

import numpy
print(numpy.corrcoef(a, b))

This gives:

[[ 1.          0.99339927]
 [ 0.99339927  1.        ]]
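
numpy.corrcoef returns the full 2x2 correlation matrix, so to get the single number that MATLAB's corr(a,b) reports for two vectors, pick an off-diagonal entry. A minimal sketch, with `matrix` and `r` as illustrative names:

```python
import numpy as np

a = [1, 4, 6]
b = [1, 2, 3]

# Rows and columns are ordered (a, b); the diagonal is each vector's
# correlation with itself, which is 1.
matrix = np.corrcoef(a, b)

# The off-diagonal entry is the correlation between a and b.
r = matrix[0, 1]

print(r)  # approximately 0.9934
```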

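  • What is the second value in the tuple printed by "pearsonr(a,b)"? (3 upvotes)
  • @MuhammadHaseebKhan According to the [documentation](https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.pearsonr.html), it returns (Pearson correlation coefficient, two-tailed p-value) (2 upvotes)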
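
Following up on the comments above, the two returned values can be unpacked into separate names, which makes it harder to confuse the coefficient with the p-value. This is a small sketch assuming a SciPy version where the result of pearsonr can be unpacked like a tuple; `r` and `p_value` are illustrative names.

```python
from scipy.stats import pearsonr

a = [1, 4, 6]
b = [1, 2, 3]

# First value: Pearson correlation coefficient.
# Second value: two-tailed p-value for the null hypothesis of no correlation.
r, p_value = pearsonr(a, b)

print(r)        # approximately 0.9934
print(p_value)  # approximately 0.0732
```

With only three data points the p-value is not very informative, so for this example the coefficient is the number of interest.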