How to compute the Euclidean distance between pairs of rows of a numpy array

Ras*_*ngh 4 python numpy euclidean-distance

I have a numpy array like this:

import numpy as np
a = np.array([[1,0,1,0],
             [1,1,0,0],
             [1,0,1,0],
             [0,0,1,1]])

I want to calculate the Euclidean distance between each pair of rows.

from scipy.spatial import distance
for i in range(0,a.shape[0]):
    d = [np.sqrt(np.sum((a[i]-a[j])**2)) for j in range(i+1,a.shape[0])]
    print(d)

[1.4142135623730951,0.0,1.4142135623730951]

[1.4142135623730951,2.0]

[1.4142135623730951]

[]

Is there a better, more Pythonic way to do this? I have to run this code on a huge numpy array.

com*_*iro 10

For something more "elegant", you could always use scikit-learn's pairwise Euclidean distances:

from sklearn.metrics.pairwise import euclidean_distances
euclidean_distances(a,a)

This gives the same distances as your loop, returned as a single array:

array([[ 0.        ,  1.41421356,  0.        ,  1.41421356],
       [ 1.41421356,  0.        ,  1.41421356,  2.        ],
       [ 0.        ,  1.41421356,  0.        ,  1.41421356],
       [ 1.41421356,  2.        ,  1.41421356,  0.        ]])
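If you only need each pair of rows once, as in the loop in the question, `scipy.spatial.distance.pdist` may be a closer fit, since it returns just the upper triangle in condensed form. The following is a minimal sketch on the array from the question; the variable names `condensed` and `full` are only illustrative.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

a = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 1, 1]])

# Condensed form: one distance per unordered row pair, in the order
# (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
condensed = pdist(a, metric="euclidean")

# Optional: expand the condensed vector into the full symmetric matrix.
full = squareform(condensed)
```

The condensed vector holds n*(n-1)/2 values instead of n*n, which roughly halves the memory needed for the result on a large array.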


NaN*_*NaN 6

And for completeness, einsum is often referenced for distance calculations.

a = np.array([[1,0,1,0],
         [1,1,0,0],
         [1,0,1,0],
         [0,0,1,1]])

b = a.reshape(a.shape[0], 1, a.shape[1])

np.sqrt(np.einsum('ijk, ijk->ij', a-b, a-b))

array([[ 0.        ,  1.41421356,  0.        ,  1.41421356],
       [ 1.41421356,  0.        ,  1.41421356,  2.        ],
       [ 0.        ,  1.41421356,  0.        ,  1.41421356],
       [ 1.41421356,  2.        ,  1.41421356,  0.        ]])
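One caveat for huge arrays: the broadcast `a-b` above builds an intermediate of shape (n, n, d) before einsum reduces it. A possible way around that, sketched here as an assumption rather than as part of the original answer, is to expand the squared distance as |x|^2 + |y|^2 - 2 x.y so that only (n, n) arrays are created. The function name `pairwise_euclidean` is hypothetical.

```python
import numpy as np

def pairwise_euclidean(x):
    """Full pairwise Euclidean distance matrix between the rows of x."""
    x = np.asarray(x, dtype=float)
    # Squared norm of each row, shape (n,).
    sq = np.einsum('ij,ij->i', x, x)
    # |xi - xj|^2 = |xi|^2 + |xj|^2 - 2 * xi.xj, shape (n, n).
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # Rounding can leave tiny negative values; clip them before the sqrt.
    np.maximum(d2, 0.0, out=d2)
    return np.sqrt(d2)

a = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 1, 1]])

dist = pairwise_euclidean(a)
```

This trades some numerical accuracy for memory: for rows that are nearly identical, the subtraction can lose precision compared with differencing the rows directly.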