Calculating distances between two numpy arrays

Bor*_*rys 4 python numpy scipy

I am interested in calculating various spatial distances between two numpy arrays (x and y).

http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.cdist.html

import numpy as np
from scipy.spatial.distance import cdist

x = np.array([[[1,2,3,4,5],
               [5,6,7,8,5],
               [5,6,7,8,5]],
              [[11,22,23,24,5],
               [25,26,27,28,5],
               [5,6,7,8,5]]])
i,j,k = x.shape

xx = x.reshape(i,j*k).T

y = np.array([[[31,32,33,34,5],
               [35,36,37,38,5],
               [5,6,7,8,5]],
              [[41,42,43,44,5],
               [45,46,47,48,5],
               [5,6,7,8,5]]])

yy = y.reshape(i,j*k).T

results = cdist(xx, yy, 'euclidean')
print(results)

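However, the code above produces far too many results that I don't need. How can I restrict it to only the results I need?

I want to calculate the distance between [1,11] and [31,41], between [2,22] and [32,42], and so on.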

Joe*_*ton 8

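If you only want the distance between each pair of corresponding points, you don't need to compute a full distance matrix.

Instead, compute it directly: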

import numpy as np

x = np.array([[[1,2,3,4,5],
               [5,6,7,8,5],
               [5,6,7,8,5]],
              [[11,22,23,24,5],
               [25,26,27,28,5],
               [5,6,7,8,5]]])

y = np.array([[[31,32,33,34,5],
               [35,36,37,38,5],
               [5,6,7,8,5]],
              [[41,42,43,44,5],
               [45,46,47,48,5],
               [5,6,7,8,5]]])

xx = x.reshape(2, -1)
yy = y.reshape(2, -1)
dist = np.hypot(*(xx - yy))

print(dist)

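To explain what is happening in more detail: first we reshape the arrays so that they have a 2xN shape (-1 is a placeholder that tells numpy to work out the correct size along that axis automatically):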

In [2]: x.reshape(2, -1)
Out[2]: 
array([[ 1,  2,  3,  4,  5,  5,  6,  7,  8,  5,  5,  6,  7,  8,  5],
       [11, 22, 23, 24,  5, 25, 26, 27, 28,  5,  5,  6,  7,  8,  5]])
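As a quick check that this layout matches the pairs listed in the question, the sketch below confirms that column n of the reshaped array holds the n-th element of x[0] together with the n-th element of x[1]. The name `pairs` is introduced here purely for illustration.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5],
               [5, 6, 7, 8, 5],
               [5, 6, 7, 8, 5]],
              [[11, 22, 23, 24, 5],
               [25, 26, 27, 28, 5],
               [5, 6, 7, 8, 5]]])

xx = x.reshape(2, -1)

# -1 is inferred as 3 * 5 = 15, the number of elements left per row
print(xx.shape)  # (2, 15)

# Transposing gives one 2-component point per row (illustrative name)
pairs = xx.T
print(pairs[0].tolist())  # [1, 11]
print(pairs[1].tolist())  # [2, 22]
```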

Therefore, when we subtract yy from xx, we get a 2xN array:

In [3]: xx - yy
Out[3]: 
array([[-30, -30, -30, -30,   0, -30, -30, -30, -30,   0,   0,   0,   0,
          0,   0],
       [-30, -20, -20, -20,   0, -20, -20, -20, -20,   0,   0,   0,   0,
          0,   0]])

We can then unpack that into dx and dy components:

In [4]: dx, dy = xx - yy

In [5]: dx
Out[5]: 
array([-30, -30, -30, -30,   0, -30, -30, -30, -30,   0,   0,   0,   0,
         0,   0])

In [6]: dy
Out[6]: 
array([-30, -20, -20, -20,   0, -20, -20, -20, -20,   0,   0,   0,   0,
         0,   0])

And calculate the distance (np.hypot is equivalent to np.sqrt(dx**2 + dy**2)):

In [7]: np.hypot(dx, dy)
Out[7]: 
array([ 42.42640687,  36.05551275,  36.05551275,  36.05551275,
         0.        ,  36.05551275,  36.05551275,  36.05551275,
        36.05551275,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ])

Or we can have the unpacking done automatically and do it all in one step:

In [8]: np.hypot(*(xx - yy))
Out[8]: 
array([ 42.42640687,  36.05551275,  36.05551275,  36.05551275,
         0.        ,  36.05551275,  36.05551275,  36.05551275,
        36.05551275,   0.        ,   0.        ,   0.        ,
         0.        ,   0.        ,   0.        ])
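As a sanity check, these values should be the diagonal of the full matrix that cdist produces, since entry (n, n) pairs point n of x with point n of y. The sketch below compares the two; the names `direct` and `full` are illustrative, and the transposes are there because cdist expects one point per row.

```python
import numpy as np
from scipy.spatial.distance import cdist

x = np.array([[[1, 2, 3, 4, 5],
               [5, 6, 7, 8, 5],
               [5, 6, 7, 8, 5]],
              [[11, 22, 23, 24, 5],
               [25, 26, 27, 28, 5],
               [5, 6, 7, 8, 5]]])

y = np.array([[[31, 32, 33, 34, 5],
               [35, 36, 37, 38, 5],
               [5, 6, 7, 8, 5]],
              [[41, 42, 43, 44, 5],
               [45, 46, 47, 48, 5],
               [5, 6, 7, 8, 5]]])

xx = x.reshape(2, -1)
yy = y.reshape(2, -1)

# Direct pairwise distances: one value per column (15 in total)
direct = np.hypot(*(xx - yy))

# Full 15x15 matrix between every point of x and every point of y
full = cdist(xx.T, yy.T, 'euclidean')

# Only the diagonal corresponds to the pairs we actually want
print(np.allclose(direct, np.diag(full)))  # True
```

The direct version computes 15 distances, while the cdist version computes 225 and discards all but 15 of them.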

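If you want to calculate other kinds of distances, just replace np.hypot with the function you want to use. For example, for the Manhattan/city-block distance: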

In [9]: dist = np.sum(np.abs(xx - yy), axis=0)

In [10]: dist
Out[10]: array([60, 50, 50, 50,  0, 50, 50, 50, 50,  0,  0,  0,  0,  0,  0])
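One caveat: np.hypot combines exactly two components, so the unpacking trick only works when the first axis has length 2. A possible generalisation, sketched here as an assumption rather than as part of the original answer, is to reduce over axis 0 with np.linalg.norm, whose ord argument selects the norm. The variable names below are illustrative.

```python
import numpy as np

x = np.array([[[1, 2, 3, 4, 5],
               [5, 6, 7, 8, 5],
               [5, 6, 7, 8, 5]],
              [[11, 22, 23, 24, 5],
               [25, 26, 27, 28, 5],
               [5, 6, 7, 8, 5]]])

y = np.array([[[31, 32, 33, 34, 5],
               [35, 36, 37, 38, 5],
               [5, 6, 7, 8, 5]],
              [[41, 42, 43, 44, 5],
               [45, 46, 47, 48, 5],
               [5, 6, 7, 8, 5]]])

# Axis 0 holds the components of each point, axis 1 indexes the points
diff = x.reshape(2, -1) - y.reshape(2, -1)

# Each call reduces over axis 0, giving one distance per point pair
euclidean = np.linalg.norm(diff, axis=0)              # sqrt of sum of squares
manhattan = np.linalg.norm(diff, ord=1, axis=0)       # sum of absolute values
chebyshev = np.linalg.norm(diff, ord=np.inf, axis=0)  # largest absolute value

print(euclidean[:2])
print(manhattan[:2])
print(chebyshev[:2])
```

The same lines should work unchanged if the first axis holds three or more components per point, which is where np.hypot stops being usable.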