如何在Python/Numpy中向量化一个简单的for循环

Hyp*_*ube 6 python numpy vectorization scipy

我发现了几十个如何在Python/NumPy中对for循环进行矢量化的例子.不幸的是,我不知道如何使用矢量化形式减少简单for循环的计算时间.在这种情况下甚至可能吗?

time = np.zeros(185000)
lat1 = np.array(([48.78,47.45],[38.56,39.53],...)) # ~ 200000 rows
lat2 = np.array(([7.78,5.45],[7.56,5.53],...)) # same number of rows as time
for ii in np.arange(len(time)):
    pos = np.argwhere( (lat1[:,0]==lat2[ii,0]) and \
                       (lat1[:,1]==lat2[ii,1]) )
    if pos.size:
        pos = int(pos)
        time[ii] = dtime[pos]
Run Code Online (Sandbox Code Playgroud)

Ale*_*x I 3

找到所有匹配项的最快方法可能是对两个数组进行排序并一起遍历它们,就像这个工作示例一样:

import numpy as np

def is_less(a, b):
    # this ugliness is needed because we want to compare lexicographically same as np.lexsort(), from the last column backward
    for i in range(len(a)-1, -1, -1):
        if a[i]<b[i]: return True
        elif a[i]>b[i]: return False
    return False

def is_equal(a, b):
    for i in range(len(a)):
        if a[i] != b[i]: return False
    return True

# lat1 = np.array(([48.78,47.45],[38.56,39.53]))
# lat2 = np.array(([7.78,5.45],[48.78,47.45],[7.56,5.53]))
lat1 = np.load('arr.npy')
lat2 = np.load('refarr.npy')

idx1 = np.lexsort( lat1.transpose() )
idx2 = np.lexsort( lat2.transpose() )
ii = 0
jj = 0
while ii < len(idx1) and jj < len(idx2):
    a = lat1[ idx1[ii] , : ]
    b = lat2[ idx2[jj] , : ]
    if is_equal( a, b ):
        # do stuff with match
        print "match found: lat1=%s lat2=%s %d and %d" % ( repr(a), repr(b), idx1[ii], idx2[jj] )
        ii += 1
        jj += 1
    elif is_less( a, b ):
        ii += 1
    else:
        jj += 1
Run Code Online (Sandbox Code Playgroud)

这可能不完全是Python式的(也许有人可以想到使用生成器或itertools的更好的实现?),但很难想象有任何依赖于一次搜索一个点的方法在速度上击败它。