Aug*_*sta 6 python arrays numpy pattern-matching
我有一个大NumPy.array field_array而小的数组match_array,都由int值组成.使用以下示例,如何检查match_array形状的任何段是否field_array包含与其中的值完全对应的值match_array?
import numpy
raw_field = ( 24, 25, 26, 27, 28, 29, 30, 31, 23, \
33, 34, 35, 36, 37, 38, 39, 40, 32, \
-39, -38, -37, -36, -35, -34, -33, -32, -40, \
-30, -29, -28, -27, -26, -25, -24, -23, -31, \
-21, -20, -19, -18, -17, -16, -15, -14, -22, \
-12, -11, -10, -9, -8, -7, -6, -5, -13, \
-3, -2, -1, 0, 1, 2, 3, 4, -4, \
6, 7, 8, 4, 5, 6, 7, 13, 5, \
15, 16, 17, 8, 9, 10, 11, 22, 14)
field_array = numpy.array(raw_field, int).reshape(9,9)
match_array = numpy.arange(12).reshape(3,4)
Run Code Online (Sandbox Code Playgroud)
这些例子应该回归,True因为所描述的模式是match_array对齐的[6:9,3:7].
Div*_*kar 10
方法#1
这种方法从派生a solution到Implement Matlab's im2col 'sliding' in python这是旨在rearrange sliding blocks from a 2D array into columns.因此,为了解决我们的情况,那些滑动块field_array可以堆叠为列,并与列矢量版本进行比较match_array.
这是重新排列/堆叠功能的正式定义 -
def im2col(A,BLKSZ):
# Parameters
M,N = A.shape
col_extent = N - BLKSZ[1] + 1
row_extent = M - BLKSZ[0] + 1
# Get Starting block indices
start_idx = np.arange(BLKSZ[0])[:,None]*N + np.arange(BLKSZ[1])
# Get offsetted indices across the height and width of input array
offset_idx = np.arange(row_extent)[:,None]*N + np.arange(col_extent)
# Get all actual indices & index into input array for final output
return np.take (A,start_idx.ravel()[:,None] + offset_idx.ravel())
Run Code Online (Sandbox Code Playgroud)
为了解决我们的情况,这里的实现基于im2col-
# Get sliding blocks of shape same as match_array from field_array into columns
# Then, compare them with a column vector version of match array.
col_match = im2col(field_array,match_array.shape) == match_array.ravel()[:,None]
# Shape of output array that has field_array compared against a sliding match_array
out_shape = np.asarray(field_array.shape) - np.asarray(match_array.shape) + 1
# Now, see if all elements in a column are ONES and reshape to out_shape.
# Finally, find the position of TRUE indices
R,C = np.where(col_match.all(0).reshape(out_shape))
Run Code Online (Sandbox Code Playgroud)
问题中给定样本的输出将是 -
In [151]: R,C
Out[151]: (array([6]), array([3]))
Run Code Online (Sandbox Code Playgroud)
方法#2
鉴于opencv已经具有模板匹配功能,可以实现差异,您可以使用它并查找零差异,这将是您的匹配位置.所以,如果你有权访问cv2(opencv模块),那么实现看起来像这样 -
import cv2
from cv2 import matchTemplate as cv2m
M = cv2m(field_array.astype('uint8'),match_array.astype('uint8'),cv2.TM_SQDIFF)
R,C = np.where(M==0)
Run Code Online (Sandbox Code Playgroud)
给我们 -
In [204]: R,C
Out[204]: (array([6]), array([3]))
Run Code Online (Sandbox Code Playgroud)
本节比较了解决问题所建议的所有方法的运行时间.本节中列出的各种方法的功劳归于其贡献者.
方法定义 -
def seek_array(search_in, search_for, return_coords = False):
si_x, si_y = search_in.shape
sf_x, sf_y = search_for.shape
for y in xrange(si_y-sf_y+1):
for x in xrange(si_x-sf_x+1):
if numpy.array_equal(search_for, search_in[x:x+sf_x, y:y+sf_y]):
return (x,y) if return_coords else True
return None if return_coords else False
def skimage_based(field_array,match_array):
windows = view_as_windows(field_array, match_array.shape)
return (windows == match_array).all(axis=(2,3)).nonzero()
def im2col_based(field_array,match_array):
col_match = im2col(field_array,match_array.shape)==match_array.ravel()[:,None]
out_shape = np.asarray(field_array.shape) - np.asarray(match_array.shape) + 1
return np.where(col_match.all(0).reshape(out_shape))
def cv2_based(field_array,match_array):
M = cv2m(field_array.astype('uint8'),match_array.astype('uint8'),cv2.TM_SQDIFF)
return np.where(M==0)
Run Code Online (Sandbox Code Playgroud)
运行时测试 -
案例#1(问题的样本数据):
In [11]: field_array
Out[11]:
array([[ 24, 25, 26, 27, 28, 29, 30, 31, 23],
[ 33, 34, 35, 36, 37, 38, 39, 40, 32],
[-39, -38, -37, -36, -35, -34, -33, -32, -40],
[-30, -29, -28, -27, -26, -25, -24, -23, -31],
[-21, -20, -19, -18, -17, -16, -15, -14, -22],
[-12, -11, -10, -9, -8, -7, -6, -5, -13],
[ -3, -2, -1, 0, 1, 2, 3, 4, -4],
[ 6, 7, 8, 4, 5, 6, 7, 13, 5],
[ 15, 16, 17, 8, 9, 10, 11, 22, 14]])
In [12]: match_array
Out[12]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
In [13]: %timeit seek_array(field_array, match_array, return_coords = False)
1000 loops, best of 3: 465 µs per loop
In [14]: %timeit skimage_based(field_array,match_array)
10000 loops, best of 3: 97.9 µs per loop
In [15]: %timeit im2col_based(field_array,match_array)
10000 loops, best of 3: 74.3 µs per loop
In [16]: %timeit cv2_based(field_array,match_array)
10000 loops, best of 3: 30 µs per loop
Run Code Online (Sandbox Code Playgroud)
案例#2(更大的随机数据):
In [17]: field_array = np.random.randint(0,4,(256,256))
In [18]: match_array = field_array[100:116,100:116].copy()
In [19]: %timeit seek_array(field_array, match_array, return_coords = False)
1 loops, best of 3: 400 ms per loop
In [20]: %timeit skimage_based(field_array,match_array)
10 loops, best of 3: 54.3 ms per loop
In [21]: %timeit im2col_based(field_array,match_array)
10 loops, best of 3: 125 ms per loop
In [22]: %timeit cv2_based(field_array,match_array)
100 loops, best of 3: 4.08 ms per loop
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
2813 次 |
| 最近记录: |