Kam*_*biz 35 python optimization for-loop numpy vectorization
我很感激在寻找和理解pythonic方法的一些帮助,以优化嵌套for循环中的以下数组操作:
def _func(a, b, radius):
"Return 0 if a>b, otherwise return 1"
if distance.euclidean(a, b) < radius:
return 1
else:
return 0
def _make_mask(volume, roi, radius):
mask = numpy.zeros(volume.shape)
for x in range(volume.shape[0]):
for y in range(volume.shape[1]):
for z in range(volume.shape[2]):
mask[x, y, z] = _func((x, y, z), roi, radius)
return mask
Run Code Online (Sandbox Code Playgroud)
其中volume.shape(182,218,200)和roi.shape(3,)都是ndarray类型; 而且radius是一个int
Div*_*kar 67
方法#1
这是一个矢量化的方法 -
m,n,r = volume.shape
x,y,z = np.mgrid[0:m,0:n,0:r]
X = x - roi[0]
Y = y - roi[1]
Z = z - roi[2]
mask = X**2 + Y**2 + Z**2 < radius**2
Run Code Online (Sandbox Code Playgroud)
可能的改进:我们可以用numexpr模块加速最后一步-
import numexpr as ne
mask = ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
Run Code Online (Sandbox Code Playgroud)
方法#2
我们还可以逐渐构建与形状参数相对应的三个范围,并针对动态的三个元素执行减法,roi而无需实际创建网格,如前所述np.mgrid.broadcasting出于效率目的,这将有益于使用.实现看起来像这样 -
m,n,r = volume.shape
vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
mask = vals < radius**2
Run Code Online (Sandbox Code Playgroud)
简化版本:感谢@Bi Rico建议改进,因为我们可以用np.ogrid更简洁的方式执行这些操作,就像这样 -
m,n,r = volume.shape
x,y,z = np.ogrid[0:m,0:n,0:r]-roi
mask = (x**2+y**2+z**2) < radius**2
Run Code Online (Sandbox Code Playgroud)
运行时测试
功能定义 -
def vectorized_app1(volume, roi, radius):
m,n,r = volume.shape
x,y,z = np.mgrid[0:m,0:n,0:r]
X = x - roi[0]
Y = y - roi[1]
Z = z - roi[2]
return X**2 + Y**2 + Z**2 < radius**2
def vectorized_app1_improved(volume, roi, radius):
m,n,r = volume.shape
x,y,z = np.mgrid[0:m,0:n,0:r]
X = x - roi[0]
Y = y - roi[1]
Z = z - roi[2]
return ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
def vectorized_app2(volume, roi, radius):
m,n,r = volume.shape
vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
return vals < radius**2
def vectorized_app2_simplified(volume, roi, radius):
m,n,r = volume.shape
x,y,z = np.ogrid[0:m,0:n,0:r]-roi
return (x**2+y**2+z**2) < radius**2
Run Code Online (Sandbox Code Playgroud)
计时 -
In [106]: # Setup input arrays
...: volume = np.random.rand(90,110,100) # Half of original input sizes
...: roi = np.random.rand(3)
...: radius = 3.4
...:
In [107]: %timeit _make_mask(volume, roi, radius)
1 loops, best of 3: 41.4 s per loop
In [108]: %timeit vectorized_app1(volume, roi, radius)
10 loops, best of 3: 62.3 ms per loop
In [109]: %timeit vectorized_app1_improved(volume, roi, radius)
10 loops, best of 3: 47 ms per loop
In [110]: %timeit vectorized_app2(volume, roi, radius)
100 loops, best of 3: 4.26 ms per loop
In [139]: %timeit vectorized_app2_simplified(volume, roi, radius)
100 loops, best of 3: 4.36 ms per loop
Run Code Online (Sandbox Code Playgroud)
因此,一如既往地broadcasting展示其10,000x对原始代码的疯狂几乎加速的魔力,并且比10x通过使用即时广播操作创建网格更好!
假设您首先构建一个xyzy数组:
import itertools
xyz = [np.array(p) for p in itertools.product(range(volume.shape[0]), range(volume.shape[1]), range(volume.shape[2]))]
Run Code Online (Sandbox Code Playgroud)
现在,使用numpy.linalg.norm,
np.linalg.norm(xyz - roi, axis=1) < radius
Run Code Online (Sandbox Code Playgroud)
检查每个元组的距离roi是否小于半径.
最后,只reshape需要您需要的尺寸的结果.