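I run some computations that produce a list of numpy arrays, and then I want the maximum along the first axis. My current implementation (see below) is very slow, and I am looking for alternatives.
Original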
pending = [<list of items>]
matrix = [compute(item) for item in pending if <some condition on item>]
dominant = np.max(matrix, axis = 0)
Revision 1: this implementation is faster (~10x, presumably because numpy does not need to figure out the shape of the array)
pending = [<list of items>]
matrix = [compute(item) for item in pending if <some condition on item>]
matrix = np.vstack(matrix)
dominant = np.max(matrix, axis = 0)
I ran a few tests, and the slowdown seems to be caused by the internal conversion of the list of arrays into a numpy array:
Timer unit: 1e-06 s

Total time: 1.21389 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     4                                           def direct_max(list_of_arrays):
     5      1000      1213886   1213.9    100.0      np.max(list_of_arrays, axis = 0)

Total time: 1.20766 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     8                                           def numpy_max(list_of_arrays):
     9      1000      1151281   1151.3     95.3      list_of_arrays = np.array(list_of_arrays)
    10      1000        56384     56.4      4.7      np.max(list_of_arrays, axis = 0)

Total time: 0.15437 s
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    12                                           @profile
    13                                           def stack_max(list_of_arrays):
    14      1000       102205    102.2     66.2      list_of_arrays = np.vstack(list_of_arrays)
    15      1000        52165     52.2     33.8      np.max(list_of_arrays, axis = 0)
Is there a way to speed up the max function, or is it possible to fill a numpy array efficiently with the results of my computation so that max is fast?
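One way to read the second half of that question is to preallocate the 2-D array and write each result into its own row, so that no list-to-array conversion happens at all. The following is only a minimal sketch under assumptions: every result has the same length and a float dtype, and `compute`, `keep`, and `row_length` are hypothetical stand-ins for the real computation, the real condition, and the real result size.

```python
import numpy as np

def compute(item):
    # Hypothetical stand-in for the real computation: returns a 1-D float array.
    return np.sin(np.arange(5) * item)

def keep(item):
    # Hypothetical stand-in for "<some condition on item>".
    return item % 2 == 0

def filled_max(pending, row_length):
    # Filter first so the number of rows is known before allocating.
    selected = [item for item in pending if keep(item)]
    matrix = np.empty((len(selected), row_length), dtype=float)
    for row, item in enumerate(selected):
        matrix[row] = compute(item)  # written straight into the preallocated row
    # Note: max raises ValueError if no item passed the filter.
    return matrix.max(axis=0)

pending = list(range(10))
dominant = filled_max(pending, row_length=5)
```

Whether this beats np.vstack depends on the array sizes and on how expensive the computation is, so it is worth profiling on the real data.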
You can use reduce(np.maximum, matrix) (on Python 3, import reduce from functools). Here is a test:
from functools import reduce  # reduce is not a builtin on Python 3
import numpy as np
np.random.seed(0)
N, M = 1000, 1000
matrix = [np.random.rand(N) for _ in range(M)]
%timeit np.max(matrix, axis=0)
%timeit np.max(np.vstack(matrix), axis=0)
%timeit reduce(np.maximum, matrix)
Results:
10 loops, best of 3: 116 ms per loop
10 loops, best of 3: 10.6 ms per loop
100 loops, best of 3: 3.66 ms per loop
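The reduce call works because np.maximum is an element-wise binary ufunc, so folding it over the list compares the arrays pairwise without ever building the 2-D matrix. Below is a minimal Python 3 sketch that checks the folded result against the stacked one and shows an in-place variant using the ufunc's `out` argument. The sizes are smaller than in the benchmark and were picked here only for illustration, and `running_max` is a hypothetical helper name.

```python
from functools import reduce
import numpy as np

rng = np.random.default_rng(0)
matrix = [rng.random(1000) for _ in range(200)]

# Reference result: stack into a 2-D array, then reduce along axis 0.
stacked = np.max(np.vstack(matrix), axis=0)

# Pairwise fold: each step allocates one new 1-D array.
folded = reduce(np.maximum, matrix)

def running_max(arrays):
    # Same fold, but reusing a single buffer via out= to avoid
    # allocating a temporary array on every step.
    result = arrays[0].copy()
    for a in arrays[1:]:
        np.maximum(result, a, out=result)
    return result

in_place = running_max(matrix)
```

The in-place form also fits the case where the arrays are produced one at a time, since each result can be merged into the buffer as soon as it is computed.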
Edit
`argmax()` is harder, but you can use a for loop:
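def argmax_list(matrix):
    # Track the running maximum and the index of the array it came from.
    m = matrix[0].copy()
    idx = np.zeros(len(m), dtype=int)  # np.int was removed in NumPy 1.24
    for i, a in enumerate(matrix[1:], 1):
        mask = m < a  # positions where the new array is strictly larger
        m[mask] = a[mask]
        idx[mask] = i
    return idx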
It is still faster than argmax():
%timeit np.argmax(matrix, axis=0)
%timeit np.argmax(np.vstack(matrix), axis=0)
%timeit argmax_list(matrix)
Results:
10 loops, best of 3: 131 ms per loop
10 loops, best of 3: 21 ms per loop
100 loops, best of 3: 13.1 ms per loop
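Because the loop updates only where the new value is strictly larger, ties keep the earliest index, which is the same convention np.argmax follows. The sketch below checks this on made-up integer data chosen to force many ties. It restates the loop so that it runs on its own under Python 3, using the builtin `int` as dtype.

```python
import numpy as np

def argmax_list(matrix):
    m = matrix[0].copy()
    idx = np.zeros(len(m), dtype=int)
    for i, a in enumerate(matrix[1:], 1):
        mask = m < a          # strict comparison: ties keep the earlier index
        m[mask] = a[mask]
        idx[mask] = i
    return idx

rng = np.random.default_rng(0)
# Small value range so that the maximum appears in several arrays (ties).
matrix = [rng.integers(0, 5, size=50) for _ in range(20)]

expected = np.argmax(np.vstack(matrix), axis=0)
result = argmax_list(matrix)
```

This check does not cover inputs containing NaN, where the two approaches may behave differently.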