Fast element-wise summing of numpy arrays

lnm*_*rer 11 python optimization numpy

Suppose I want to compute the element-wise sum of a list of numpy arrays:

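from numpy import array; from numpy.random import rand  # assumed: preloaded in the original session
tosum = [rand(100,100) for n in range(10)]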

I have been looking for the best way to do this. It seems that numpy.sum performs terribly:

timeit.timeit('sum(array(tosum), axis=0)',
              setup='from numpy import sum; from __main__ import tosum, array',
              number=10000)
75.02289700508118
timeit.timeit('sum(tosum, axis=0)',
              setup='from numpy import sum; from __main__ import tosum',
              number=10000)
78.99106407165527

reduce is much faster (by nearly two orders of magnitude):

timeit.timeit('reduce(add,tosum)',
              setup='from numpy import add; from __main__ import tosum',
              number=10000)
1.131795883178711
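The timings above rely on reduce being a builtin, which holds only on Python 2; on Python 3 it has to be imported from functools. The following is a minimal sketch, not part of the original post, showing the two approaches side by side and checking that they produce the same result. The seeded generator and the names via_reduce and via_np_sum are illustrative assumptions.

```python
from functools import reduce  # builtin in Python 2, lives in functools on Python 3

import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
tosum = [rng.random((100, 100)) for _ in range(10)]

# reduce applies np.add pairwise: ((a0 + a1) + a2) + ...
via_reduce = reduce(np.add, tosum)

# np.sum first builds a (10, 100, 100) array from the list, then sums along axis 0
via_np_sum = np.sum(tosum, axis=0)

print(via_reduce.shape)                     # (100, 100)
print(np.allclose(via_reduce, via_np_sum))  # True
```

Both calls return a single (100, 100) array; they differ only in how they get there, which is what the timings measure.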

It even appears to hold a meaningful lead over the builtin (non-numpy) sum. Note that these timings are for 1e6 runs rather than the 1e4 runs used above:

timeit.timeit('reduce(add,tosum)',
              setup='from numpy import add; from __main__ import tosum',
              number=1000000)
109.98814797401428

timeit.timeit('sum(tosum)',
              setup='from __main__ import tosum',
              number=1000000)
125.52461504936218

Are there any other methods I should try? And can anyone explain the ranking?


Edit

numpy.sum is definitely faster if the list is converted to a numpy array first:

tosum2 = array(tosum)
timeit.timeit('sum(tosum2, axis=0)',
              setup='from numpy import sum; from __main__ import tosum2',
              number=10000)
1.1545608043670654

However, I only need to do the sum once, so converting the list to a numpy array would still incur a real performance penalty.
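One way to see where the time goes is to time the list-to-array conversion and the reduction separately. The sketch below is an illustration under assumptions rather than the original benchmark: it uses Python 3, a seeded generator, and an arbitrary repeat count n. Absolute numbers depend heavily on the numpy version and the machine.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
tosum = [rng.random((100, 100)) for _ in range(10)]
stacked = np.array(tosum)  # pre-converted array, shape (10, 100, 100)

n = 200  # repeat count, chosen arbitrarily for this sketch

# Cost of building the 3-D array from the list, with no summing
t_convert = timeit.timeit(lambda: np.array(tosum), number=n)
# Cost of summing an array that already exists
t_sum_only = timeit.timeit(lambda: stacked.sum(axis=0), number=n)
# Cost of doing both, which is what a one-off sum over a list has to pay
t_both = timeit.timeit(lambda: np.array(tosum).sum(axis=0), number=n)

print(f"convert only : {t_convert / n * 1e6:.1f} µs per call")
print(f"sum only     : {t_sum_only / n * 1e6:.1f} µs per call")
print(f"convert + sum: {t_both / n * 1e6:.1f} µs per call")
```

If the conversion dominates, a one-off sum gains nothing from converting first, which matches the concern above.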

War*_*ser 5

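The following is competitive with reduce, and is faster if the tosum list is long enough. However, it is not much faster, and it is more code. (reduce(add, tosum) sure is pretty.)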

def loop_inplace_sum(arrlist):
    # assumes len(arrlist) > 0
    total = arrlist[0].copy()  # copy so the first input array is not modified
    for a in arrlist[1:]:
        total += a             # in-place add into the same accumulator
    return total

Timing for the original tosum. Here reduce(add, tosum) is slightly faster:

In [128]: tosum = [rand(100,100) for n in range(10)]

In [129]: %timeit reduce(add, tosum)
10000 loops, best of 3: 73.5 µs per loop

In [130]: %timeit loop_inplace_sum(tosum)
10000 loops, best of 3: 78 µs per loop

Timing for a much longer list of arrays. Now loop_inplace_sum is faster:

In [131]: tosum = [rand(100,100) for n in range(500)]

In [132]: %timeit reduce(add, tosum)
100 loops, best of 3: 5.09 ms per loop

In [133]: %timeit loop_inplace_sum(tosum)
100 loops, best of 3: 4.4 ms per loop
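For readers without IPython, the comparison above can be approximated with the standard timeit module. This is a rough sketch under assumptions, not the original benchmark: it targets Python 3, and the compare helper, its repeat counts, and the seeded generator are all hypothetical choices. A plausible reason for the crossover is that reduce(add, ...) allocates a new result array at every step, while the in-place loop reuses a single accumulator; treat that as an explanation to verify rather than a measured fact.

```python
import timeit
from functools import reduce

import numpy as np


def loop_inplace_sum(arrlist):
    # assumes len(arrlist) > 0
    total = arrlist[0].copy()  # copy so the first input array is not modified
    for a in arrlist[1:]:
        total += a             # accumulate in place, no new array per step
    return total


def compare(n_arrays, number=20):
    """Return best per-call times (reduce, in-place loop) in seconds."""
    rng = np.random.default_rng(0)
    arrs = [rng.random((100, 100)) for _ in range(n_arrays)]
    t_reduce = min(timeit.repeat(lambda: reduce(np.add, arrs),
                                 number=number, repeat=3))
    t_loop = min(timeit.repeat(lambda: loop_inplace_sum(arrs),
                               number=number, repeat=3))
    return t_reduce / number, t_loop / number


for n_arrays in (10, 500):
    t_reduce, t_loop = compare(n_arrays)
    print(f"{n_arrays:4d} arrays: reduce {t_reduce * 1e6:9.1f} µs, "
          f"in-place loop {t_loop * 1e6:9.1f} µs")
```

Which variant wins, and at what list length, may differ from the numbers quoted above depending on the numpy version and hardware.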