Why is numpy.array so slow?

Ste*_*ini 19 python performance numpy

I am baffled by this:

def main():
    for i in xrange(2560000):
        a = [0.0, 0.0, 0.0]

main()

$ time python test.py

real     0m0.793s

Now let's look at numpy:

import numpy

def main():
    for i in xrange(2560000):
        a = numpy.array([0.0, 0.0, 0.0])

main()

$ time python test.py

real    0m39.338s

Holy CPU cycles, Batman!

Using numpy.zeros(3) improves things, but it is still not enough, IMHO:

$ time python test.py

real    0m5.610s
user    0m5.449s
sys 0m0.070s

numpy.version.version = '1.5.1'

If you are wondering whether the list creation is optimized away in the first example, it is not:

  5          19 LOAD_CONST               2 (0.0)
             22 LOAD_CONST               2 (0.0)
             25 LOAD_CONST               2 (0.0)
             28 BUILD_LIST               3
             31 STORE_FAST               1 (a)
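For anyone who wants to repeat that check, here is a minimal sketch using the standard dis module. It is written in Python 3 syntax (range instead of xrange), so the exact opcodes and offsets will differ from the Python 2 listing above. The variable name opnames is just for illustration.

```python
import dis

def main():
    for i in range(2560000):
        a = [0.0, 0.0, 0.0]

# Collect the opcode names the compiler emitted for main().
opnames = [ins.opname for ins in dis.get_instructions(main)]

# If BUILD_LIST is present, the list is still built on every iteration.
print('BUILD_LIST' in opnames)
```

Calling dis.dis(main) instead prints the full disassembly in the same layout as the listing above.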

Dun*_*nes 37

Numpy is optimised for large amounts of data. Give it a tiny 3-element array and, unsurprisingly, it performs poorly.

Consider a separate test:
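To illustrate the fixed per-call cost, here is a rough sketch comparing N separate 3-element arrays against a single (N, 3) array. It uses Python 3 syntax, unlike the Python 2 snippets in this thread. The names N, many_small and one_large are made up for illustration, and no timings are quoted because they depend on the machine and the numpy version.

```python
import timeit
import numpy

N = 10000  # number of 3-element vectors; an arbitrary size for illustration

def many_small():
    # One numpy.array call per vector: the per-call overhead is paid N times.
    return [numpy.array([0.0, 0.0, 0.0]) for _ in range(N)]

def one_large():
    # A single allocation that holds all N vectors as rows.
    return numpy.zeros((N, 3))

small_time = timeit.timeit(many_small, number=10)
large_time = timeit.timeit(one_large, number=10)
print('many small arrays:', small_time, 'seconds')
print('one (N, 3) array: ', large_time, 'seconds')
```

If the vectors can live in one array like this, the creation overhead is paid once rather than once per vector.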

import timeit

reps = 100

pythonTest = timeit.Timer('a = [0.] * 1000000')
numpyTest = timeit.Timer('a = numpy.zeros(1000000)', setup='import numpy')
uninitialised = timeit.Timer('a = numpy.empty(1000000)', setup='import numpy')
# numpy.empty only allocates the memory, so the initial contents of the
# array are arbitrary

print 'python list:', pythonTest.timeit(reps), 'seconds'
print 'numpy array:', numpyTest.timeit(reps), 'seconds'
print 'uninitialised array:', uninitialised.timeit(reps), 'seconds'

The output is

python list: 1.22042918205 seconds
numpy array: 1.05412316322 seconds
uninitialised array: 0.0016028881073 seconds

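It would seem that zeroing the array is what takes nearly all of numpy's time here. So unless you need the array to be initialised, try using numpy.empty.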
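A minimal sketch of that pattern, assuming every element is overwritten before it is read (Python 3 syntax; the function name squares is hypothetical):

```python
import numpy

def squares(n):
    x = numpy.arange(n, dtype=float)
    # Allocate the result without initialising it; the contents are arbitrary.
    out = numpy.empty(n)
    # Write x * x directly into out, so every element is overwritten.
    numpy.multiply(x, x, out=out)
    return out

print(squares(4))
```

Reading from an array made by numpy.empty before filling it yields whatever happened to be in memory, so this only makes sense when the whole array gets written first.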