Why is NumPy list access slower than vanilla Python?

Cap*_*man 5 python performance numpy

I was under the impression that NumPy would be faster for list operations, but the following example seems to show otherwise:

import numpy as np
import time

def ver1():
    a = [i for i in range(40)]
    b = [0 for i in range(40)]
    for i in range(1000000):
        for j in range(40):
            b[j]=a[j]

def ver2():
    a = np.array([i for i in range(40)])
    b = np.array([0 for i in range(40)])
    for i in range(1000000):
        for j in range(40):
            b[j]=a[j]

t0 = time.time()
ver1()
t1 = time.time()
ver2()
t2 = time.time()

print(t1-t0)
print(t2-t1)

The output is:

4.872278928756714
9.120521068572998

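(I'm running 64-bit Python 3.4.3 on Windows 7, on an i7 920.)

I understand that this isn't the fastest way to copy a list, but I'm trying to find out whether I'm using NumPy incorrectly. Or is NumPy simply slower for this kind of operation, and only more efficient for more complex operations?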

Edit:

I also tried the following, which just copies directly via b[:] = a, and NumPy is still about twice as slow:

import numpy as np
import time

def ver6():
    a = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    b = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
    for i in range(1000000):
        b[:] = a

def ver7():
    a = np.array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])
    b = np.array([0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0])
    for i in range(1000000):
        b[:] = a

t0 = time.time()
ver6()
t1 = time.time()
ver7()
t2 = time.time()

print(t1-t0)
print(t2-t1)

The output is:

0.36202096939086914
0.6750380992889404

use*_*ica 6

You're using NumPy wrong. NumPy's efficiency relies on doing as much work as possible in C-level loops rather than in interpreted code. When you do

for j in range(40):
    b[j]=a[j]

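that's an interpreted loop, with all the intrinsic interpreter overhead and more, because NumPy's indexing logic is far more complex than list indexing, and NumPy needs to create a new element wrapper object on every element retrieval. You get none of the benefits of NumPy when you write code like this.

You need to write the code so that the work happens in C: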
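To illustrate the wrapper-object point, here is a minimal sketch (the names `a_list`, `a_arr`, `x`, and `y` are made up for this example). It compares what indexing a list returns with what indexing an array returns:

```python
import numpy as np

a_list = list(range(40))
a_arr = np.arange(40)

x = a_list[3]  # hands back the int object already stored in the list
y = a_arr[3]   # builds a new NumPy scalar object around the raw value

print(type(x))  # the built-in int type
print(type(y))  # a NumPy integer scalar type (exact width depends on the platform)

# The array element is not a plain Python int, although it compares equal to one.
is_numpy_scalar = isinstance(y, np.integer)
print(is_numpy_scalar)
```

The list lookup only returns a reference to an existing object, while the array lookup has to allocate a scalar wrapper each time, which is part of the per-element cost in the interpreted loop.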

b[:] = a

This would also make the list operation more efficient, but it matters much more for NumPy.
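As a rough sketch of the difference, the snippet below performs the same copy both ways on 40-element arrays. Building the arrays with `np.arange` and `np.zeros` and the names `b_loop` and `b_slice` are choices made for this example, not part of the question's code:

```python
import numpy as np

a = np.arange(40)
b_loop = np.zeros(40, dtype=a.dtype)
b_slice = np.zeros(40, dtype=a.dtype)

# Interpreted version: 40 separate index reads and writes, each going
# through NumPy's indexing machinery.
for j in range(40):
    b_loop[j] = a[j]

# Vectorized version: one slice assignment, so the element loop runs in C.
b_slice[:] = a

print(np.array_equal(b_loop, b_slice))
```

Both versions leave the destination with the same contents; only the place where the loop runs differs.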

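  • @L3viathan: That doesn't help; in fact, it's just plain wrong. In practice, the arrays should be `np.arange(40)` and `numpy.zeros([40])`. (3 upvotes)
  • @CaptainCodeman: That's due to a combination of 3 factors: the input is quite small, there's very little allocation involved, and Python lists can also push the work into C. If you try bigger arrays, or you try mathematical operations (such as elementwise addition), NumPy arrays will be faster. (2 upvotes)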
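One way to check the last comment's claim on your own machine is a small `timeit` comparison of elementwise addition on larger inputs. This is only a sketch: the size `n`, the repeat count, and the helper names `add_lists` and `add_arrays` are assumptions picked for illustration, and the timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

n = 100_000  # assumed size, much larger than the 40 elements in the question

list_a = list(range(n))
list_b = list(range(n))
arr_a = np.arange(n)
arr_b = np.arange(n)

def add_lists():
    # Elementwise addition with plain lists: the loop runs in the interpreter.
    return [x + y for x, y in zip(list_a, list_b)]

def add_arrays():
    # Elementwise addition with NumPy: the loop runs in C.
    return arr_a + arr_b

t_list = timeit.timeit(add_lists, number=20)
t_arr = timeit.timeit(add_arrays, number=20)

print(f"list comprehension: {t_list:.4f} s")
print(f"NumPy addition:     {t_arr:.4f} s")
```

Using `timeit` rather than `time.time()` differences repeats each call several times, which tends to give steadier numbers for short operations.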