Why is numpy slower than Python? How to make the code perform better

use*_*836 7 python performance numpy

I rewrote my neural network from pure Python to numpy, but now it runs even slower. So I tried these two functions:

import numpy as np

def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c

def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c

timeit d = 1.77135205057

timeit e = 17.2464673758

Numpy is about 10 times slower. Why is that, and how do I use numpy properly?

mgi*_*son 14

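I would assume the discrepancy is because you're constructing both lists and arrays in e, whereas you're only constructing lists in d. Consider: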

import numpy as np

def d():
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    c = [i*j for i,j in zip(a,b)]
    return c

def e():
    a = np.array([1,2,3,4,5])
    b = np.array([10,20,30,40,50])
    c = a*b
    return c

#Warning:  Functions with mutable default arguments are below.
# This code is only for testing and would be bad practice in production!
def f(a=[1,2,3,4,5],b=[10,20,30,40,50]):
    c = [i*j for i,j in zip(a,b)]
    return c

def g(a=np.array([1,2,3,4,5]),b=np.array([10,20,30,40,50])):
    c = a*b
    return c


import timeit
# print(...) with a single argument works on both Python 2 and Python 3
print(timeit.timeit('d()','from __main__ import d'))
print(timeit.timeit('e()','from __main__ import e'))
print(timeit.timeit('f()','from __main__ import f'))
print(timeit.timeit('g()','from __main__ import g'))

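Here the functions f and g avoid recreating the lists/arrays on every call, and we get very similar performance (results for d, e, f and g, in that order):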

1.53083586693
15.8963699341
1.33564996719
1.69556999207
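To check how much of e's time goes into building the arrays rather than multiplying them, the two steps can be timed separately. This is only a minimal sketch, assuming Python 3; the helper names `build` and `multiply` are hypothetical and not part of the original answer, and no timings are shown because they vary by machine.

```python
import timeit

import numpy as np


# Hypothetical helpers for illustration; not from the original answer.
def build():
    # Only the list -> ndarray conversion that e() repeats on every call.
    a = np.array([1, 2, 3, 4, 5])
    b = np.array([10, 20, 30, 40, 50])
    return a, b


# Build the arrays once, outside the timed code.
A, B = build()


def multiply():
    # Only the elementwise product, on the prebuilt arrays.
    return A * B


if __name__ == "__main__":
    n = 100000
    print("build   :", timeit.timeit(build, number=n))
    print("multiply:", timeit.timeit(multiply, number=n))
```

If `build` dominates, that supports the explanation above: for tiny inputs the cost is in creating the arrays, not in the arithmetic.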

Note that the list comprehension + zip still wins here. However, if we make the arrays sufficiently large, numpy wins hands down:

t1 = [1,2,3,4,5] * 100
t2 = [10,20,30,40,50] * 100
t3 = np.array(t1)
t4 = np.array(t2)
print(timeit.timeit('f(t1,t2)','from __main__ import f,t1,t2',number=10000))
print(timeit.timeit('g(t3,t4)','from __main__ import g,t3,t4',number=10000))

My results are:

0.602419137955
0.0263929367065
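Where the crossover happens depends on the machine and the numpy version, so it is worth measuring rather than guessing. Here is a rough sketch for probing it, assuming Python 3; the names `list_mul`, `array_mul` and `compare`, and the sizes tried, are illustrative choices rather than anything from the measurements above.

```python
import timeit

import numpy as np


def list_mul(a, b):
    # Pure-Python elementwise product, same approach as f().
    return [i * j for i, j in zip(a, b)]


def array_mul(a, b):
    # numpy elementwise product, same approach as g().
    return a * b


def compare(size, number=1000):
    # Build the inputs once so that only the multiplication is timed.
    xs = list(range(size))
    ys = list(range(size))
    xa = np.array(xs)
    ya = np.array(ys)
    t_list = timeit.timeit(lambda: list_mul(xs, ys), number=number)
    t_array = timeit.timeit(lambda: array_mul(xa, ya), number=number)
    return t_list, t_array


if __name__ == "__main__":
    for size in (5, 50, 500, 5000):
        t_list, t_array = compare(size)
        print(size, t_list, t_array)
```

Each line of output gives the input size followed by the list time and the array time, which makes it easy to spot the size at which numpy starts to pay off.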