对于大型数组而言比列表慢得多?

thk*_*ang 0 python numpy

检查我的以下代码; 它是在python中实现的sigma_2函数(使用原始筛选)的一部分,它是除数函数之一http://mathworld.wolfram.com/DivisorFunction.html

from time import time
from itertools import count
import numpy

def sig2(N, nump=False):
    init = time()


    #initialize array with value=1 since every positive integer is divisible by 1
    if nump:
        print 'using numpy'
        nums = numpy.ones((N,), dtype=numpy.int64)
    else:        
        nums = [1 for i in xrange(1, N)]

    #for each number n < N, add n*n to n's multiples
    for n in xrange(2, N):
        nn = n*n
        for i in count(1):
            if n*i >= N: break
            nums[n*i-1] += nn

    print 'sig2(n) done - {} ms'.format((time()-init)*1000)
Run Code Online (Sandbox Code Playgroud)

我尝试了不同的价值观和numpy,这是非常令人失望的.

2000年:

sig2(n) done - 4.85897064209 ms
took : 33.7610244751 ms
using numpy
sig2(n) done - 31.5930843353 ms
took : 55.6900501251 ms
Run Code Online (Sandbox Code Playgroud)

为200000:

sig2(n) done - 1113.80600929 ms
took : 1272.8869915 ms
using numpy
sig2(n) done - 4469.48194504 ms
took : 4705.97100258 ms
Run Code Online (Sandbox Code Playgroud)

它继续,我的代码不是真正的可扩展 - 因为它不是O(n),但有这两个,除了这两个结果使用numpy导致性能问题.不应该比python列表和dicts更快?这是我对numpy的印象.

bte*_*tel 5

As @unutbu said numpy really shines when you use vectorised operations. Here is an optimised implementation using numpy (it is consistent with the definition of divisor function from Mathworld):

import numpy as np

def sig2_numpy(N):

    x = np.arange(1,N+1)
    x[(N % x) != 0] = 0
    return np.sum(x**2)
Run Code Online (Sandbox Code Playgroud)

When you call it, it is much faster:

>> import time
>> init = time.time()
>> print sig2_numpy(20000)
>> print "It took", (time.time()-init)*1000., 'ms'
It took 0.916957855225 ms
Run Code Online (Sandbox Code Playgroud)

  • 哦顺便说一下,`np.sum(x**2)== np.dot(x,x)`.后者可能更快,因为它不会产生中间的`x**2`数组. (2认同)