Cython sum vs. mean memory jump

Ric*_*ham 20 python numpy cython

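I have been experimenting with Cython and ran into the following peculiar case: a sum function over an array takes about 3 times as long as computing the mean of the same array.

Here are my three functions: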

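# Assumed preamble (not shown in the original post): FLOAT_t is taken to be
# float64 so that it matches the double[:] memoryviews used below.
cimport numpy as cnp
ctypedef cnp.float64_t FLOAT_t

cpdef FLOAT_t cython_sum(cnp.ndarray[FLOAT_t, ndim=1] A):
    # Sum every element of A through a typed memoryview.
    cdef double [:] x = A
    cdef double sum = 0
    cdef unsigned int N = A.shape[0]
    for i in xrange(N):
        sum += x[i]
    return sum

cpdef FLOAT_t cython_avg(cnp.ndarray[FLOAT_t, ndim=1] A):
    # Same loop as cython_sum, followed by a single division by N.
    cdef double [:] x = A
    cdef double sum = 0
    cdef unsigned int N = A.shape[0]
    for i in xrange(N):
        sum += x[i]
    return sum/N


cpdef FLOAT_t cython_silly_avg(cnp.ndarray[FLOAT_t, ndim=1] A):
    # Recover the sum from the mean: one extra multiplication by N.
    cdef unsigned int N = A.shape[0]
    return cython_avg(A)*N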

Here are the run times in IPython:

In [7]: A = np.random.random(1000000)


In [8]: %timeit np.sum(A)   
1000 loops, best of 3: 906 us per loop

In [9]: %timeit np.mean(A)
1000 loops, best of 3: 919 us per loop

In [10]: %timeit cython_avg(A)
1000 loops, best of 3: 896 us per loop

In [11]: %timeit cython_sum(A)
100 loops, best of 3: 2.72 ms per loop

In [12]: %timeit cython_silly_avg(A)
1000 loops, best of 3: 862 us per loop

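I cannot account for the memory jump in the naive cython_sum. Is it caused by some memory allocation? The values are random numbers between 0 and 1, so the sum is around 500K.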
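As a quick sanity check on the "around 500K" figure, here is a minimal sketch in plain Python. It uses the standard-library `random` module instead of `np.random.random`, and the seed is an arbitrary choice for repeatability; neither comes from the original post.

```python
import random

N = 1_000_000
rng = random.Random(0)  # fixed seed, chosen only for repeatability

# Each draw is uniform on [0, 1), so its expected value is 0.5
# and the expected sum of N draws is 0.5 * N.
total = sum(rng.random() for _ in range(N))
expected = 0.5 * N

print(f"sum = {total:.1f}, expected about {expected:.0f}")
```

The sum is an ordinary double of modest size, so its magnitude alone gives no obvious reason for `cython_sum` to be slower than `cython_avg`.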

Since line_profiler does not work with Cython, I was unable to profile my code.

rod*_*gob 1

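It seems that @nbren12's results are the definitive answer: these timings cannot be reproduced.

The evidence (and logic) shows that both methods have the same run time.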
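To try reproducing the comparison yourself, the sketch below times a sum loop against a mean loop in plain Python with the standard-library `timeit` module. The names `py_sum`, `py_avg` and `best_time` are hypothetical stand-ins for the compiled Cython functions, and the array size is reduced to keep the run short. Absolute times will be much slower than the compiled versions; only the ratio between the two is of interest.

```python
import random
import timeit


def py_sum(values):
    """Plain-Python analogue of cython_sum: accumulate every element."""
    total = 0.0
    for v in values:
        total += v
    return total


def py_avg(values):
    """Plain-Python analogue of cython_avg: same loop plus one division."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def best_time(func, data, repeat=5, number=3):
    """Return the best per-call time in seconds, similar in spirit to %timeit."""
    timer = timeit.Timer(lambda: func(data))
    return min(timer.repeat(repeat=repeat, number=number)) / number


rng = random.Random(0)  # arbitrary seed for repeatability
data = [rng.random() for _ in range(100_000)]

t_sum = best_time(py_sum, data)
t_avg = best_time(py_avg, data)
print(f"py_sum: {t_sum * 1e3:.3f} ms per call")
print(f"py_avg: {t_avg * 1e3:.3f} ms per call")
print(f"ratio sum/avg: {t_sum / t_avg:.2f}")
```

Because the two functions differ by a single division outside the loop, the ratio would be expected to stay close to 1. A ratio near 3, as reported in the question, would more likely point at the measurement setup than at the loop itself.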