cru*_*rky 16 python numpy cython
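I'm running some performance tests on a variant of the prime number generator from http://docs.cython.org/src/tutorial/numpy.html. The timings below are for kmax = 1000.
Pure Python implementation, running in CPython: 0.15s
Pure Python implementation, running in Cython: 0.07s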
    def primes(kmax):
        p = []
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p.append(n)
                k = k + 1
            n = n + 1
        return p
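The question does not say how the timings were taken. A minimal sketch of one way to reproduce the CPython figure with the standard `timeit` module, assuming the best of a few runs is the number of interest (`best_time` is a hypothetical helper, not part of the question):

```python
import timeit


def primes(kmax):
    """Return the first kmax primes (same algorithm as the question)."""
    p = []
    k = 0
    n = 2
    while k < kmax:
        i = 0
        # Trial division of n by every prime found so far.
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:  # no earlier prime divides n, so n is prime
            p.append(n)
            k = k + 1
        n = n + 1
    return p


def best_time(func, kmax=1000, repeat=3):
    """Hypothetical helper: best wall-clock time in seconds for one func(kmax) call."""
    return min(timeit.repeat(lambda: func(kmax), number=1, repeat=repeat))


if __name__ == "__main__":
    print(f"primes(1000): {best_time(primes):.4f}s")
```

Absolute numbers depend on the machine and interpreter version, so only the ratios between variants are comparable with the figures quoted here.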
Pure Python + NumPy implementation, running in CPython: 1.25s
    import numpy

    def primes(kmax):
        p = numpy.empty(kmax, dtype=int)
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p[k] = n
                k = k + 1
            n = n + 1
        return p
Cython implementation using int*: 0.003s
    from libc.stdlib cimport malloc, free

    def primes(int kmax):
        cdef int n, k, i
        cdef int *p = <int *>malloc(kmax * sizeof(int))
        result = []
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p[k] = n
                k = k + 1
                result.append(n)
            n = n + 1
        free(p)
        return result
The above performs great but looks horrible, as it holds two copies of the data... so I tried reimplementing it:
Cython + NumPy: 1.01s
    import numpy as np
    cimport numpy as np
    cimport cython

    DTYPE = np.int_  # np.int was removed in NumPy 1.24
    ctypedef np.int_t DTYPE_t

    @cython.boundscheck(False)
    def primes(DTYPE_t kmax):
        cdef DTYPE_t n, k, i
        cdef np.ndarray p = np.empty(kmax, dtype=DTYPE)
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p[k] = n
                k = k + 1
            n = n + 1
        return p
Questions:
How do I cast a NumPy array to an int*? The following doesn't work:
    cdef numpy.nparray a = numpy.zeros(100, dtype=int)
    cdef int * p = <int *>a.data

    cdef DTYPE_t [:] p_view = p
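Use this instead of p in the calculations. It reduced the runtime from 580 ms to 2.8 ms, which is about the same as the implementation using int*. That is about the most you can expect.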
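    import numpy as np
    cimport numpy as np
    cimport cython

    DTYPE = np.int_  # np.int was removed in NumPy 1.24
    ctypedef np.int_t DTYPE_t

    @cython.boundscheck(False)
    def primes(DTYPE_t kmax):
        cdef DTYPE_t n, k, i
        cdef np.ndarray p = np.empty(kmax, dtype=DTYPE)
        # Typed memoryview onto p's buffer: element access compiles to C indexing.
        cdef DTYPE_t [:] p_view = p
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p_view[i] != 0:
                i = i + 1
            if i == k:
                p_view[k] = n
                k = k + 1
            n = n + 1
        return p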
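Cython typed memoryviews cannot be run in plain Python, but the built-in `memoryview` illustrates the same idea by analogy: the view reads and writes the original buffer, so there is no second copy of the data. This is a sketch only, assuming `array.array("i", ...)` as a stand-in for the NumPy array; `primes_view` is a hypothetical name, and the snippet says nothing about Cython's speed.

```python
from array import array


def primes_view(kmax):
    """First kmax primes, written through a memoryview onto one buffer."""
    p = array("i", [0] * kmax)  # one contiguous buffer of C ints
    p_view = memoryview(p)      # a view onto that buffer, not a copy
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p_view[i] != 0:
            i = i + 1
        if i == k:
            p_view[k] = n       # the write lands directly in p
            k = k + 1
        n = n + 1
    return p                    # p already holds everything written via p_view


if __name__ == "__main__":
    print(list(primes_view(10)))
```

Returning `p` rather than the view mirrors the Cython version above, which also hands back the array that owns the data.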
Why are NumPy arrays so much slower than Python lists when running on CPython?
Because you didn't fully type it. Use
    cdef np.ndarray[DTYPE_t, ndim=1] p = np.empty(kmax, dtype=DTYPE)
How do I cast a NumPy array to an int*?
By using np.intc as the dtype, not np.int (which is a C long). That is,
    cdef np.ndarray[dtype=int, ndim=1] p = np.empty(kmax, dtype=np.intc)
(But really, use a memoryview. They are much cleaner, and in the long run the Cython folks want to get rid of the NumPy array syntax.)
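The np.intc advice comes down to element size: an int* only makes sense over a buffer whose elements are C ints. A small sketch using the standard `ctypes` module shows the difference on the current platform; `read_longs_as_ints` is a hypothetical helper written for this illustration, and the result is platform dependent (C int and C long have the same size on some systems, such as 64-bit Windows).

```python
import ctypes

INT_BYTES = ctypes.sizeof(ctypes.c_int)    # element size an int* assumes
LONG_BYTES = ctypes.sizeof(ctypes.c_long)  # element size of a C long buffer


def read_longs_as_ints(values):
    """Store values as C longs, then read the same memory through an int*."""
    buf = (ctypes.c_long * len(values))(*values)
    as_int = ctypes.cast(buf, ctypes.POINTER(ctypes.c_int))
    return [as_int[i] for i in range(len(values))]


if __name__ == "__main__":
    print(f"C int: {INT_BYTES} bytes, C long: {LONG_BYTES} bytes")
    # Matches the input only when the two sizes agree.
    print(read_longs_as_ints([2, 3]))
```

Where the sizes differ, the int* walks the buffer in the wrong stride, which is why the array has to be created with a dtype that matches C int.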
The best syntax I have found so far:
    import numpy
    cimport numpy
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def primes(int kmax):
        cdef int n, k, i
        cdef numpy.ndarray[int] p = numpy.empty(kmax, dtype=numpy.int32)
        k = 0
        n = 2
        while k < kmax:
            i = 0
            while i < k and n % p[i] != 0:
                i = i + 1
            if i == k:
                p[k] = n
                k = k + 1
            n = n + 1
        return p
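Note where I used numpy.int32 instead of int. Anything on the left side of a cdef is a C type (so int is typically int32 and float is float32), while anything on its right side (or outside of a cdef) is a Python type (as NumPy dtypes, int is int64 on most 64-bit platforms and float is float64).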
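The float half of that rule can be checked without Cython. The sketch below, which assumes only the standard `struct` module, round-trips a Python float through a C float and a C double; `via_c_float` and `via_c_double` are names chosen for this illustration.

```python
import struct

x = 0.1  # a Python float

# Pack into the C type and unpack again to see what survives the trip.
via_c_float = struct.unpack("f", struct.pack("f", x))[0]
via_c_double = struct.unpack("d", struct.pack("d", x))[0]

if __name__ == "__main__":
    print(f"C float:  {struct.calcsize('f')} bytes, round trip exact: {via_c_float == x}")
    print(f"C double: {struct.calcsize('d')} bytes, round trip exact: {via_c_double == x}")
```

Only the double round trip preserves the value, which is consistent with a Python float corresponding to float64 while a C float declared in a cdef corresponds to float32.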