Efficient conversion of a Python array to a numpy array

Sim*_*got 23 python numpy

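I get a large array (a 12 Mpix image) in the array format from the Python standard library. Since I want to perform operations on these arrays, I would like to convert it to a numpy array. I tried the following: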

import numpy
import array
from datetime import datetime
test = array.array('d', [0]*12000000)
t = datetime.now()
numpy.array(test)
print(datetime.now() - t)

I get a result between one and two seconds: equivalent to a loop in Python.

Is there a more efficient way to do this conversion?

eum*_*iro 50

np.array(test)                                       # 1.19s

np.fromiter(test, dtype=int)                         # 1.08s

np.frombuffer(test)                                  # 459ns !!!

  • Don't forget to set the `dtype` for `frombuffer`. (3 upvotes)
  • dang, I didn't know about frombuffer! Thanks! (2 upvotes)
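
To illustrate the comment about `dtype`: `np.frombuffer` defaults to `float64`, so for an `array.array` of another type it silently reinterprets the raw bytes. The following is a minimal sketch, assuming the array's typecode is also a valid NumPy dtype character with the same meaning (this holds for `'d'` and `'i'`; it is not guaranteed for every typecode). The variable names are illustrative only.

```python
import array
import numpy as np

ints = array.array('i', [1, 2, 3, 4])

# No dtype given: the raw bytes are reinterpreted as float64,
# so both the length and the values are wrong.
wrong = np.frombuffer(ints)

# Matching dtype: same length, same values, still no copy.
right = np.frombuffer(ints, dtype=ints.typecode)

print(len(wrong), len(right))
print(right.tolist())
```

With `'d'` arrays, as in the question, the default happens to match, which is why the benchmark above works without an explicit `dtype`.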

Eri*_*ric 5

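`asarray(x)` is almost always the best choice for any array-like object.

`array` and `fromiter` are slow because they perform a copy. Using `asarray` allows this copy to be elided: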

>>> import array
>>> import numpy as np
>>> test = array.array('d', [0]*12000000)
# very slow - this makes multiple copies that grow each time
>>> %timeit np.fromiter(test, dtype=test.typecode)
626 ms ± 3.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# fast memory copy
>>> %timeit np.array(test)
63.5 ms ± 639 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# which is equivalent to doing the fast construction followed by a copy
>>> %timeit np.asarray(test).copy()
63.4 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# so doing just the construction is way faster
>>> %timeit np.asarray(test)
1.73 µs ± 70.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# marginally faster, but at the expense of verbosity and type safety if you
# get the wrong type
>>> %timeit np.frombuffer(test, dtype=test.typecode)
1.07 µs ± 27.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

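
The speed of `asarray` comes with a consequence worth seeing: because no copy is made, the NumPy array and the original `array.array` share the same memory. A small sketch of the difference, using a short array instead of the 12-million-element one (names are illustrative):

```python
import array
import numpy as np

test = array.array('d', [0.0] * 5)

view = np.asarray(test)  # wraps the existing buffer, no copy
copy = np.array(test)    # allocates new memory and copies the data

# Modify the original array after both conversions.
test[0] = 42.0

print(view[0])  # reflects the change, since the memory is shared
print(copy[0])  # unaffected, since it is an independent copy
```

If the original buffer may be modified or reused later, `np.array(test)` (or `np.asarray(test).copy()`) is the safer choice, at the cost of the copy measured above.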