NumPy performance differs depending on the numerical values

Question

I noticed a strange performance difference when evaluating an expression in NumPy.

I ran the following code:

import numpy as np
myarr = np.random.uniform(-1,1,[1100,1100])

Then

%timeit np.exp( - 0.5 * (myarr / 0.001)**2 )
>> 184 ms ± 301 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

and

%timeit np.exp( - 0.5 * (myarr / 0.1)**2 )
>> 12.3 ms ± 34.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In the second case the computation is almost 15 times faster! Note that the only difference is the factor: 0.1 versus 0.001.
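
Since the factor is the only difference, it can help to look at what values actually reach `np.exp` in each case. The sketch below is illustrative only: the helper name `underflow_stats` and the fixed seed are assumptions, not part of the original code. It counts how many results underflow to exactly zero or land in the subnormal range. It only shows how the inputs differ; whether that accounts for the timing gap depends on the math library the NumPy build uses.

```python
import numpy as np

# Fixed seed is an assumption, used only to make the counts reproducible.
rng = np.random.default_rng(0)
myarr = rng.uniform(-1, 1, [1100, 1100])

# Smallest positive *normal* float64; anything in (0, tiny) is subnormal.
tiny = np.finfo(np.float64).tiny


def underflow_stats(arr, div):
    """Return (smallest exp argument, count of exact zeros, count of subnormals)."""
    arg = -0.5 * (arr / div) ** 2
    with np.errstate(under="ignore"):
        res = np.exp(arg)
    n_zero = int(np.count_nonzero(res == 0.0))
    n_subnormal = int(np.count_nonzero((res > 0.0) & (res < tiny)))
    return float(arg.min()), n_zero, n_subnormal


for div in (0.1, 0.001):
    min_arg, n_zero, n_subnormal = underflow_stats(myarr, div)
    print(f"div={div}: min exp argument={min_arg:.1f}, "
          f"zeros={n_zero}, subnormals={n_subnormal}")
```

With a divisor of 0.1 the arguments stay above roughly -50, so every result is an ordinary float64. With 0.001 the arguments go down to about -500000, far below the point where `exp` underflows.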

What is the reason for this behaviour? Can I change something to make the first calculation as fast as the second?

Answer 1

max*_*111 1

Using Intel SVML

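I don't have a numexpr installation with working Intel SVML, but numexpr with working SVML should perform as well as Numba. The Numba benchmarks show much the same behaviour as NumPy when SVML is disabled, but perform far better with SVML.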

Code

import numpy as np
import numba as nb

myarr = np.random.uniform(-1,1,[1100,1100])

@nb.njit(error_model="numpy",parallel=True)
def func(arr,div):
  # operate on the arr argument rather than the global myarr
  return np.exp( - 0.5 * (arr / div)**2 )

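To compare Numba with and without SVML, the `NUMBA_DISABLE_INTEL_SVML` environment variable has to be in place before `numba` is imported. A minimal sketch, assuming the variable is set from inside the script rather than with `set` in the shell as in the timings:

```python
import os

# Assumption: this runs at the very top of the script, before numba is
# imported anywhere, because Numba decides about SVML at import time.
os.environ["NUMBA_DISABLE_INTEL_SVML"] = "1"

# Only import numba after the variable is set, e.g.:
# import numba as nb
print("NUMBA_DISABLE_INTEL_SVML =", os.environ["NUMBA_DISABLE_INTEL_SVML"])
```

Setting it in the shell before starting Python has the same effect and avoids any import-order concerns.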

Timings

#Core i7 4771
#Windows 7 x64
#Anaconda Python 3.5.5
#Numba 0.41 (compilation overhead excluded)
func(myarr,0.1)                      -> 3.6ms
func(myarr,0.001)                    -> 3.8ms

#Numba (set NUMBA_DISABLE_INTEL_SVML=1), parallel=True
func(myarr,0.1)                      -> 5.19ms
func(myarr,0.001)                    -> 12.0ms

#Numba (set NUMBA_DISABLE_INTEL_SVML=1), parallel=False
func(myarr,0.1)                      -> 16.7ms
func(myarr,0.001)                    -> 63.2ms

#Numpy (1.13.3), set OMP_NUM_THREADS=4
np.exp( - 0.5 * (myarr / 0.001)**2 ) -> 70.82ms
np.exp( - 0.5 * (myarr / 0.1)**2 )   -> 12.58ms

#Numpy (1.13.3), set OMP_NUM_THREADS=1
np.exp( - 0.5 * (myarr / 0.001)**2 ) -> 189.4ms
np.exp( - 0.5 * (myarr / 0.1)**2 )   -> 17.4ms

#Numexpr (2.6.8), no SVML, parallel
ne.evaluate("exp( - 0.5 * (myarr / 0.001)**2 )") -> 17.2ms
ne.evaluate("exp( - 0.5 * (myarr / 0.1)**2 )")   -> 4.38ms

#Numexpr (2.6.8), no SVML, single threaded
ne.evaluate("exp( - 0.5 * (myarr / 0.001)**2 )") -> 50.85ms
ne.evaluate("exp( - 0.5 * (myarr / 0.1)**2 )")   -> 13.9ms

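The timings above exclude compilation overhead. One way to get comparable numbers is to make an untimed warm-up call before measuring. The sketch below is only an assumption about how this could be done, not the harness actually used for the figures: `best_time_ms` and `numpy_version` are hypothetical names, and the NumPy expression stands in for any of the variants (a Numba-compiled `func` could be passed the same way).

```python
import time

import numpy as np


def best_time_ms(fn, *args, repeats=5):
    """Return the best wall-clock time of fn(*args) in milliseconds."""
    # Untimed warm-up call; for a JIT-compiled function this is where
    # compilation happens, so it is kept out of the measurement.
    fn(*args)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best * 1e3


def numpy_version(arr, div):
    # Same expression as in the question, with the divisor as a parameter.
    return np.exp(-0.5 * (arr / div) ** 2)


myarr = np.random.uniform(-1, 1, [1100, 1100])
for div in (0.1, 0.001):
    print(f"div={div}: {best_time_ms(numpy_version, myarr, div):.2f} ms")
```

Absolute numbers will differ from those listed above, since they depend on the CPU, the NumPy build, and the thread settings.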
