Which is faster, and why: np.sum(arr) vs arr.sum()?

kus*_*hah 0 python numpy sum time-complexity space-complexity

Which method is faster? Aren't they the same?

import time
import numpy as np

# Method 1: the np.sum function
start = time.time()
arr = np.array([1,2,3,4,5,6,7,8,9,0,12])
total_price = np.sum(arr[arr < 7]) * 2.14

print(total_price)
print('Duration: {} seconds'.format(time.time() - start))

# Method 2: the ndarray.sum method
start = time.time()
arr = np.array([1,2,3,4,5,6,7,8,9,0,12])
total_price = (arr[arr < 7]).sum() * 2.14

print(total_price)
print('Duration: {} seconds'.format(time.time() - start))

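When I run the code again and again, both give a different execution time on every run. Sometimes the former method is faster, and sometimes the latter.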

hpa*_*ulj 5

With the huge docstring removed, the code for np.sum is:

@array_function_dispatch(_sum_dispatcher)
def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue,
        initial=np._NoValue, where=np._NoValue):

    if isinstance(a, _gentype):
        # 2018-02-25, 1.15.0
        warnings.warn(
            "Calling np.sum(generator) is deprecated, and in the future will give a different result. "
            "Use np.sum(np.fromiter(generator)) or the python sum builtin instead.",
            DeprecationWarning, stacklevel=3)

        res = _sum_(a)
        if out is not None:
            out[...] = res
            return out
        return res

    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
                          initial=initial, where=where)

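array_function_dispatch handles the __array_function__ overrides that non-NumPy types may provide, while _wrapreduction is responsible for making sure np._NoValue isn't passed to the underlying implementation, and for deciding whether to call the sum method (for non-array input) or add.reduce (for array input).

So it does a series of checks to handle non-array input before eventually passing the task on to np.add.reduce if the input is an array.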
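
To make the dispatch decision concrete, here is a minimal sketch of the two branches described above. `Basket` is a hypothetical class invented for this illustration (it is not part of NumPy), and the sketch assumes the `_wrapreduction` behaviour quoted in this answer, which may differ in other NumPy versions.

```python
import numpy as np


class Basket:
    """Hypothetical non-array type with its own sum method (illustration only)."""

    def __init__(self, items):
        self.items = list(items)
        self.sum_called = False

    def sum(self, axis=None, out=None, **kwargs):
        # np.sum passes axis and out as keywords when it delegates here.
        self.sum_called = True
        return sum(self.items)  # the builtin sum, not this method


basket = Basket([1, 2, 3])
from_basket = np.sum(basket)    # not an ndarray, has .sum -> Basket.sum is called
from_list = np.sum([1, 2, 3])   # a list has no .sum -> falls through to np.add.reduce

print(basket.sum_called, from_basket, from_list)
```

None of this branching is needed when the method is called directly on an ndarray.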

Meanwhile, np.ndarray.sum is this:

static PyObject *
array_sum(PyArrayObject *self, PyObject *args, PyObject *kwds)
{
    NPY_FORWARD_NDARRAY_METHOD("_sum");
}

where NPY_FORWARD_NDARRAY_METHOD is a macro that forwards the operation to numpy.core._methods._sum:

def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
         initial=_NoValue, where=True):
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)

and umr_sum is an alias for np.add.reduce.

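Both code paths eventually end up at np.add.reduce, but the ndarray.sum code path doesn't involve all the pre-check work for non-array input, because the array already knows it's an array.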
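
As a quick sanity check of that claim, the following sketch applies all three spellings to the filtered array from the question and confirms they agree. The variable names are chosen for this illustration only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 12])
selected = arr[arr < 7]               # elements 1, 2, 3, 4, 5, 6, 0

via_function = np.sum(selected)       # function: pre-checks, then np.add.reduce
via_method = selected.sum()           # method: forwarded to _sum -> np.add.reduce
via_ufunc = np.add.reduce(selected)   # the reduction both paths end at

print(via_function, via_method, via_ufunc)  # all three are 21
```

Only the overhead before the reduction differs; the result does not.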

In these tests the calculation time itself is small enough that the extensive pre-checks make a big difference:

In [607]: timeit np.sum(np.arange(1000))
15.4 µs ± 42.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [608]: timeit np.arange(1000).sum()
12.2 µs ± 29.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [609]: timeit np.add.reduce(np.arange(1000))
9.19 µs ± 17.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
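
Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. A single time.time() difference around one call, as in the question, is dominated by noise, which is probably why the ordering flips between runs. The repeat and loop counts below are arbitrary choices for this sketch, and the absolute numbers will vary by machine and NumPy version.

```python
import timeit

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 12])

candidates = {
    "np.sum(arr[arr < 7])": lambda: np.sum(arr[arr < 7]),
    "arr[arr < 7].sum()": lambda: arr[arr < 7].sum(),
    "np.add.reduce(arr[arr < 7])": lambda: np.add.reduce(arr[arr < 7]),
}

loops = 10_000
results = {}
for label, func in candidates.items():
    # Five runs of `loops` calls each; keep the fastest run to reduce noise.
    best = min(timeit.repeat(func, number=loops, repeat=5))
    results[label] = best / loops  # seconds per call
    print(f"{label:30s} {results[label] * 1e6:.2f} µs per call")
```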

numpy has many function/method pairs like this. Use whichever is most convenient, and looks prettiest in your code!
