在python中的巨大矩阵的点积的行和

Aya*_*lam 14 python numpy matrix-multiplication dot-product

我有2矩阵100kx200和200x100k如果它们是小矩阵我会使用numpy dot产品

sum(a.dot(b), axis = 0)
Run Code Online (Sandbox Code Playgroud)

但矩阵太大了,我也不能使用循环是否有一个聪明的方法来做到这一点?

ken*_*ytm 18

可能的优化是

>>> numpy.sum(a @ b, axis=0)
array([  1.83633615,  18.71643672,  15.26981078, -46.33670382,  13.30276476])
>>> numpy.sum(a, axis=0) @ b
array([  1.83633615,  18.71643672,  15.26981078, -46.33670382,  13.30276476])
Run Code Online (Sandbox Code Playgroud)

计算a @ b需要10k×200×10k的运算,而首先对行进行求和会将乘法运算减少到1×200×10k,从而提高了10k×.

这主要是由于认识到

   numpy.sum(x, axis=0) == [1, 1, ..., 1] @ x
=> numpy.sum(a @ b, axis=0) == [1, 1, ..., 1] @ (a @ b)
                            == ([1, 1, ..., 1] @ a) @ b
                            == numpy.sum(a, axis=0) @ b
Run Code Online (Sandbox Code Playgroud)

与其他轴类似.

>>> numpy.sum(a @ b, axis=1)
array([  2.8794171 ,   9.12128399,  14.52009991,  -8.70177811, -15.0303783 ])
>>> a @ numpy.sum(b, axis=1)
array([  2.8794171 ,   9.12128399,  14.52009991,  -8.70177811, -15.0303783 ])
Run Code Online (Sandbox Code Playgroud)

(注:x @ y相当于x.dot(y)二维矩阵和1D载体上的Python 3.5+与numpy的1.10.0+)


$ INITIALIZATION='import numpy;numpy.random.seed(0);a=numpy.random.randn(1000,200);b=numpy.random.rand(200,1000)'

$ python3 -m timeit -s "$INITIALIZATION" 'numpy.einsum("ij,jk->k", a, b)'
10 loops, best of 3: 87.2 msec per loop

$ python3 -m timeit -s "$INITIALIZATION" 'numpy.sum(a@b, axis=0)'
100 loops, best of 3: 12.8 msec per loop

$ python3 -m timeit -s "$INITIALIZATION" 'numpy.sum(a, axis=0)@b'
1000 loops, best of 3: 300 usec per loop
Run Code Online (Sandbox Code Playgroud)

插图:

In [235]: a = np.random.rand(3,3)
array([[ 0.465,  0.758,  0.641],
       [ 0.897,  0.673,  0.742],
       [ 0.763,  0.274,  0.485]])

In [237]: b = np.random.rand(3,2)
array([[ 0.303,  0.378],
       [ 0.039,  0.095],
       [ 0.192,  0.668]])
Run Code Online (Sandbox Code Playgroud)

现在,如果我们这样做a @ b,我们需要18个乘法运算和6个加法运算.另一方面,如果我们这样做,np.sum(a, axis=0) @ b我们只需要6次乘法运算和2次加法运算.由于我们有3行,因此提高了3倍a.至于OP的情况,这应该比简单的a @ b计算提高10k倍,因为他有10k行a.

  • 关于你的注释只是一个小评论:`@`运算符不完全是python 3的特性,因为它只在[3.5发布]中引入(https://docs.python.org/3/whatsnew/3.5.html#whatsnew -pep-465). (3认同)

Div*_*kar 5

有两种sum-reductions情况发生 - 一种来自marix-multilication np.dot,然后是显式的sum.

我们可以np.einsum一次性使用这两种方法 - 就像这样 -

np.einsum('ij,jk->k',a,b)
Run Code Online (Sandbox Code Playgroud)

样品运行 -

In [27]: a = np.random.rand(3,4)

In [28]: b = np.random.rand(4,3)

In [29]: np.sum(a.dot(b), axis = 0)
Out[29]: array([ 2.70084316,  3.07448582,  3.28690401])

In [30]: np.einsum('ij,jk->k',a,b)
Out[30]: array([ 2.70084316,  3.07448582,  3.28690401])
Run Code Online (Sandbox Code Playgroud)

运行时测试 -

In [45]: a = np.random.rand(1000,200)

In [46]: b = np.random.rand(200,1000)

In [47]: %timeit np.sum(a.dot(b), axis = 0)
100 loops, best of 3: 5.5 ms per loop

In [48]: %timeit np.einsum('ij,jk->k',a,b)
10 loops, best of 3: 71.8 ms per loop
Run Code Online (Sandbox Code Playgroud)

可悲的是,看起来我们没有做得更好np.einsum.


要更改为np.sum(a.dot(b), axis = 1),只需在那里交换输出字符串表示法 - np.einsum('ij,jk->i',a,b)就像这样 -

In [42]: np.sum(a.dot(b), axis = 1)
Out[42]: array([ 3.97805141,  3.2249661 ,  1.85921549])

In [43]: np.einsum('ij,jk->i',a,b)
Out[43]: array([ 3.97805141,  3.2249661 ,  1.85921549])
Run Code Online (Sandbox Code Playgroud)