如何使用外积之和对近似矩阵乘法进行向量化？

Question

如何使用外积之和对近似矩阵乘法进行向量化？

Ati*_*Ali 7 python numpy matrix-multiplication

假设我有两个矩阵A和B，并且我想C = AB使用外积之和进行计算。我写了这个函数来实现这一点，但我想知道是否可以消除 for 循环并将其矢量化，

import numpy as np

def mul_opx(A, B, pd):
    # Approx. matrix multiplication using outer product
    n, m = A.shape
    p = B.shape[1]
    C = np.zeros((n,p), dtype=A.dtype)
    dum = np.zeros_like(C)
    for t in range(m):
        dum = np.outer(A[:,t],B[t,:]) / pd[t]
        C = C + dum
    C = C / m
    return C

d = 1000
A = np.arange(d**2).reshape((d,d))
B = np.arange(d**2).reshape((d,d))

# Full Matrix Multiplication
C = A @ B

# Approximate Matrix Multiplication
# choosing half random vectors/rows from A/B
k = np.random.choice(d, int(d/2))
Ap = A[:,k]
Bp = B[k,:]

# Unifrom probability vector
pd_uniform = np.full(d,1/d)

# Approximate product
C_hat = mul_opx(Ap,Bp, pd_uniform[k])

Run Code Online (Sandbox Code Playgroud)

当矩阵维度非常大（例如 10^6 x 10^6）时，此类产品非常有用

Answer 1

Lea*_*ess 6

正如其他人提到的，这可能是einsum的一个很好的用例。用该语言编写您的操作可以通过

np.einsum( 'ij,ik->jk',A,B)

Run Code Online (Sandbox Code Playgroud)

总和的重复i索引，j k外积的不重复索引。与 @Tomer 提出的答案相比，快速基准测试似乎显示出 2 倍的加速。当然，这取决于输入大小，我让您看看它如何推广到 10^6 范围内的线性大小，einsum 的内存占用也应该更好。

Answer 2

Tom*_*eva 4

您可以尝试使用np.multiply.outer(A,B).

假设您有以下数据：

a = np.array([[4, 2],
              [2, 2]])

b = np.array([[4, 3],
              [4, 0]])

Run Code Online (Sandbox Code Playgroud)

您想要执行以下两个乘法：

np.outer(a[:,0],b[0,:]) = array([[16, 12],
                                 [ 8,  6]])
np.outer(a[:,1],b[1,:]) = array([[8, 0],
                                 [8, 0]])

Run Code Online (Sandbox Code Playgroud)

这可以使用该方法来完成mp.multiply.outer。“将np.multiply.outerufunc 运算应用于所有对 (a, b)，其中 a 在 A 中，b 在 B 中。” （请参阅此处的说明）。A该函数执行和之间所有可能的外积B，在这个简单的示例中会产生一个(2,2,2,2)形状矩阵。显然，你不需要所有可能的外积，你只需要从这个矩阵中提取你想要的。你可以看到：

np.multiply.outer(a,b)[0,:,0,:] = array([[16, 12],
                                         [ 8,  6]])
np.multiply.outer(a,b)[1,:,1,:] = array([[8, 0],
                                         [8, 0]])

Run Code Online (Sandbox Code Playgroud)

使用此方法，您不需要执行 for 循环，但会执行冗余计算。但是，该numpy包已经过优化，也许这会更快（对于非常大的情况A，B您可以尝试使用 jit 装饰器来加快速度

另一种不进行冗余计算的方法是np.newaxis在乘法之前使用来扩展矩阵。

使用与上面相同的a，b执行以下操作：

a[:,:,np.newaxis] = array([[[4],
                            [2]],

                           [[2],
                            [2]]])
b[:,np.newaxis,:] = array([[[4, 3]],

                           [[4, 0]]])

Run Code Online (Sandbox Code Playgroud)

现在你可以简单地进行乘法来接收：

a[:,:,np.newaxis]*b[:,np.newaxis,:] = array([[[16, 12],
                                              [ 8,  6]],

                                             [[ 8,  0],
                                              [ 8,  0]]])

Run Code Online (Sandbox Code Playgroud)

这是没有冗余计算的外积的精确结果np.multiply.outer。剩下的就是对第一维求和，如下所示：

results = np.sum(a[:,:,np.newaxis]*b[:,np.newaxis,:], axis=0)

Run Code Online (Sandbox Code Playgroud)

继续第二个示例，扩展示例以将每个外积除以不同的数字可以按如下方式完成：

假设向量 pd 由数字组成（因为有两个外部产品），现在可以简单地使用以下命令完成所需的更改：

pd = np.array([[[6]],[[8]]])  # shape is (2,1,1) because there are two outer products
solution = np.sum((a[:,:,np.newaxis] * b[:,np.newaxis,:]) / pd, axis=0)

Run Code Online (Sandbox Code Playgroud)

另一种方法是将设置pd为 (1,2) 形状的数组，并在乘法之前除以 a：

pd = np.array([[6,8]])  # shape (1,2) beacuse there are two outer products
solution = np.sum((a / pd)[:,:,np.newaxis]* b[:,np.newaxis,:], axis=0)

Run Code Online (Sandbox Code Playgroud)

关于另一个解决方案中提出的爱因斯坦求和，您可以采用：

pd = np.array([[6,8]])  # shape (1,2) beacuse there are two outer products
solution = np.einsum( 'ij,ik->jk',a/pd,b)

Run Code Online (Sandbox Code Playgroud)

归档时间：	3 年，6 月前
查看次数：	661 次
最近记录：	3 年，6 月前