了解NumPy的einsum

Question

了解NumPy的einsum

Lan*_*ait 160 python arrays numpy multidimensional-array numpy-einsum

我很难理解究竟是如何einsum运作的.我看过文档和一些例子,但它似乎并不坚持.

这是我们在课堂上看到的一个例子:

C = np.einsum("ij,jk->ki", A, B)

Run Code Online (Sandbox Code Playgroud)

对于两个数组A和B

我想这会采取A^T * B,但我不确定(它正在将其中一个的转置正确吗？).任何人都可以告诉我这里发生了什么(通常在使用时einsum)？

Answer 1

Ale*_*ley 304

(注:这个答案是基于短的博客文章关于einsum我前一段时间写的.)

怎么`einsum`办？

想象一下,我们有两个多维数组,A和B.现在让我们假设我们想......

乘 A用B一种特殊的方式来创造新的产品阵列; 然后也许吧
将这个新阵列与特定轴相加 ; 然后也许吧
以特定顺序转置新阵列的轴.

有一个很好的机会einsum可以帮助我们更快,更有效地记忆NumPy功能的组合multiply,sum并且transpose允许.

`einsum`工作怎么样？

这是一个简单(但不是完全无关紧要)的例子.采用以下两个数组:

A = np.array([0, 1, 2])

B = np.array([[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11]])

Run Code Online (Sandbox Code Playgroud)

我们将乘A和B元素方面,然后沿新阵列的行总结.在"正常"的NumPy中,我们写道:

>>> (A[:, np.newaxis] * B).sum(axis=1)
array([ 0, 22, 76])

Run Code Online (Sandbox Code Playgroud)

所以这里,索引操作在A两个数组的第一轴上排成一行,以便可以广播乘法.然后将产品数组的行相加以返回答案.

现在,如果我们想要使用einsum,我们可以写:

>>> np.einsum('i,ij->i', A, B)
array([ 0, 22, 76])

Run Code Online (Sandbox Code Playgroud)

该签名字符串'i,ij->i'是这里的关键,需要解释的一点.你可以把它分成两半.在左侧(左侧->),我们标记了两个输入数组.在右边->,我们已经标记了我们想要最终得到的数组.

接下来会发生什么:

A有一个轴; 我们贴了标签i.并B有两个轴; 我们将轴0标记为i和轴1 标记为j.
通过在两个输入数组中重复标签i,我们告诉einsum这两个轴应该相乘.换句话说,我们将数组A与每列数组相乘B,就像A[:, np.newaxis] * B那样.
请注意,j在我们所需的输出中没有显示为标签; 我们刚刚使用过i(我们希望最终得到一维数组).通过省略标签,我们告诉einsum来总结沿着这条轴线.换句话说,我们正在总结产品的行,就像.sum(axis=1)那样.

这基本上就是您需要知道的全部内容einsum.它有助于玩一点; 如果我们在输出中留下两个标签'i,ij->ij',我们就会得到一个2D数组的产品(与之相同A[:, np.newaxis] * B).如果我们说没有输出标签,'i,ij->我们会得到一个数字(与行为相同(A[:, np.newaxis] * B).sum()).

einsum然而,最重要的是,它不是首先构建一系列临时产品; 它只是对产品进行总结.这可以大大节省内存使用量.

一个稍大的例子

为了解释点积,这里有两个新的数组:

A = array([[1, 1, 1],
           [2, 2, 2],
           [5, 5, 5]])

B = array([[0, 1, 0],
           [1, 1, 0],
           [1, 1, 1]])

Run Code Online (Sandbox Code Playgroud)

我们将使用计算点积np.einsum('ij,jk->ik', A, B).这是一张图片,显示了我们从函数中得到的A和B和输出数组的标记:

您可以看到标签j重复 - 这意味着我们将行A与列相乘B.此外,标签j不包含在输出中 - 我们正在总结这些产品.标签i并k保留输出,因此我们返回2D阵列.

这一结果与其中标签阵列比较可能是更加明显j的不求和.在左下方,您可以看到写作产生的3D数组np.einsum('ij,jk->ijk', A, B)(即我们保留了标签j):

求和轴j给出了预期的点积,如右图所示.

一些练习

为了获得更多的感觉einsum,使用下标符号实现熟悉的NumPy数组操作会很有用.任何涉及乘法和求和轴组合的东西都可以使用 einsum.

设A和B为两个长度相同的1D阵列.例如,A = np.arange(10)和B = np.arange(5, 15).

总和A可以写成:
```
np.einsum('i->', A)
```
Run Code Online (Sandbox Code Playgroud)
元素乘法A * B,可以写成:
```
np.einsum('i,i->i', A, B)
```
Run Code Online (Sandbox Code Playgroud)
内部产品或点产品,np.inner(A, B)或者np.dot(A, B),可以写成:
```
np.einsum('i,i->', A, B) # or just use 'i,i'
```
Run Code Online (Sandbox Code Playgroud)
外部产品np.outer(A, B)可以写成:
```
np.einsum('i,j->ij', A, B)
```
Run Code Online (Sandbox Code Playgroud)

对于2D阵列,C并且D,如果轴是兼容的长度(两个长度相同或长度为1的一个),这里有几个例子:

C(主对角线的总和)的轨迹np.trace(C)可以写成:
```
np.einsum('ii', C)
```
Run Code Online (Sandbox Code Playgroud)
的元素方式乘法C和转置D,C * D.T可以写成:
```
np.einsum('ij,ji->ij', C, D)
```
Run Code Online (Sandbox Code Playgroud)
将C数组的每个元素相乘D(以生成4D数组)C[:, :, None, None] * D,可以写成:
```
np.einsum('ij,kl->ijkl', C, D)  
```
Run Code Online (Sandbox Code Playgroud)

很好的解释，谢谢。“请注意，我没有在我们想要的输出中显示为标签”——不是吗？ (2认同)
非常好的答案。还值得注意的是，“ij,jk”可以单独工作（没有箭头）来形成矩阵乘法。但为了清楚起见，最好先放置箭头，然后放置输出尺寸。它在博客文章中。 (2认同)
@Peaceful：这是很难选择正确的词的场合之一！我觉得“列”在这里更合适，因为“A”的长度为 3，与“B”中列的长度相同（而“B”的行长度为 4，不能按元素乘以“ A`）。 (2认同)
请注意，省略 `->` 会影响语义：“在隐式模式下，所选的下标很重要，因为输出的轴按字母顺序重新排序。这意味着 `np.einsum('ij', a)` 不会影响 2D 数组，而 `np.einsum('ji', a)` 则采用其转置。” (2认同)

Answer 2

kma*_*o23 33

numpy.einsum()如果你直观地理解它,就很容易理解这个想法.作为示例,让我们从涉及矩阵乘法的简单描述开始.

要使用numpy.einsum(),您必须将所谓的下标字符串作为参数传递,然后输入您的输入数组.

假设您有两个2D数组,A并且B,您想要进行矩阵乘法.所以你也是:

np.einsum("ij, jk -> ik", A, B)

Run Code Online (Sandbox Code Playgroud)

在这里,下标串 ij对应于阵列A而下标串 jk对应于阵列B.此外,最重要的是要注意每个下标字符串中的字符数必须与数组的尺寸相匹配.(即2D阵列的两个字符,3D数组的三个字符,等等.)如果重复下标字符串之间的字符(在我们的例子中),那么这意味着您希望总和沿着这些维度发生.因此,他们将减少总和.(即那个维度将消失) jein

该标串在这之后->,将是我们得到的阵列.如果将其保留为空,则将对所有内容求和,并返回标量值作为结果.否则,结果数组将根据下标字符串具有维度.在我们的例子中,它将是ik.这是直观的,因为我们知道对于矩阵乘法,数组中的列数A必须与数组中的行数相匹配,B这就是这里发生的事情(即我们通过重复下标字符串中的char j来编码这些知识)

这里有一些例子说明了np.einsum()在实现一些常见张量或nd阵列操作中的用法.

输入

# a vector
In [197]: vec
Out[197]: array([0, 1, 2, 3])

# an array
In [198]: A
Out[198]: 
array([[11, 12, 13, 14],
       [21, 22, 23, 24],
       [31, 32, 33, 34],
       [41, 42, 43, 44]])

# another array
In [199]: B
Out[199]: 
array([[1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3],
       [4, 4, 4, 4]])

Run Code Online (Sandbox Code Playgroud)

1)矩阵乘法(类似于np.matmul(arr1, arr2))

In [200]: np.einsum("ij, jk -> ik", A, B)
Out[200]: 
array([[130, 130, 130, 130],
       [230, 230, 230, 230],
       [330, 330, 330, 330],
       [430, 430, 430, 430]])

Run Code Online (Sandbox Code Playgroud)

2)沿主对角线提取元素(类似于np.diag(arr))

In [202]: np.einsum("ii -> i", A)
Out[202]: array([11, 22, 33, 44])

Run Code Online (Sandbox Code Playgroud)

3)Hadamard产品(即两个阵列的元素产品)(类似arr1 * arr2)

In [203]: np.einsum("ij, ij -> ij", A, B)
Out[203]: 
array([[ 11,  12,  13,  14],
       [ 42,  44,  46,  48],
       [ 93,  96,  99, 102],
       [164, 168, 172, 176]])

Run Code Online (Sandbox Code Playgroud)

4)元素平方(类似于np.square(arr)或arr ** 2)

In [210]: np.einsum("ij, ij -> ij", B, B)
Out[210]: 
array([[ 1,  1,  1,  1],
       [ 4,  4,  4,  4],
       [ 9,  9,  9,  9],
       [16, 16, 16, 16]])

Run Code Online (Sandbox Code Playgroud)

5)跟踪(即主对角元素的总和)(类似于np.trace(arr))

In [217]: np.einsum("ii -> ", A)
Out[217]: 110

Run Code Online (Sandbox Code Playgroud)

6)矩阵转置(类似于np.transpose(arr))

In [221]: np.einsum("ij -> ji", A)
Out[221]: 
array([[11, 21, 31, 41],
       [12, 22, 32, 42],
       [13, 23, 33, 43],
       [14, 24, 34, 44]])

Run Code Online (Sandbox Code Playgroud)

7)外部产品(矢量)(类似np.outer(vec1, vec2))

In [255]: np.einsum("i, j -> ij", vec, vec)
Out[255]: 
array([[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 4, 6],
       [0, 3, 6, 9]])

Run Code Online (Sandbox Code Playgroud)

8)内积(向量)(类似np.inner(vec1, vec2))

In [256]: np.einsum("i, i -> ", vec, vec)
Out[256]: 14

Run Code Online (Sandbox Code Playgroud)

9)沿轴0的总和(类似于np.sum(arr, axis=0))

In [260]: np.einsum("ij -> j", B)
Out[260]: array([10, 10, 10, 10])

Run Code Online (Sandbox Code Playgroud)

10)沿轴1的总和(类似于np.sum(arr, axis=1))

In [261]: np.einsum("ij -> i", B)
Out[261]: array([ 4,  8, 12, 16])

Run Code Online (Sandbox Code Playgroud)

11)批量矩阵乘法

In [287]: BM = np.stack((A, B), axis=0)

In [288]: BM
Out[288]: 
array([[[11, 12, 13, 14],
        [21, 22, 23, 24],
        [31, 32, 33, 34],
        [41, 42, 43, 44]],

       [[ 1,  1,  1,  1],
        [ 2,  2,  2,  2],
        [ 3,  3,  3,  3],
        [ 4,  4,  4,  4]]])

In [289]: BM.shape
Out[289]: (2, 4, 4)

# batch matrix multiply using einsum
In [292]: BMM = np.einsum("bij, bjk -> bik", BM, BM)

In [293]: BMM
Out[293]: 
array([[[1350, 1400, 1450, 1500],
        [2390, 2480, 2570, 2660],
        [3430, 3560, 3690, 3820],
        [4470, 4640, 4810, 4980]],

       [[  10,   10,   10,   10],
        [  20,   20,   20,   20],
        [  30,   30,   30,   30],
        [  40,   40,   40,   40]]])

In [294]: BMM.shape
Out[294]: (2, 4, 4)

Run Code Online (Sandbox Code Playgroud)

12)沿轴2的总和(类似于np.sum(arr, axis=2))

In [330]: np.einsum("ijk -> ij", BM)
Out[330]: 
array([[ 50,  90, 130, 170],
       [  4,   8,  12,  16]])

Run Code Online (Sandbox Code Playgroud)

13)求和数组中的所有元素(类似于np.sum(arr))

In [335]: np.einsum("ijk -> ", BM)
Out[335]: 480

Run Code Online (Sandbox Code Playgroud)

14)多个轴的和(即边缘化)
(类似于np.sum(arr, axis=(axis0, axis1, axis2, axis3, axis4, axis6, axis7)))

# 8D array
In [354]: R = np.random.standard_normal((3,5,4,6,8,2,7,9))

# marginalize out axis 5 (i.e. "n" here)
In [363]: esum = np.einsum("ijklmnop -> n", R)

# marginalize out axis 5 (i.e. sum over rest of the axes)
In [364]: nsum = np.sum(R, axis=(0,1,2,3,4,6,7))

In [365]: np.allclose(esum, nsum)
Out[365]: True

Run Code Online (Sandbox Code Playgroud)

15)双点产品(类似于np.sum(hadamard-product) cf. 3)

In [772]: A
Out[772]: 
array([[1, 2, 3],
       [4, 2, 2],
       [2, 3, 4]])

In [773]: B
Out[773]: 
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

In [774]: np.einsum("ij, ij -> ", A, B)
Out[774]: 124

Run Code Online (Sandbox Code Playgroud)

16)2D和3D阵列乘法

当求解要验证结果的线性方程组(Ax = b)时,这种乘法可能非常有用.

# inputs
In [115]: A = np.random.rand(3,3)
In [116]: b = np.random.rand(3, 4, 5)

# solve for x
In [117]: x = np.linalg.solve(A, b.reshape(b.shape[0], -1)).reshape(b.shape)

# 2D and 3D array multiplication :)
In [118]: Ax = np.einsum('ij, jkl', A, x)

# indeed the same!
In [119]: np.allclose(Ax, b)
Out[119]: True

Run Code Online (Sandbox Code Playgroud)

相反,如果必须使用np.matmul()这个验证,我们必须做几个reshapes来达到这个目的:

# reshape 3D array `x` to 2D, perform matmul
# then reshape the resultant array to 3D
In [123]: Ax_matmul = np.matmul(A, x.reshape(x.shape[0], -1)).reshape(x.shape)

# indeed correct!
In [124]: np.allclose(Ax, Ax_matmul)
Out[124]: True

Run Code Online (Sandbox Code Playgroud)

额外奖励:在这里阅读更多数学:爱因斯坦求和,绝对是:Tensor-Notation

这是答案的瑰宝。 (3认同)

Answer 3

Iva*_*van 13

另一种观点`np.einsum`

这里的大多数答案都通过示例进行解释，我想我应该给出额外的观点。

您可以将其视为einsum广义矩阵求和运算符。给定的字符串包含下标，它们是代表轴的标签。我喜欢将其视为您的操作定义。下标提供了两个明显的约束：

每个输入数组的轴数，
输入之间的轴大小相等。

让我们看最初的例子：np.einsum('ij,jk->ki', A, B)。这里约束1.转换为 A.ndim == 2和B.ndim == 2，2.转换为A.shape[1] == B.shape[0]。

正如您稍后将看到的，还有其他限制。例如：

输出下标中的标签不得出现多次。
输出下标中的标签必须出现在输入下标中。

当查看时ij,jk->ki，您可以将其视为：

输入数组中的哪些分量将构成[k, i]输出数组的分量。

下标包含输出数组的每个组件的操作的准确定义。

我们将坚持使用操作ij,jk->ki以及A和的以下定义B：

>>> A = np.array([[1,4,1,7], [8,1,2,2], [7,4,3,4]])
>>> A.shape
(3, 4)

>>> B = np.array([[2,5], [0,1], [5,7], [9,2]])
>>> B.shape
(4, 2)

Run Code Online (Sandbox Code Playgroud)

输出的Z形状为，(B.shape[1], A.shape[0])并且可以通过以下方式简单地构造。从一个空白数组开始Z：

Z = np.zeros((B.shape[1], A.shape[0]))
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        for k range(B.shape[0]):
           Z[k, i] += A[i, j]*B[j, k] # ki <- ij*jk

Run Code Online (Sandbox Code Playgroud)

np.einsum是关于在输出数组中累积贡献。每一(A[i,j], B[j,k])对都对每个Z[k, i]组件都有贡献。

您可能已经注意到，它看起来与计算一般矩阵乘法的方式非常相似......

最少的实施

这是 Python 中的一个最小实现np.einsum。这应该有助于了解幕后到底发生了什么。

随着我们的继续，我将继续参考前面的例子。定义inputs为[A, B].

np.einsum实际上可以接受两个以上的输入。下面，我们将重点关注一般情况：n 个输入和n 个输入下标。主要目标是找到迭代域，即我们所有范围的笛卡尔积。

我们不能依赖手动编写for循环，因为我们不知道会有多少个循环。主要思想是这样的：我们需要找到所有唯一的标签（我将使用key并keys引用它们），找到相应的数组形状，然后为每个标签创建范围，并使用计算范围的乘积itertools.product来获取学习。

指数	`keys`	限制条件	`sizes`	`ranges`
1	`'i'`	`A.shape[0]`	3	`range(0, 3)`
2	`'j'`	`A.shape[1] == B.shape[0]`	4	`range(0, 4)`
0	`'k'`	`B.shape[1]`	2	`range(0, 2)`

研究领域是笛卡尔积：range(0, 2) x range(0, 3) x range(0, 4)。

下标处理：

>>> expr = 'ij,jk->ki'
>>> qry_expr, res_expr = expr.split('->')
>>> inputs_expr = qry_expr.split(',')
>>> inputs_expr, res_expr
(['ij', 'jk'], 'ki')

Run Code Online (Sandbox Code Playgroud)

查找输入下标中的唯一键（标签）：
```
>>> keys = set([(key, size) for keys, input in zip(inputs_expr, inputs) 
               for key, size in list(zip(keys, input.shape))])
{('i', 3), ('j', 4), ('k', 2)}
```
Run Code Online (Sandbox Code Playgroud)
我们应该检查约束（以及输出下标）！使用set是一个坏主意，但它可以用于本示例的目的。
获取关联的大小（用于初始化输出数组）并构造范围（用于创建迭代域）：
```
>>> sizes = dict(keys)
{'i': 3, 'j': 4, 'k': 2}

>>> ranges = [range(size) for _, size in keys]
[range(0, 2), range(0, 3), range(0, 4)]
```
Run Code Online (Sandbox Code Playgroud)
我们需要一个包含键（标签）的列表：

>>> to_key = sizes.keys() ['k', 'i', 'j']
Run Code Online (Sandbox Code Playgroud)

range计算s的笛卡尔积

>>> domain = product(*ranges)
Run Code Online (Sandbox Code Playgroud)
注意：[itertools.product][1]返回一个随着时间的推移而被消耗的迭代器。

将输出张量初始化为：

>>> res = np.zeros([sizes[key] for key in res_expr])
Run Code Online (Sandbox Code Playgroud)

我们将循环domain：

>>> for indices in domain: ... pass
Run Code Online (Sandbox Code Playgroud)
对于每次迭代，indices将包含每个轴上的值。在我们的示例中，这将提供i, j, 和k作为元组: (k, i, j)。对于每个输入（A和B），我们需要确定要获取哪个组件。那就是A[i, j]，B[j, k]是的！然而，从字面上看，我们没有变量i、j、和k。

我们可以indices使用zip 来创建每个键（标签）与其当前值to_key之间的映射：

>>> vals = dict(zip(to_key, indices))
Run Code Online (Sandbox Code Playgroud)
为了获取输出数组的坐标，我们使用vals并循环键：[vals[key] for key in res_expr]。但是，要使用它们来索引输出数组，我们需要将其包裹起来tuple并zip沿每个轴分隔索引：

>>> res_ind = tuple(zip([vals[key] for key in res_expr]))
Run Code Online (Sandbox Code Playgroud)
输入索引相同（尽管可以有多个）：

>>> inputs_ind = [tuple(zip([vals[key] for key in expr])) for expr in inputs_expr]
Run Code Online (Sandbox Code Playgroud)

我们将使用 aitertools.reduce来计算所有贡献组件的乘积：

>>> def reduce_mult(L): ... return reduce(lambda x, y: x*y, L)
Run Code Online (Sandbox Code Playgroud)

总体而言，域上的循环如下所示：

>>> for indices in domain: ... vals = {k: v for v, k in zip(indices, to_key)} ... res_ind = tuple(zip([vals[key] for key in res_expr])) ... inputs_ind = [tuple(zip([vals[key] for key in expr])) ... for expr in inputs_expr] ... ... res[res_ind] += reduce_mult([M[i] for M, i in zip(inputs, inputs_ind)])
Run Code Online (Sandbox Code Playgroud)

>>> res array([[70., 44., 65.], [30., 59., 68.]])
Run Code Online (Sandbox Code Playgroud)
这与np.einsum('ij,jk->ki', A, B)返回值非常接近！

伟大的努力，顺便说一句，你知道“ein”这个词是怎么来的吗？ (3认同)

谢谢你！这是 [Albert ***Ein***stein](https://en.wikipedia.org/wiki/Einstein_notation) 本人！ (3认同)

Answer 4

Ste*_*nev 10

在阅读 einsum 方程时，我发现能够在精神上将它们归结为它们的命令式版本是最有帮助的。

让我们从以下（强加的）声明开始：

C = np.einsum('bhwi,bhwj->bij', A, B)

Run Code Online (Sandbox Code Playgroud)

首先检查标点符号，我们看到我们有两个 4 字母逗号分隔的 blob -bhwi和bhwj，在箭头之前，bij在它之后有一个 3 字母 blob 。因此，该方程从两个 rank-4 张量输入生成一个 rank-3 张量结果。

现在，让每个 blob 中的每个字母都是范围变量的名称。字母在 blob 中出现的位置是它在该张量中范围的轴的索引。因此，生成 C 的每个元素的命令式求和必须从三个嵌套的 for 循环开始，每个循环对应 C 的每个索引。

for b in range(...):
    for i in range(...):
        for j in range(...):
            # the variables b, i and j index C in the order of their appearance in the equation
            C[b, i, j] = ...

Run Code Online (Sandbox Code Playgroud)

因此，本质上，您for对 C 的每个输出索引都有一个循环。我们暂时不确定范围。

接下来我们看看左侧 - 是否有没有出现在右侧的范围变量？在我们的例子中 - 是的，h并且w。for为每个这样的变量添加一个内部嵌套循环：

for b in range(...):
    for i in range(...):
        for j in range(...):
            C[b, i, j] = 0
            for h in range(...):
                for w in range(...):
                    ...

Run Code Online (Sandbox Code Playgroud)

在最里面的循环中，我们现在已经定义了所有索引，因此我们可以编写实际求和并完成翻译：

# three nested for-loops that index the elements of C
for b in range(...):
    for i in range(...):
        for j in range(...):

            # prepare to sum
            C[b, i, j] = 0

            # two nested for-loops for the two indexes that don't appear on the right-hand side
            for h in range(...):
                for w in range(...):
                    # Sum! Compare the statement below with the original einsum formula
                    # 'bhwi,bhwj->bij'

                    C[b, i, j] += A[b, h, w, i] * B[b, h, w, j]

Run Code Online (Sandbox Code Playgroud)

如果到目前为止您已经能够遵循代码，那么恭喜您！这就是阅读 einsum 方程所需的全部内容。请特别注意原始 einsum 公式如何映射到上述代码段中的最终求和语句。for 循环和范围边界只是浮云，最后的语句是您真正需要了解正在发生的事情的全部内容。

为了完整起见，让我们看看如何确定每个范围变量的范围。嗯，每个变量的范围只是它索引的维度的长度。显然，如果一个变量在一个或多个张量中索引多个维度，那么每个维度的长度必须相等。这是上面包含完整范围的代码：

# C's shape is determined by the shapes of the inputs
# b indexes both A and B, so its range can come from either A.shape or B.shape
# i indexes only A, so its range can only come from A.shape, the same is true for j and B
assert A.shape[0] == B.shape[0]
assert A.shape[1] == B.shape[1]
assert A.shape[2] == B.shape[2]
C = np.zeros((A.shape[0], A.shape[3], B.shape[3]))
for b in range(A.shape[0]): # b indexes both A and B, or B.shape[0], which must be the same
    for i in range(A.shape[3]):
        for j in range(B.shape[3]):
            # h and w can come from either A or B
            for h in range(A.shape[1]):
                for w in range(A.shape[2]):
                    C[b, i, j] += A[b, h, w, i] * B[b, h, w, j]

Run Code Online (Sandbox Code Playgroud)

Answer 5

hpa*_*ulj 7

让我们制作2个阵列,具有不同但兼容的尺寸,以突出它们的相互作用

In [43]: A=np.arange(6).reshape(2,3)
Out[43]: 
array([[0, 1, 2],
       [3, 4, 5]])


In [44]: B=np.arange(12).reshape(3,4)
Out[44]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Run Code Online (Sandbox Code Playgroud)

你的计算,得到a(2,3)与a(3,4)的'dot'(乘积之和)来产生(4,2)数组. i是第一个A,最后一个C; k最后一个B,第一个C. j被总和'消耗'.

In [45]: C=np.einsum('ij,jk->ki',A,B)
Out[45]: 
array([[20, 56],
       [23, 68],
       [26, 80],
       [29, 92]])

Run Code Online (Sandbox Code Playgroud)

这与np.dot(A,B).T- 它是转置的最终输出相同.

要查看更多内容j,请将C下标更改为ijk:

In [46]: np.einsum('ij,jk->ijk',A,B)
Out[46]: 
array([[[ 0,  0,  0,  0],
        [ 4,  5,  6,  7],
        [16, 18, 20, 22]],

       [[ 0,  3,  6,  9],
        [16, 20, 24, 28],
        [40, 45, 50, 55]]])

Run Code Online (Sandbox Code Playgroud)

这也可以通过以下方式生成:

A[:,:,None]*B[None,:,:]

Run Code Online (Sandbox Code Playgroud)

也就是说,k在末尾A和i前面添加一个维度B,得到一个(2,3,4)数组.

0 + 4 + 16 = 20,9 + 28 + 55 = 92等; 求和j并转置以获得更早的结果:

np.sum(A[:,:,None] * B[None,:,:], axis=1).T

# C[k,i] = sum(j) A[i,j (,k) ] * B[(i,)  j,k]

Run Code Online (Sandbox Code Playgroud)

Answer 6

wwi*_*wii 5

我发现了NumPy:交易的技巧(第二部分)具有指导意义

我们使用 - >来表示输出数组的顺序.因此,将'ij,i-> j'视为左手侧(LHS)和右手侧(RHS).LHS上的任何重复标签都会计算产品元素,然后求和.通过改变RHS(输出)侧的标签,我们可以定义我们想要相对于输入数组进行的轴,即沿轴0,1等的求和.

import numpy as np

>>> a
array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3]])
>>> b
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> d = np.einsum('ij, jk->ki', a, b)

Run Code Online (Sandbox Code Playgroud)

注意有三个轴,i,j,k,并且重复j(在左侧). i,j表示行和列a.j,k对b.

为了计算产品并对齐j轴,我们需要添加一个轴a.(b将沿(？)第一轴播出)

a[i, j, k]
   b[j, k]

>>> c = a[:,:,np.newaxis] * b
>>> c
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 0,  2,  4],
        [ 6,  8, 10],
        [12, 14, 16]],

       [[ 0,  3,  6],
        [ 9, 12, 15],
        [18, 21, 24]]])

Run Code Online (Sandbox Code Playgroud)

j在右侧没有,所以我们总结了j3x3x3阵列的第二个轴

>>> c = c.sum(1)
>>> c
array([[ 9, 12, 15],
       [18, 24, 30],
       [27, 36, 45]])

Run Code Online (Sandbox Code Playgroud)

最后,指数在右侧(按字母顺序)反转,因此我们进行转置.

>>> c.T
array([[ 9, 18, 27],
       [12, 24, 36],
       [15, 30, 45]])

>>> np.einsum('ij, jk->ki', a, b)
array([[ 9, 18, 27],
       [12, 24, 36],
       [15, 30, 45]])
>>>

Run Code Online (Sandbox Code Playgroud)

请不要删除您的答案。 (2认同)

归档时间：	11 年，5 月前
查看次数：	42312 次
最近记录：	6 年，3 月前

了解NumPy的einsum

怎么einsum办？

einsum工作怎么样？

一个稍大的例子

一些练习

另一种观点np.einsum

最少的实施

怎么`einsum`办？

`einsum`工作怎么样？

另一种观点`np.einsum`