Understanding tensordot

flo*_*o29 22 python numpy linear-algebra dot-product tensor

After I learned how to use einsum, I am now trying to understand how np.tensordot works.

However, I am a bit lost, especially regarding the various possibilities for the axes parameter.

To understand it (as I have never practiced tensor calculus), I use the following example:

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

In this case, what would the different possible np.tensordot computations be, and how would you compute them manually?

Div*_*kar 28

The idea with tensordot is simple - we input the arrays and the respective axes along which the sum-reductions are intended. The axes that take part in sum-reduction are removed in the output, and all of the remaining axes from the input arrays are spread out as different axes in the output, keeping the order in which the input arrays are fed in.

Let's look at a few sample cases with one and two axes of sum-reduction, and also swap the input places to see how the order is kept in the output.

I. One axis of sum-reduction

Input:

In [7]: A = np.random.randint(2, size=(2, 6, 5))
   ...: B = np.random.randint(2, size=(3, 2, 4))
   ...: 

Case #1:

In [9]: np.tensordot(A, B, axes=((0),(1))).shape
Out[9]: (6, 5, 3, 4)

A : (2, 6, 5) -> reduction of axis=0
B : (3, 2, 4) -> reduction of axis=1

Output : `(2, 6, 5)`, `(3, 2, 4)` ===(2 gone)==> `(6,5)` + `(3,4)` => `(6,5,3,4)`
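
As a quick cross-check (my addition, not part of the original answer), the same one-axis reduction can be spelled out with einsum, which a later answer also leans on:

import numpy as np

A = np.random.randint(2, size=(2, 6, 5))
B = np.random.randint(2, size=(3, 2, 4))

# The shared axis of length 2 is labelled 'i' in both inputs and summed away;
# the remaining labels keep the feed order of the inputs: j,k from A, l,m from B.
out = np.einsum('ijk,lim->jklm', A, B)
print(out.shape)                                              # (6, 5, 3, 4)
print(np.allclose(out, np.tensordot(A, B, axes=((0), (1)))))  # True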

Case #2 (same axes as case #1, but with the inputs swapped):

In [8]: np.tensordot(B, A, axes=((1),(0))).shape
Out[8]: (3, 4, 6, 5)

B : (3, 2, 4) -> reduction of axis=1
A : (2, 6, 5) -> reduction of axis=0

Output : `(3, 2, 4)`, `(2, 6, 5)` ===(2 gone)==> `(3,4)` + `(6,5)` => `(3,4,6,5)`.

II. Two axes of sum-reduction

Input:

In [11]: A = np.random.randint(2, size=(2, 3, 5))
    ...: B = np.random.randint(2, size=(3, 2, 4))
    ...: 

Case #1:

In [12]: np.tensordot(A, B, axes=((0,1),(1,0))).shape
Out[12]: (5, 4)

A : (2, 3, 5) -> reduction of axis=(0,1)
B : (3, 2, 4) -> reduction of axis=(1,0)

Output : `(2, 3, 5)`, `(3, 2, 4)` ===(2,3 gone)==> `(5)` + `(4)` => `(5,4)`

Case #2:

In [14]: np.tensordot(B, A, axes=((1,0),(0,1))).shape
Out[14]: (4, 5)

B : (3, 2, 4) -> reduction of axis=(1,0)
A : (2, 3, 5) -> reduction of axis=(0,1)

Output : `(3, 2, 4)`, `(2, 3, 5)` ===(2,3 gone)==> `(4)` + `(5)` => `(4,5)`

We can extend this to as many axes as needed.
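
Since the question also asks how to compute this by hand, here is a loop-level sketch of section II, case #1 (my addition, assuming the A and B from that case):

import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# axes=((0,1),(1,0)): A's axes 0,1 pair with B's axes 1,0, so for every kept
# index (k, l) we sum the products over the contracted indices i and j.
out = np.zeros((5, 4))
for k in range(5):
    for l in range(4):
        out[k, l] = sum(A[i, j, k] * B[j, i, l]
                        for i in range(2) for j in range(3))

print(np.allclose(out, np.tensordot(A, B, axes=((0, 1), (1, 0)))))  # True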

  • @floflo29 You might know that matrix multiplication involves elementwise multiplication keeping the axes aligned, and then summation of elements along that common aligned axis. With that summation we are losing that common axis, which is termed reduction, hence the short name sum-reduction; see the sketch after these comments. (3 upvotes)
  • What exactly does sum-reduction mean? (2 upvotes)
  • @BryanHead The only way to reorder the output axes with `np.tensordot` is to swap the inputs. If that doesn't get you the one you want, `transpose` would be the way to go. (2 upvotes)
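
To make the sum-reduction in the first comment concrete (a small sketch of my own, not part of the thread): in a plain matrix product the shared axis is multiplied elementwise and then summed away:

import numpy as np

a = np.random.rand(3, 4)
b = np.random.rand(4, 5)

# Broadcast to (3, 4, 5): elementwise products with the common axis aligned...
products = a[:, :, None] * b[None, :, :]
# ...then sum along that common axis (length 4), "reducing" it out of the result.
manual = products.sum(axis=1)
print(np.allclose(manual, a.dot(b)))  # True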

hpa*_*ulj 7

tensordot swaps axes and reshapes the inputs so that it can apply np.dot to two 2d arrays, then swaps and reshapes back to the target shape. It may be easier to experiment than to explain. There is no special tensor math going on here, just extending dot to work in higher dimensions. tensor just means arrays with more than 2 dimensions. If you are already comfortable with einsum, then it will be simplest to compare the results to that.
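
To see that swap-reshape-dot pipeline concretely (my own sketch of what tensordot does internally, using the question's A and B and the two-axis case below):

import numpy as np

A = np.random.randint(2, size=(2, 3, 5))
B = np.random.randint(2, size=(3, 2, 4))

# For axes=((0,1),(1,0)): move A's kept axis (5) to the front and flatten the
# contracted axes (2,3) into one; transpose B so its contracted axes (2,3)
# line up in the same order, then flatten. A single 2d dot finishes the job.
manual = A.transpose(2, 0, 1).reshape(5, 6).dot(B.transpose(1, 0, 2).reshape(6, 4))
print(np.allclose(manual, np.tensordot(A, B, axes=((0, 1), (1, 0)))))  # True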

A sample test, summing on one pair of axes:

In [823]: np.tensordot(A,B,[0,1]).shape
Out[823]: (3, 5, 3, 4)
In [824]: np.einsum('ijk,lim',A,B).shape
Out[824]: (3, 5, 3, 4)
In [825]: np.allclose(np.einsum('ijk,lim',A,B),np.tensordot(A,B,[0,1]))
Out[825]: True

Another, summing on two:

In [826]: np.tensordot(A,B,[(0,1),(1,0)]).shape
Out[826]: (5, 4)
In [827]: np.einsum('ijk,jim',A,B).shape
Out[827]: (5, 4)
In [828]: np.allclose(np.einsum('ijk,jim',A,B),np.tensordot(A,B,[(0,1),(1,0)]))
Out[828]: True

We could do the same with the (1,0) pair. Given the mix of dimensions, I don't think there is another combination.
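
For completeness (my addition, assuming the question's A and B are still in scope), the remaining (1,0) pairing the answer mentions looks like this:

# Summing A's axis 1 against B's axis 0 (both of length 3):
print(np.tensordot(A, B, axes=(1, 0)).shape)  # (2, 5, 2, 4)
print(np.allclose(np.einsum('ijk,jlm', A, B), np.tensordot(A, B, axes=(1, 0))))  # True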

  • The einsum equivalent of tensordot with `axes=([1,0],[0,1])` is `np.einsum('ijk,jil->kl',a,b)`. This `dot` also does it: `a.T.reshape(5,12).dot(b.reshape(12,2))`. The `dot` is between a (5,12) and a (12,2). The `a.T` puts the 5 first, and also swaps the (3,4) to match `b`. (2 upvotes)

der*_*eks 6

The answers above are great and helped me a lot in understanding tensordot. But they don't show the actual math behind the operations. That's why I did the equivalent operations in TF 2 for myself and decided to share them here:

a = tf.constant([1,2.])
b = tf.constant([2,3.])
print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('i,j', a, b)\t\t- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, ((),()))}\t tf.einsum('i,j', a, b)\t\t- ((() axis of a), (() axis of b))")
print(f"{tf.tensordot(b, a, 0)}\t tf.einsum('i,j->ji', a, b)\t- ((the last 0 axes of b), (the first 0 axes of a))")
print(f"{tf.tensordot(a, b, 1)}\t\t tf.einsum('i,i', a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((0,), (0,)))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,0))}\t\t tf.einsum('i,i', a, b)\t\t- ((0th axis of a), (0th axis of b))")

[[2. 3.]
 [4. 6.]]    tf.einsum('i,j', a, b)     - ((the last 0 axes of a), (the first 0 axes of b))
[[2. 3.]
 [4. 6.]]    tf.einsum('i,j', a, b)     - ((() axis of a), (() axis of b))
[[2. 4.]
 [3. 6.]]    tf.einsum('i,j->ji', a, b) - ((the last 0 axes of b), (the first 0 axes of a))
8.0          tf.einsum('i,i', a, b)     - ((the last 1 axes of a), (the first 1 axes of b))
8.0          tf.einsum('i,i', a, b)     - ((0th axis of a), (0th axis of b))
8.0          tf.einsum('i,i', a, b)     - ((0th axis of a), (0th axis of b))

For (2,2) shapes:

a = tf.constant([[1,2],
                 [-2,3.]])

b = tf.constant([[-2,3],
                 [0,4.]])
print(f"{tf.tensordot(a, b, 0)}\t tf.einsum('ij,kl', a, b)\t- ((the last 0 axes of a), (the first 0 axes of b))")
print(f"{tf.tensordot(a, b, (0,0))}\t tf.einsum('ij,ik', a, b)\t- ((0th axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (0,1))}\t tf.einsum('ij,ki', a, b)\t- ((0th axis of a), (1st axis of b))")
print(f"{tf.tensordot(a, b, 1)}\t tf.matmul(a, b)\t\t- ((the last 1 axes of a), (the first 1 axes of b))")
print(f"{tf.tensordot(a, b, ((1,), (0,)))}\t tf.einsum('ij,jk', a, b)\t- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, (1, 0))}\t tf.matmul(a, b)\t\t- ((1st axis of a), (0th axis of b))")
print(f"{tf.tensordot(a, b, 2)}\t tf.reduce_sum(tf.multiply(a, b))\t- ((the last 2 axes of a), (the first 2 axes of b))")
print(f"{tf.tensordot(a, b, ((0,1), (0,1)))}\t tf.einsum('ij,ij->', a, b)\t\t- ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))")
[[[[-2.  3.]
   [ 0.  4.]]
  [[-4.  6.]
   [ 0.  8.]]]

 [[[ 4. -6.]
   [-0. -8.]]
  [[-6.  9.]
   [ 0. 12.]]]]  tf.einsum('ij,kl', a, b)   - ((the last 0 axes of a), (the first 0 axes of b))
[[-2. -5.]
 [-4. 18.]]      tf.einsum('ij,ik', a, b)   - ((0th axis of a), (0th axis of b))
[[-8. -8.]
 [ 5. 12.]]      tf.einsum('ij,ki', a, b)   - ((0th axis of a), (1st axis of b))
[[-2. 11.]
 [ 4.  6.]]      tf.matmul(a, b)            - ((the last 1 axes of a), (the first 1 axes of b))
[[-2. 11.]
 [ 4.  6.]]      tf.einsum('ij,jk', a, b)   - ((1st axis of a), (0th axis of b))
[[-2. 11.]
 [ 4.  6.]]      tf.matmul(a, b)            - ((1st axis of a), (0th axis of b))
16.0    tf.reduce_sum(tf.multiply(a, b))    - ((the last 2 axes of a), (the first 2 axes of b))
16.0    tf.einsum('ij,ij->', a, b)          - ((0th axis of a, 1st axis of a), (0th axis of b, 1st axis of b))
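
The same identities carry over to NumPy (my addition, tying this back to the question's library):

import numpy as np

a = np.array([1., 2.])
b = np.array([2., 3.])

# axes=0 contracts nothing, giving the outer product; axes=1 contracts the
# last axis of a against the first axis of b, giving the inner product.
print(np.tensordot(a, b, axes=0))  # [[2. 3.] [4. 6.]]
print(np.tensordot(a, b, axes=1))  # 8.0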