Different `grad_fn` for similar-looking operations in Pytorch (1.0)

Question

abk*_*kds 5 python attention-model pytorch

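I am working on an attention model, and before running the final model I was going through the tensor shapes that flow through the code. One operation requires me to reshape a tensor. The tensor has shape `torch.Size([30, 8, 9, 64])`, where 30 is the batch_size, 8 is the number of attention heads (not relevant to my question), 9 is the number of words in the sentence, and 64 is some intermediate embedding representation of each word. I have to reshape the tensor to `torch.Size([30, 9, 512])` before processing it further. The references I found online do this with `x.transpose(1, 2).contiguous().view(30, -1, 512)`, whereas I thought `x.transpose(1, 2).reshape(30, -1, 512)` should work as well.

In the first case the `grad_fn` is `<ViewBackward>`, whereas in my case it is `<UnsafeViewBackward>`. Aren't these two the same operation? Will this cause a training error?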

Answer 1

uke*_*emi 1

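Aren't these two the same operation?

No. While they effectively produce the same tensor, the operations are not the same, and the results are not guaranteed to have the same storage.

TensorShape.cpp: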
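The storage point can be sketched as follows, assuming a recent PyTorch build and the shapes from the question (the variable names are illustrative only). `reshape` returns a view when the strides allow it and copies otherwise, and the exact `grad_fn` names printed may differ between versions.

```python
import torch

# Shapes from the question: (batch, heads, words, dim)
x = torch.randn(30, 8, 9, 64, requires_grad=True)
t = x.transpose(1, 2)  # non-contiguous, shape (30, 9, 8, 64)

via_view = t.contiguous().view(30, -1, 512)
via_reshape = t.reshape(30, -1, 512)

# Same values, but recorded as different autograd nodes.
print(torch.equal(via_view, via_reshape))
print(via_view.grad_fn, via_reshape.grad_fn)

# The transposed layout cannot be expressed as a (30, 9, 512) view,
# so reshape has to copy and the result does not alias x's memory.
copied = via_reshape.data_ptr() != x.data_ptr()

# On an already contiguous tensor, reshape is a view of the same memory.
c = torch.randn(30, 9, 8, 64)
shared = c.reshape(30, 9, 512).data_ptr() == c.data_ptr()
print(copied, shared)
```

Whether `reshape` aliases its input therefore depends on the input's layout, which is one reason not to rely on the two spellings having the same storage.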

// _unsafe_view() differs from view() in that the returned tensor isn't treated
// as a view for the purposes of automatic differentiation. (It's not listed in
// VIEW_FUNCTIONS in gen_autograd.py).  It's only safe to use if the `self` tensor
// is temporary. For example, the viewed tensor here (a + b) is discarded immediately
// after viewing:
//
//  res = at::_unsafe_view(a + b, size);
//
// This is a hack because in-place operations on tensors treated like views
// can be much more expensive than the same operations on non-view tensors.

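Note that this may produce an error if applied to complex inputs, but complex inputs are generally not fully supported in PyTorch yet, and the problem is not unique to this function.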
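For the training concern raised in the question, a quick check is to compare the gradients that both variants send back to the input. This is a minimal sketch under the question's shapes; the weight tensor `w` is a hypothetical stand-in for the rest of the model, added only so that the gradient is not all ones.

```python
import torch

torch.manual_seed(0)
x = torch.randn(30, 8, 9, 64, requires_grad=True)
w = torch.randn(30, 9, 512)  # hypothetical downstream weights

# Same toy loss computed through each variant.
loss_view = (x.transpose(1, 2).contiguous().view(30, -1, 512) * w).sum()
loss_reshape = (x.transpose(1, 2).reshape(30, -1, 512) * w).sum()

# Gradients with respect to the shared input x.
(grad_view,) = torch.autograd.grad(loss_view, x)
(grad_reshape,) = torch.autograd.grad(loss_reshape, x)

grads_match = torch.allclose(grad_view, grad_reshape)
print(grads_match)
```

If the two gradients agree, as they should for real-valued inputs like these, the different `grad_fn` names alone are not a sign of a training problem.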
