Torch in-place operations to save memory (softmax)

Question

lhk*_*lhk 2 python machine-learning torch pytorch

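Some operations in torch are executed in place, for example shorthand operators such as +=.

Is it possible to execute other operations in place, such as softmax?

I am currently working on language processing. The model produces a long sequence of probability distributions over a large vocabulary. The final output tensor accounts for roughly 60% of the allocated memory. This is a big problem, because I need to compute its softmax, which doubles the memory required.

Here is an example of the problem. I am not interested in the tensor t itself, only in its softmax: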

import numpy as np
import torch
import torch.nn.functional as F

t = torch.tensor(np.zeros((30000,30000))).cuda()  #allocates 6.71 GB of GPU
softmax = F.softmax(t, 1)  #out of memory error
del t  #too late, program crashed
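
For reference, the 6.71 GB figure can be reproduced with a little arithmetic. The sketch below uses a hypothetical helper, `tensor_bytes`, which is not part of PyTorch. It relies on the fact that `np.zeros` produces float64 by default, so the tensor stores 8 bytes per element; the same shape in float32 would take half as much.

```python
import torch

def tensor_bytes(shape, dtype=torch.float64):
    """Bytes needed by a dense tensor of this shape and dtype (hypothetical helper)."""
    numel = 1
    for dim in shape:
        numel *= dim
    # element_size() reports the number of bytes per element for the dtype
    return numel * torch.empty((), dtype=dtype).element_size()

size_f64 = tensor_bytes((30000, 30000), torch.float64)
size_f32 = tensor_bytes((30000, 30000), torch.float32)

print(f"float64: {size_f64 / 2**30:.2f} GiB")  # float64: 6.71 GiB
print(f"float32: {size_f32 / 2**30:.2f} GiB")  # float32: 3.35 GiB
```

An out-of-place softmax needs a second buffer of the same size for its result, which is where the doubling comes from.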

Even the following does not work:

F.softmax(torch.tensor(np.zeros((30000,30000))).cuda(), 1)

Answer 1

lhk*_*lhk 6

I created an in-place version of softmax:

import numpy as np
import torch
import torch.nn.functional as F

# in-place version
t = torch.tensor(np.ones((100,200)))
torch.exp(t, out=t)
summed = torch.sum(t, dim=1, keepdim=True)
t /= summed

# original version
t2 = torch.tensor(np.ones((100,200)))
softmax = F.softmax(t2, 1)

assert torch.allclose(t, softmax)
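
One caveat with the version above: it exponentiates the raw values, so large inputs can overflow to inf. A possible variant, sketched here as a function with the hypothetical name `softmax_` (not a PyTorch API), subtracts the per-row maximum first. That step is also in place, and the only extra allocations are the small per-row max and sum tensors.

```python
import torch
import torch.nn.functional as F

def softmax_(t, dim=1):
    """Overwrite t with its softmax along dim (hypothetical helper, no autograd support)."""
    t -= t.max(dim=dim, keepdim=True).values  # shift so the largest entry per row is 0
    torch.exp(t, out=t)                       # exponentiate into the same buffer
    t /= t.sum(dim=dim, keepdim=True)         # normalize each row in place
    return t

x = torch.randn(100, 200, dtype=torch.float64) * 50  # deliberately large values
expected = F.softmax(x, dim=1)  # reference result, computed before x is overwritten
ptr = x.data_ptr()

softmax_(x)
same_buffer = x.data_ptr() == ptr            # the result lives in the original storage
matches_reference = torch.allclose(x, expected)
```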

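To answer my own question: if you want in-place functions, you have to build them yourself by composing low-level operations:

Many functions, such as torch.exp, accept an optional out parameter.
Assignments of the form t[idx] = something are in place.
The shorthand operators /=, *=, += and -= are in place.

This requires careful debugging and can be unintuitive: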

t = t / summed  #not in-place
t /= summed  #in-place
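
One way to check which form you actually got is to compare `data_ptr()` before and after the operation. A minimal sketch (the variable names are only for illustration):

```python
import torch

t = torch.ones(3, 4)
summed = t.sum(dim=1, keepdim=True)  # every row sums to 4
original = t                         # second reference to the same tensor
ptr = t.data_ptr()                   # address of the underlying buffer

t /= summed                          # in place: the existing buffer is overwritten
inplace_kept_buffer = t.data_ptr() == ptr   # True

t = t / summed                       # not in place: a new tensor is allocated and t is rebound
rebind_kept_buffer = t.data_ptr() == ptr    # False

# The rebinding left the old buffer untouched; it still holds the result of the /= step.
original_still_quarter = torch.allclose(original, torch.full((3, 4), 0.25))
```

As long as something still references the old tensor (here `original`), its memory is not released, so the out-of-place form holds both buffers at once.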

I have read that in-place operations can cause problems with gradients. I will do more testing with this code.
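
As an illustration of the kind of gradient problem meant here (a sketch, not an exhaustive list): autograd keeps the output of `exp` around to compute its gradient, so overwriting that output in place makes the backward pass fail. When no gradients are needed, for example at inference time, wrapping the computation in `torch.no_grad()` avoids the issue.

```python
import torch

# Case 1: overwrite a tensor that autograd still needs for the backward pass.
a = torch.ones(3, requires_grad=True)
b = a.exp()   # autograd saves b to compute the gradient of exp
b += 1        # in-place edit invalidates the saved value
try:
    b.sum().backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True  # PyTorch detects the modification and refuses

# Case 2: with autograd disabled, nothing is saved and in-place ops are safe.
w = torch.randn(5, requires_grad=True)
with torch.no_grad():
    logits = torch.randn(4, 5) * w              # not tracked by autograd
    torch.exp(logits, out=logits)
    logits /= logits.sum(dim=1, keepdim=True)   # rows now sum to 1
row_sums = logits.sum(dim=1)
```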
