I train my model like this:
for i in range(5):
    optimizer.zero_grad()
    y = next_input()
    loss = model(y)
    loss.backward()
    optimizer.step()
and get this error:
RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.
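The same error can be reproduced with a minimal standalone snippet (this is not my model, just the smallest case I could construct where the second `backward()` hits a freed graph):

```python
import torch

# Minimal reproduction: calling backward() twice on the same graph
# without retain_graph=True fails, because the first backward pass
# frees the intermediate buffers saved for gradient computation.
x = torch.tensor([2.0], requires_grad=True)
loss = (x * x).sum()  # x*x saves tensors needed for the backward pass

loss.backward()       # first backward: succeeds, then frees the buffers

err = None
try:
    loss.backward()   # second backward over the same (now freed) graph
except RuntimeError as e:
    err = e

print(err)
```
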
Why does it ask me to retain the graph? If the derivatives have been freed, it could simply recompute them. To demonstrate my point, consider the following code:
for i in range(5):
    optimizer.zero_grad()
    model.zero_grad()  # drop derivatives
    y = next_input()
    loss = model(y)
    loss.backward(retain_graph=True)
    optimizer.step()
In this case the derivatives from the previous iteration are zeroed as well, but Torch doesn't care, because the retain_graph=True flag is set.
Am I right that model.zero_grad() (i.e., dropping the retained derivatives) cancels out the effect of retain_graph=True?
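To check whether the two really interact, I tried a minimal sketch separating the two mechanisms (again not my model; `x` and `y` here are just a toy scalar graph). Zeroing `.grad` does not seem to touch the graph's saved buffers: with retain_graph=True on the first call, a second backward still works after the gradients are zeroed.

```python
import torch

# .grad storage vs. the autograd graph's saved buffers are separate:
# zero_grad()-style zeroing clears the former, retain_graph controls the latter.
x = torch.tensor([2.0], requires_grad=True)
y = (x * x).sum()          # dy/dx = 2x

y.backward(retain_graph=True)   # keep the graph's saved buffers alive
first = x.grad.clone()          # 2 * 2.0 = 4.0

x.grad.zero_()                  # zero the accumulated gradient (like zero_grad())
y.backward()                    # second backward still works: graph was retained
second = x.grad.clone()         # recomputed from scratch: 4.0 again

print(first.item(), second.item())
```
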