Kru*_*ger 10 nlp machine-learning deep-learning pytorch
Shouldn't the layer normalization of x = torch.tensor([[1.5,0,0,0]]) be [[1.5,-0.5,-0.5,-0.5]], according to the paper and the equation in the PyTorch docs? Yet torch.nn.LayerNorm gives [[ 1.7320, -0.5773, -0.5773, -0.5773]].
Here is the example code:
import torch
x = torch.tensor([[1.5,.0,.0,.0]])
layerNorm = torch.nn.LayerNorm(4, elementwise_affine = False)
y1 = layerNorm(x)
mean = x.mean(-1, keepdim = True)
var = x.var(-1, keepdim = True)
y2 = (x-mean)/torch.sqrt(var+layerNorm.eps)
where:
y1 == tensor([[ 1.7320, -0.5773, -0.5773, -0.5773]])
y2 == tensor([[ 1.5000, -0.5000, -0.5000, -0.5000]])
小智 5
Instead of
var = x.var(-1, keepdim = True)
you should use
var = x.var(-1, keepdim = True, unbiased=False)
This gives the same result as PyTorch, because LayerNorm normalizes with the biased variance (dividing by N rather than N-1). Full code:
import torch
x = torch.tensor([[1.5,.0,.0,.0]])
layerNorm = torch.nn.LayerNorm(4, elementwise_affine = False)
y1 = layerNorm(x)
mean = x.mean(-1, keepdim = True)
var = x.var(-1, keepdim = True, unbiased=False)
y2 = (x-mean)/torch.sqrt(var+layerNorm.eps)
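To check the claim about the variance estimator, here is a minimal sketch that compares both variants against torch.nn.LayerNorm. The helper name manual_layer_norm and the tolerance passed to torch.allclose are assumptions made for illustration; they are not part of the original answer.

```python
import torch

def manual_layer_norm(x: torch.Tensor, eps: float, unbiased: bool) -> torch.Tensor:
    """Normalize over the last dimension (hypothetical helper for illustration)."""
    mean = x.mean(-1, keepdim=True)
    # unbiased=False divides by N, unbiased=True divides by N - 1
    var = x.var(-1, keepdim=True, unbiased=unbiased)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.tensor([[1.5, 0.0, 0.0, 0.0]])
layer_norm = torch.nn.LayerNorm(4, elementwise_affine=False)
y_ref = layer_norm(x)

y_biased = manual_layer_norm(x, layer_norm.eps, unbiased=False)
y_unbiased = manual_layer_norm(x, layer_norm.eps, unbiased=True)

# Only the biased variant should agree with LayerNorm
matches_biased = torch.allclose(y_ref, y_biased, atol=1e-6)
matches_unbiased = torch.allclose(y_ref, y_unbiased, atol=1e-6)
print(matches_biased, matches_unbiased)
```

Note that x.var defaults to the unbiased estimator, which is why the code in the question disagreed with LayerNorm.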
Written out explicitly, the variance should be computed like this:
...
var = ((x-mean)**2).mean(-1, keepdim = True)
...
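The two results from the question can also be reproduced by hand. The arithmetic below is a sketch that ignores the small eps term, so the last digit can differ slightly from what PyTorch prints; the symbols \mu and \sigma^2 are notation introduced here, not taken from the original post.

```latex
\begin{align*}
\mu &= \tfrac{1}{4}(1.5 + 0 + 0 + 0) = 0.375 \\
\sum_{i=1}^{4} (x_i - \mu)^2 &= 1.125^2 + 3 \cdot 0.375^2 = 1.265625 + 0.421875 = 1.6875 \\
\sigma^2_{\text{biased}} &= \tfrac{1.6875}{4} = 0.421875,
  \qquad \frac{1.5 - \mu}{\sqrt{\sigma^2_{\text{biased}}}} \approx \frac{1.125}{0.6495} \approx 1.732 \\
\sigma^2_{\text{unbiased}} &= \tfrac{1.6875}{3} = 0.5625,
  \qquad \frac{1.5 - \mu}{\sqrt{\sigma^2_{\text{unbiased}}}} = \frac{1.125}{0.75} = 1.5
\end{align*}
```

The same division applied to the zero entries gives about -0.577 with the biased variance and exactly -0.5 with the unbiased one, matching y1 and y2 in the question.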
Hope this helps anyone who runs into the same problem.