ink*_*kzk 6 python computer-vision deep-learning conv-neural-network pytorch
The code below (taken from here) seems to implement nothing more than a plain Dropout, neither DropPath nor DropConnect. Is that right?
import torch


def drop_path(x, drop_prob: float = 0., training: bool = False):
    """Drop paths (Stochastic Depth) per sample (when applied in main path of residual blocks).
    This is the same as the DropConnect impl I created for EfficientNet, etc networks, however,
    the original name is misleading as 'Drop Connect' is a different form of dropout in a separate paper...
    See discussion: https://github.com/tensorflow/tpu/issues/494#issuecomment-532968956 ... I've opted for
    changing the layer and argument names to 'drop path' rather than mix DropConnect as a layer name and use
    'survival rate' as the argument.
    """
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output
Ber*_*iel 11
No, it is different from Dropout:
import torch
from torch.nn.functional import dropout

torch.manual_seed(2021)


def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    output = x.div(keep_prob) * random_tensor
    return output


x = torch.rand(3, 2, 2, 2)

# DropPath
d1_out = drop_path(x, drop_prob=0.33, training=True)

# Dropout
d2_out = dropout(x, p=0.33, training=True)
Let's compare the outputs (I removed the line breaks between the channel dimensions for readability):
# DropPath
print(d1_out)
#  tensor([[[[0.1947, 0.7662],
#            [1.1083, 1.0685]],
#           [[0.8515, 0.2467],
#            [0.0661, 1.4370]]],
#
#          [[[0.0000, 0.0000],
#            [0.0000, 0.0000]],
#           [[0.0000, 0.0000],
#            [0.0000, 0.0000]]],
#
#          [[[0.7658, 0.4417],
#            [1.1692, 1.1052]],
#           [[1.2014, 0.4532],
#            [1.4840, 0.7499]]]])

# Dropout
print(d2_out)
#  tensor([[[[0.1947, 0.7662],
#            [1.1083, 1.0685]],
#           [[0.8515, 0.2467],
#            [0.0661, 1.4370]]],
#
#          [[[0.0000, 0.1480],
#            [1.2083, 0.0000]],
#           [[1.2272, 0.1853],
#            [0.0000, 0.5385]]],
#
#          [[[0.7658, 0.0000],
#            [1.1692, 1.1052]],
#           [[1.2014, 0.4532],
#            [0.0000, 0.7499]]]])
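As you can see, they are different. DropPath drops entire samples from the batch (the second sample above is zeroed out completely), which effectively yields stochastic depth when used as in Eq. 2 of their paper. Dropout, on the other hand, drops random individual values, as expected (from the documentation):
"During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.
Each channel will be zeroed out independently on every forward call."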
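To make the "stochastic depth" point concrete, here is a minimal sketch of how drop_path would typically sit on the residual branch of a block. The `ResidualBlock` class, its single-convolution branch, and the chosen sizes are hypothetical and only for illustration; they are not taken from the library the question quotes. When a sample's path is dropped, the block reduces to the identity for that sample; otherwise the rescaled branch output is added to the shortcut.

```python
import torch
from torch import nn


def drop_path(x, drop_prob: float = 0., training: bool = False):
    # Same function as in the question, repeated so this sketch runs on its own.
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # one mask value per sample
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # 1 with probability keep_prob, otherwise 0
    return x.div(keep_prob) * random_tensor


class ResidualBlock(nn.Module):
    """Hypothetical block: identity shortcut plus one residual branch."""

    def __init__(self, channels: int, drop_prob: float):
        super().__init__()
        self.branch = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.drop_prob = drop_prob

    def forward(self, x):
        # Only the residual branch is dropped; the shortcut always passes through.
        return x + drop_path(self.branch(x), self.drop_prob, self.training)


torch.manual_seed(0)
block = ResidualBlock(channels=2, drop_prob=0.5).train()
x = torch.rand(8, 2, 4, 4)

with torch.no_grad():
    out = block(x)
    branch = block.branch(x)

# Per sample: either the block acted as the identity (branch dropped) ...
skipped = [torch.allclose(out[i], x[i]) for i in range(x.shape[0])]
# ... or it added the branch output rescaled by 1 / keep_prob.
kept = [torch.allclose(out[i], x[i] + branch[i] / 0.5) for i in range(x.shape[0])]

print(skipped)
print(kept)
```

Each sample should be flagged in exactly one of the two lists, which is the per-sample "skip this layer" behaviour that plain Dropout does not give you.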
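Also note that both scale the output values according to the probability: the surviving values are divided by 1 - p, so for the same p the non-zeroed elements are identical.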
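The scaling claim can be checked directly. The snippet below is a small sketch, not part of the original answer: it shifts the input by 0.1 (my own choice) so that every input is strictly positive and a zero in the output can only mean "dropped", then compares the surviving elements against x / (1 - p) for both functions.

```python
import torch
from torch.nn.functional import dropout


def drop_path(x, drop_prob: float = 0., training: bool = False):
    # Same function as in the question, repeated so this sketch runs on its own.
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize
    return x.div(keep_prob) * random_tensor


torch.manual_seed(0)
p = 0.33
# Strictly positive inputs, so an exact zero in the output means the value was dropped.
x = torch.rand(3, 2, 2, 2) + 0.1

d1_out = drop_path(x, drop_prob=p, training=True)
d2_out = dropout(x, p=p, training=True)

# Masks of the elements that survived in each output.
d1_survived = d1_out != 0
d2_survived = d2_out != 0

# Survivors should equal the input divided by (1 - p) in both cases.
drop_path_scaled = torch.allclose(d1_out[d1_survived], x[d1_survived] / (1 - p))
dropout_scaled = torch.allclose(d2_out[d2_survived], x[d2_survived] / (1 - p))

print(drop_path_scaled, dropout_scaled)
```

Dividing by 1 - p keeps the expected value of each element equal to its input, which is why neither function needs any rescaling at inference time.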