Huggingface BERT 模型的预训练层是否已冻结?

The*_*fer 3 nlp pytorch bert-language-model huggingface-transformers

我使用 Huggingface 中的以下分类模型:

model = AutoModelForSequenceClassification.from_pretrained("dbmdz/bert-base-german-cased", num_labels=2).to(device)
Run Code Online (Sandbox Code Playgroud)

据我了解,这会在预训练模型的末尾添加一个密集层,该模型有 2 个输出节点。但是之前的所有预训练层都被冻结了吗?或者微调时它们也会更新吗?我在文档中找不到有关该信息的信息...

那么我还需要做这样的事情吗?:

for param in model.bert.parameters():
    param.requires_grad = False
Run Code Online (Sandbox Code Playgroud)

cro*_*oik 7

它们没有被冻结。默认情况下所有参数都是可训练的。您还可以通过以下方式检查:

for name, param in model.named_parameters():
    print(name, param.requires_grad)
Run Code Online (Sandbox Code Playgroud)

输出:

bert.embeddings.word_embeddings.weight True
bert.embeddings.position_embeddings.weight True
bert.embeddings.token_type_embeddings.weight True
bert.embeddings.LayerNorm.weight True
bert.embeddings.LayerNorm.bias True
bert.encoder.layer.0.attention.self.query.weight True
bert.encoder.layer.0.attention.self.query.bias True
bert.encoder.layer.0.attention.self.key.weight True
bert.encoder.layer.0.attention.self.key.bias True
bert.encoder.layer.0.attention.self.value.weight True
bert.encoder.layer.0.attention.self.value.bias True
bert.encoder.layer.0.attention.output.dense.weight True
bert.encoder.layer.0.attention.output.dense.bias True
bert.encoder.layer.0.attention.output.LayerNorm.weight True
bert.encoder.layer.0.attention.output.LayerNorm.bias True
bert.encoder.layer.0.intermediate.dense.weight True
bert.encoder.layer.0.intermediate.dense.bias True
bert.encoder.layer.0.output.dense.weight True
bert.encoder.layer.0.output.dense.bias True
bert.encoder.layer.0.output.LayerNorm.weight True
bert.encoder.layer.0.output.LayerNorm.bias True
bert.encoder.layer.1.attention.self.query.weight True
bert.encoder.layer.1.attention.self.query.bias True
bert.encoder.layer.1.attention.self.key.weight True
bert.encoder.layer.1.attention.self.key.bias True
bert.encoder.layer.1.attention.self.value.weight True
bert.encoder.layer.1.attention.self.value.bias True
bert.encoder.layer.1.attention.output.dense.weight True
bert.encoder.layer.1.attention.output.dense.bias True
bert.encoder.layer.1.attention.output.LayerNorm.weight True
bert.encoder.layer.1.attention.output.LayerNorm.bias True
bert.encoder.layer.1.intermediate.dense.weight True
bert.encoder.layer.1.intermediate.dense.bias True
bert.encoder.layer.1.output.dense.weight True
bert.encoder.layer.1.output.dense.bias True
bert.encoder.layer.1.output.LayerNorm.weight True
bert.encoder.layer.1.output.LayerNorm.bias True
bert.encoder.layer.2.attention.self.query.weight True
bert.encoder.layer.2.attention.self.query.bias True
bert.encoder.layer.2.attention.self.key.weight True
bert.encoder.layer.2.attention.self.key.bias True
bert.encoder.layer.2.attention.self.value.weight True
bert.encoder.layer.2.attention.self.value.bias True
bert.encoder.layer.2.attention.output.dense.weight True
bert.encoder.layer.2.attention.output.dense.bias True
bert.encoder.layer.2.attention.output.LayerNorm.weight True
bert.encoder.layer.2.attention.output.LayerNorm.bias True
bert.encoder.layer.2.intermediate.dense.weight True
bert.encoder.layer.2.intermediate.dense.bias True
bert.encoder.layer.2.output.dense.weight True
bert.encoder.layer.2.output.dense.bias True
bert.encoder.layer.2.output.LayerNorm.weight True
bert.encoder.layer.2.output.LayerNorm.bias True
bert.encoder.layer.3.attention.self.query.weight True
bert.encoder.layer.3.attention.self.query.bias True
bert.encoder.layer.3.attention.self.key.weight True
bert.encoder.layer.3.attention.self.key.bias True
bert.encoder.layer.3.attention.self.value.weight True
bert.encoder.layer.3.attention.self.value.bias True
bert.encoder.layer.3.attention.output.dense.weight True
bert.encoder.layer.3.attention.output.dense.bias True
bert.encoder.layer.3.attention.output.LayerNorm.weight True
bert.encoder.layer.3.attention.output.LayerNorm.bias True
bert.encoder.layer.3.intermediate.dense.weight True
bert.encoder.layer.3.intermediate.dense.bias True
bert.encoder.layer.3.output.dense.weight True
bert.encoder.layer.3.output.dense.bias True
bert.encoder.layer.3.output.LayerNorm.weight True
bert.encoder.layer.3.output.LayerNorm.bias True
bert.encoder.layer.4.attention.self.query.weight True
bert.encoder.layer.4.attention.self.query.bias True
bert.encoder.layer.4.attention.self.key.weight True
bert.encoder.layer.4.attention.self.key.bias True
bert.encoder.layer.4.attention.self.value.weight True
bert.encoder.layer.4.attention.self.value.bias True
bert.encoder.layer.4.attention.output.dense.weight True
bert.encoder.layer.4.attention.output.dense.bias True
bert.encoder.layer.4.attention.output.LayerNorm.weight True
bert.encoder.layer.4.attention.output.LayerNorm.bias True
bert.encoder.layer.4.intermediate.dense.weight True
bert.encoder.layer.4.intermediate.dense.bias True
bert.encoder.layer.4.output.dense.weight True
bert.encoder.layer.4.output.dense.bias True
bert.encoder.layer.4.output.LayerNorm.weight True
bert.encoder.layer.4.output.LayerNorm.bias True
bert.encoder.layer.5.attention.self.query.weight True
bert.encoder.layer.5.attention.self.query.bias True
bert.encoder.layer.5.attention.self.key.weight True
bert.encoder.layer.5.attention.self.key.bias True
bert.encoder.layer.5.attention.self.value.weight True
bert.encoder.layer.5.attention.self.value.bias True
bert.encoder.layer.5.attention.output.dense.weight True
bert.encoder.layer.5.attention.output.dense.bias True
bert.encoder.layer.5.attention.output.LayerNorm.weight True
bert.encoder.layer.5.attention.output.LayerNorm.bias True
bert.encoder.layer.5.intermediate.dense.weight True
bert.encoder.layer.5.intermediate.dense.bias True
bert.encoder.layer.5.output.dense.weight True
bert.encoder.layer.5.output.dense.bias True
bert.encoder.layer.5.output.LayerNorm.weight True
bert.encoder.layer.5.output.LayerNorm.bias True
bert.encoder.layer.6.attention.self.query.weight True
bert.encoder.layer.6.attention.self.query.bias True
bert.encoder.layer.6.attention.self.key.weight True
bert.encoder.layer.6.attention.self.key.bias True
bert.encoder.layer.6.attention.self.value.weight True
bert.encoder.layer.6.attention.self.value.bias True
bert.encoder.layer.6.attention.output.dense.weight True
bert.encoder.layer.6.attention.output.dense.bias True
bert.encoder.layer.6.attention.output.LayerNorm.weight True
bert.encoder.layer.6.attention.output.LayerNorm.bias True
bert.encoder.layer.6.intermediate.dense.weight True
bert.encoder.layer.6.intermediate.dense.bias True
bert.encoder.layer.6.output.dense.weight True
bert.encoder.layer.6.output.dense.bias True
bert.encoder.layer.6.output.LayerNorm.weight True
bert.encoder.layer.6.output.LayerNorm.bias True
bert.encoder.layer.7.attention.self.query.weight True
bert.encoder.layer.7.attention.self.query.bias True
bert.encoder.layer.7.attention.self.key.weight True
bert.encoder.layer.7.attention.self.key.bias True
bert.encoder.layer.7.attention.self.value.weight True
bert.encoder.layer.7.attention.self.value.bias True
bert.encoder.layer.7.attention.output.dense.weight True
bert.encoder.layer.7.attention.output.dense.bias True
bert.encoder.layer.7.attention.output.LayerNorm.weight True
bert.encoder.layer.7.attention.output.LayerNorm.bias True
bert.encoder.layer.7.intermediate.dense.weight True
bert.encoder.layer.7.intermediate.dense.bias True
bert.encoder.layer.7.output.dense.weight True
bert.encoder.layer.7.output.dense.bias True
bert.encoder.layer.7.output.LayerNorm.weight True
bert.encoder.layer.7.output.LayerNorm.bias True
bert.encoder.layer.8.attention.self.query.weight True
bert.encoder.layer.8.attention.self.query.bias True
bert.encoder.layer.8.attention.self.key.weight True
bert.encoder.layer.8.attention.self.key.bias True
bert.encoder.layer.8.attention.self.value.weight True
bert.encoder.layer.8.attention.self.value.bias True
bert.encoder.layer.8.attention.output.dense.weight True
bert.encoder.layer.8.attention.output.dense.bias True
bert.encoder.layer.8.attention.output.LayerNorm.weight True
bert.encoder.layer.8.attention.output.LayerNorm.bias True
bert.encoder.layer.8.intermediate.dense.weight True
bert.encoder.layer.8.intermediate.dense.bias True
bert.encoder.layer.8.output.dense.weight True
bert.encoder.layer.8.output.dense.bias True
bert.encoder.layer.8.output.LayerNorm.weight True
bert.encoder.layer.8.output.LayerNorm.bias True
bert.encoder.layer.9.attention.self.query.weight True
bert.encoder.layer.9.attention.self.query.bias True
bert.encoder.layer.9.attention.self.key.weight True
bert.encoder.layer.9.attention.self.key.bias True
bert.encoder.layer.9.attention.self.value.weight True
bert.encoder.layer.9.attention.self.value.bias True
bert.encoder.layer.9.attention.output.dense.weight True
bert.encoder.layer.9.attention.output.dense.bias True
bert.encoder.layer.9.attention.output.LayerNorm.weight True
bert.encoder.layer.9.attention.output.LayerNorm.bias True
bert.encoder.layer.9.intermediate.dense.weight True
bert.encoder.layer.9.intermediate.dense.bias True
bert.encoder.layer.9.output.dense.weight True
bert.encoder.layer.9.output.dense.bias True
bert.encoder.layer.9.output.LayerNorm.weight True
bert.encoder.layer.9.output.LayerNorm.bias True
bert.encoder.layer.10.attention.self.query.weight True
bert.encoder.layer.10.attention.self.query.bias True
bert.encoder.layer.10.attention.self.key.weight True
bert.encoder.layer.10.attention.self.key.bias True
bert.encoder.layer.10.attention.self.value.weight True
bert.encoder.layer.10.attention.self.value.bias True
bert.encoder.layer.10.attention.output.dense.weight True
bert.encoder.layer.10.attention.output.dense.bias True
bert.encoder.layer.10.attention.output.LayerNorm.weight True
bert.encoder.layer.10.attention.output.LayerNorm.bias True
bert.encoder.layer.10.intermediate.dense.weight True
bert.encoder.layer.10.intermediate.dense.bias True
bert.encoder.layer.10.output.dense.weight True
bert.encoder.layer.10.output.dense.bias True
bert.encoder.layer.10.output.LayerNorm.weight True
bert.encoder.layer.10.output.LayerNorm.bias True
bert.encoder.layer.11.attention.self.query.weight True
bert.encoder.layer.11.attention.self.query.bias True
bert.encoder.layer.11.attention.self.key.weight True
bert.encoder.layer.11.attention.self.key.bias True
bert.encoder.layer.11.attention.self.value.weight True
bert.encoder.layer.11.attention.self.value.bias True
bert.encoder.layer.11.attention.output.dense.weight True
bert.encoder.layer.11.attention.output.dense.bias True
bert.encoder.layer.11.attention.output.LayerNorm.weight True
bert.encoder.layer.11.attention.output.LayerNorm.bias True
bert.encoder.layer.11.intermediate.dense.weight True
bert.encoder.layer.11.intermediate.dense.bias True
bert.encoder.layer.11.output.dense.weight True
bert.encoder.layer.11.output.dense.bias True
bert.encoder.layer.11.output.LayerNorm.weight True
bert.encoder.layer.11.output.LayerNorm.bias True
bert.pooler.dense.weight True
bert.pooler.dense.bias True
classifier.weight True
classifier.bias True
Run Code Online (Sandbox Code Playgroud)