Unable to load a TF transformer model with keras.models.load_model()

Tre*_*ich 5 python keras tensorflow huggingface-transformers

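I have a model that was trained in SageMaker (a custom training job) and saved by my training script with the Keras model.save() method, which produces a variables directory containing the weights and index, plus a .pb file. The model is a TFBertForSequenceClassification from Hugging Face's transformers library, which according to their documentation is a subclass of a Keras model. However, when I try to load the model with keras.models.load_model(), I get the following error: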

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py", line 187, in load_model
    return saved_model_load.load(filepath, compile, options)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 121, in load
    path, options=options, loader_cls=KerasObjectLoader)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 633, in load_internal
    ckpt_options)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 194, in __init__
    super(KerasObjectLoader, self).__init__(*args, **kwargs)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 130, in __init__
    self._load_all()
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 215, in _load_all
    self._layer_nodes = self._load_layers()
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 315, in _load_layers
    layers[node_id] = self._load_layer(proto.user_object, node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 341, in _load_layer
    obj, setter = self._revive_from_config(proto.identifier, metadata, node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 368, in _revive_from_config
    obj, self._proto.nodes[node_id], node_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
    obj_child, child_proto, child_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 298, in _add_children_recreated_from_config
    obj_child, child_proto, child_id)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/load.py", line 250, in _add_children_recreated_from_config
    metadata = json_utils.decode(proto.user_object.metadata)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/site-packages/tensorflow/python/keras/saving/saved_model/json_utils.py", line 60, in decode
    return json.loads(json_string, object_hook=_decode_helper)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/tyarosevich/anaconda3/envs/fresh_env/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

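I'm stumped. The transformers library's own save_pretrained() method saves the layer information in a .json file, but I don't see why the Keras model save would know or care about that (and I don't think that's the problem anyway). Any help?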
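For what it's worth, the last frame of the traceback can be reproduced with the standard library alone. The sketch below only shows that json.loads() raises this exact message when it is handed an empty string, which suggests (but does not prove) that the metadata field read by json_utils.decode was empty for at least one saved object. The helper name describe_metadata is hypothetical and is not part of Keras or transformers.

```python
import json


def describe_metadata(raw):
    """Hypothetical helper: report whether a metadata string parses as JSON."""
    try:
        json.loads(raw)
        return "ok"
    except json.JSONDecodeError as err:
        # Rebuild the message from the exception's own fields.
        return f"{err.msg}: line {err.lineno} column {err.colno} (char {err.pos})"


# An empty string fails at the very first character, as in the traceback.
empty_result = describe_metadata("")
# A small, well-formed JSON object parses without error.
valid_result = describe_metadata('{"class_name": "Dense"}')

print(empty_result)
print(valid_result)
```

This says nothing about why the metadata would be empty; it only narrows down what the JSON decoder was given.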

小智 0

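Another option is to build your own classifier with the transformer as the first layer and put your classifier (and output) on top of it. Then use model.save() and tf.keras.models.load_model(model_path) as follows: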

Important (!): note how the first layer is used. Thanks to Utpal Chakraborty for contributing the solution: Issues with saving and loading a TensorFlow model that uses a Hugging Face Transformer model as the first layer

import tensorflow as tf
from tensorflow.keras import Model
from transformers import (
    AutoConfig,
    AutoTokenizer,
    TFAutoModel)
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Input

#use GPU
gpus = tf.config.experimental.list_physical_devices('GPU')
print(gpus)
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)

# number of target classes; also used for the output layer below
num_labels = 4
config = AutoConfig.from_pretrained('bert-base-uncased', output_hidden_states=True, num_labels=num_labels)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

transformer_layer = TFAutoModel.from_pretrained('bert-base-uncased',config=config, from_pt=False)

# optional - freeze all layers (use the public `trainable` attribute):
for layer in transformer_layer.layers:
    layer.trainable = False

input_word_ids = Input(shape=(512,), dtype=tf.int32, name="input_ids")
mask = Input(shape=(512,), dtype=tf.int32, name="attention_mask")

#note this critical call to inner model layer
embedding = transformer_layer.bert(input_word_ids, mask)[0]

#take only the CLS embedding
hidden = tf.keras.layers.Dense(768, activation='relu')(embedding[:,0,:])

out = Dense(num_labels, activation='softmax')(hidden)

#Build the model (it is compiled further below)
model = Model(inputs=[input_word_ids, mask], outputs=out)
model.summary()  # summary() prints itself and returns None

optimizer = Adam(learning_rate=5e-05)
metric = tf.keras.metrics.CategoricalAccuracy('accuracy')
model.compile(optimizer=optimizer, loss=tf.keras.losses.CategoricalCrossentropy(), metrics=[metric])

#Then fit the model
#.....

#Now save
model_dir = './tmp/model'
model.save(model_dir)

#test it:
model = tf.keras.models.load_model(model_dir)
