Implementing the CRF layer of a BiLSTM-CRF in TensorFlow 1.15

Des*_*eux 5 crf deep-learning lstm keras tensorflow

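I implemented a bidirectional long short-term memory network with a conditional random field layer (BiLSTM-CRF) using keras and keras_contrib (the latter provides the CRF, which is not part of native keras functionality). The task is named entity recognition, with each token classified into one of 6 classes. The input to the network is a sequence of 300-dimensional pretrained GloVe word embeddings. This is my model summary: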

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 648)               0   
_________________________________________________________________
embedding_1 (Embedding)      (None, 648, 300)          1500000   
_________________________________________________________________
bidirectional_1 (Bidirection (None, 648, 1000)         3204000   
_________________________________________________________________
crf_1 (CRF)                  (None, 648, 6)            6054      
=================================================================
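As a sanity check on the summary, the parameter counts can be reproduced by hand. The sketch below assumes the standard LSTM layout (four gates, each with an input kernel, a recurrent kernel, and a bias) and assumes the CRF layer holds a kernel, a transition matrix, a bias, and two boundary vectors; the helper names are hypothetical. Under those assumptions the reported counts correspond to a vocabulary of 5000, 500 LSTM units per direction (an output width of 1000), and 6 tags, which differs from the `MAX_WORDS` and `HIDDEN_SIZE` constants shown in the code listings.

```python
def embedding_params(vocab_size, embed_dim):
    # One embedding vector per vocabulary entry.
    return vocab_size * embed_dim


def bilstm_params(input_dim, units):
    # Per direction: 4 gates, each with an input kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias (units).
    per_direction = 4 * (units * (input_dim + units) + units)
    return 2 * per_direction


def crf_params(input_dim, num_tags):
    # Assumed layout: kernel + transition matrix + bias + left/right boundary.
    return input_dim * num_tags + num_tags * num_tags + 3 * num_tags


# Values inferred from the summary, not taken from the code listings.
vocab_size, embed_dim, units, num_tags = 5000, 300, 500, 6

embedding_count = embedding_params(vocab_size, embed_dim)
bilstm_count = bilstm_params(embed_dim, units)
crf_count = crf_params(2 * units, num_tags)

print(embedding_count, bilstm_count, crf_count)
```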

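Now I want to implement the same model in TensorFlow 1.15. Since the keras_contrib CRF module only works with keras and not with TensorFlow, I used a CRF implementation built for TensorFlow 1.X from this repository. The repository includes two nice example implementations of a CRF (here), but each of them produces a different error when trained on my data.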

Implementation 1

from tensorflow.keras.layers import Bidirectional, Embedding, LSTM, TimeDistributed
from tensorflow.keras.models import Sequential

from tf_crf_layer.layer import CRF
from tf_crf_layer.loss import crf_loss
from tf_crf_layer.metrics import crf_accuracy

MAX_WORDS = 50000
EMBEDDING_LENGTH = 300
MAX_SEQUENCE_LENGTH = 648
HIDDEN_SIZE = 512

model = Sequential()
model.add(Embedding(MAX_WORDS, EMBEDDING_LENGTH, input_length=MAX_SEQUENCE_LENGTH, mask_zero=True, weights=[embedding_matrix], trainable=False))
model.add(Bidirectional(LSTM(HIDDEN_SIZE, return_sequences=True)))
model.add(CRF(len(labels)))

model.compile('adam', loss=crf_loss, metrics=[crf_accuracy])

This is the error I get when I try to compile the model:

File "/.../tf_crf_layer/metrics/crf_accuracy.py", line 48, in crf_accuracy
    crf, idx = y_pred._keras_history[:2]

AttributeError: 'Tensor' object has no attribute '_keras_history'

The error is raised inside crf_accuracy from the repository mentioned above:

def crf_accuracy(y_true, y_pred):
    """
    Get default accuracy based on CRF `test_mode`.
    """
    import pdb; pdb.set_trace()
    crf, idx = y_pred._keras_history[:2]
    if crf.test_mode == 'viterbi':
        return crf_viterbi_accuracy(y_true, y_pred)
    else:
        return crf_marginal_accuracy(y_true, y_pred)

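Apparently, according to this thread, this kind of error occurs when a tensor object is not the output of a keras layer. Why does the error appear here?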
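To make the failure mode concrete: `crf_accuracy` looks up the CRF layer through `y_pred._keras_history`, so it fails whenever the tensor handed to the metric does not carry that attribute. One way to avoid the lookup, sketched here under assumptions, is to capture the layer in a closure instead. `make_crf_accuracy`, `FakeCRF`, and `PlainTensor` are hypothetical names used only for illustration, and whether the repository's underlying accuracy functions perform the same lookup themselves is not verified here.

```python
def make_crf_accuracy(crf_layer, viterbi_accuracy, marginal_accuracy):
    """Build a metric that reads `test_mode` from a layer captured by closure."""
    def crf_accuracy(y_true, y_pred):
        # No `y_pred._keras_history` lookup: the layer comes from the closure.
        if crf_layer.test_mode == 'viterbi':
            return viterbi_accuracy(y_true, y_pred)
        return marginal_accuracy(y_true, y_pred)
    return crf_accuracy


class FakeCRF:
    """Stand-in for a CRF layer; only `test_mode` matters here."""
    test_mode = 'viterbi'


class PlainTensor:
    """Stand-in for a tensor that carries no Keras metadata."""


# Dummy accuracy functions that just report which branch was taken.
metric = make_crf_accuracy(
    FakeCRF(),
    viterbi_accuracy=lambda y_true, y_pred: 'viterbi',
    marginal_accuracy=lambda y_true, y_pred: 'marginal',
)

y_pred = PlainTensor()
result = metric(None, y_pred)
print(hasattr(y_pred, '_keras_history'), result)
```

With a real model, the same factory would be called with the CRF layer instance and the library's two accuracy functions, and the returned function passed in `metrics=[...]`.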

Implementation 2

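from tensorflow.keras.layers import Bidirectional, Embedding, LSTM
from tensorflow.keras.models import Sequential

# MAX_WORDS, EMBEDDING_LENGTH, MAX_SEQUENCE_LENGTH and HIDDEN_SIZE are defined as in Implementation 1.
from tf_crf_layer.layer import CRF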
from tf_crf_layer.loss import crf_loss, ConditionalRandomFieldLoss
from tf_crf_layer.metrics import crf_accuracy
from tf_crf_layer.metrics.sequence_span_accuracy import SequenceSpanAccuracy

model = Sequential()
model.add(Embedding(MAX_WORDS, EMBEDDING_LENGTH, input_length=MAX_SEQUENCE_LENGTH, mask_zero=True, weights=[embedding_matrix], trainable=False))
model.add(Bidirectional(LSTM(HIDDEN_SIZE, return_sequences=True)))
model.add(CRF(len(labels), name="crf_layer"))

model.summary()

crf_loss_instance = ConditionalRandomFieldLoss()  
model.compile(loss={"crf_layer": crf_loss_instance}, optimizer='adam', metrics=[SequenceSpanAccuracy()])

Here the model compiles, but as soon as the first training epoch starts, this error appears:

InvalidArgumentError: Expected begin and size arguments to be 1-D tensors of size 3, but got shapes [2] and [2] instead.
     [[{{node loss_4/crf_layer_loss/Slice_1}}]]

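I am training the model with mini-batches; could that explain the error? I also noticed that the CRF layer in this model summary is missing a dimension (compare the CRF layer's output shape in the summary above with the one below), even though the layer has the same number of parameters as before. What causes this mismatch, and how can I fix it?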

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_5 (Embedding)      (None, 648, 300)          1500000   
_________________________________________________________________
bidirectional_5 (Bidirection (None, 648, 1000)         3204000   
_________________________________________________________________
crf_layer (CRF)              (None, 648)               6054      
=================================================================
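The error message says that a slice operation received a rank-3 tensor together with rank-2 `begin` and `size` arguments, which points to a rank mismatch rather than to mini-batching as such. One possible cause, stated here as an assumption and not confirmed by the post, is that the labels are one-hot encoded with shape `(batch, 648, 6)`, as the keras_contrib layer's `(None, 648, 6)` output would suggest, while this CRF layer reports an output of shape `(None, 648)`, that is, one tag id per token. The NumPy sketch below only illustrates the two label layouts and the conversion between them; the variable names are hypothetical.

```python
import numpy as np

batch_size, seq_len, num_tags = 2, 648, 6

# Hypothetical integer tag ids, one per token: shape (batch, seq_len).
rng = np.random.default_rng(0)
tag_ids = rng.integers(0, num_tags, size=(batch_size, seq_len))

# One-hot layout: shape (batch, seq_len, num_tags), rank 3.
one_hot_labels = np.eye(num_tags, dtype=np.int32)[tag_ids]

# Collapsing the last axis recovers rank-2 integer labels.
sparse_labels = one_hot_labels.argmax(axis=-1)

print(one_hot_labels.shape, sparse_labels.shape)
```

If the labels were in fact one-hot encoded, passing the rank-2 integer form as the training targets would be the first thing to try.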