Using Keras, how can I load weights generated by CuDNNLSTM into an LSTM model?

Question

leo*_*ory 5 python neural-network keras tensorflow cudnn

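I developed a neural network model with Keras, based on LSTM layers. To speed up training on Paperspace (a GPU cloud processing infrastructure), I switched the LSTM layer to the new CuDNNLSTM layer. However, that layer is only usable on machines with GPU cuDNN support. PS: CuDNNLSTM is only available on Keras master, not in the latest release.

So I generated the weights on the cloud and saved them in hdf5 format, and I would like to use them locally on my MacBook. Since the CuDNNLSTM layer is not available there, I switched back to LSTM for my local installation only.

After reading this tweet about CuDNN from @fchollet, I thought it would work fine to simply read the weights back into the LSTM model.

However, when I try to import them, Keras throws this error: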

Traceback (most recent call last):
{...}
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 2048 and 4096 for 'Assign_2' (op: 'Assign') with input shapes: [2048], [4096].
{...}
ValueError: Dimension 0 in both shapes must be equal, but are 2048 and 4096 for 'Assign_2' (op: 'Assign') with input shapes: [2048], [4096]

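Analyzing the hdf5 files with h5cat, I can see that the two structures are different.

TL;DR

I cannot load weights generated by CuDNNLSTM into an LSTM model. Am I doing something wrong? How can I get them to work together seamlessly?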

Here is my model:

[Model code not preserved in the source.]


Answer 1

Yu-*_*ang 5

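The reason is that the CuDNNLSTM layer has a bias that is twice as large as that of LSTM. This comes from the underlying implementation in the cuDNN API: the LSTM equations in the cuDNN user guide differ from the usual LSTM equations in their bias terms.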
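For reference, here is a sketch of that comparison for the input gate only. The symbol names are an assumption, chosen to follow the notation I recall from the cuDNN documentation ($W$ for input weights, $R$ for recurrent weights); the other gates follow the same pattern.

```latex
% cuDNN form: separate bias terms for the input and recurrent parts
i_t = \sigma\left(W_i x_t + R_i h_{t-1} + b_{W_i} + b_{R_i}\right)

% Common LSTM form: a single bias term per gate
i_t = \sigma\left(W_i x_t + R_i h_{t-1} + b_i\right)

% The two forms agree when
b_i = b_{W_i} + b_{R_i}
```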

CuDNN uses two bias terms, so the number of bias weights is doubled. To convert them back to what LSTM uses, the two bias terms need to be summed.
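The summation can be sketched with NumPy as follows. This is only an illustration of the bias step, not the merged conversion code: `merge_cudnn_bias` is a hypothetical helper name, and the layout (first half of the vector holding the input biases, second half the recurrent biases) is an assumption. The sizes match the error message in the question (4096 versus 2048, i.e. 512 units with 4 gates).

```python
import numpy as np


def merge_cudnn_bias(cudnn_bias):
    """Sum the two halves of a cuDNN-style bias vector.

    Hypothetical helper. Assumes the vector is laid out as
    [input biases | recurrent biases], each half covering all gates.
    """
    cudnn_bias = np.asarray(cudnn_bias)
    if cudnn_bias.ndim != 1 or cudnn_bias.shape[0] % 2 != 0:
        raise ValueError("expected a 1-D bias vector of even length")
    input_bias, recurrent_bias = np.split(cudnn_bias, 2)
    return input_bias + recurrent_bias


units = 512  # 4 gates * 512 units = 2048, as in the error message
cudnn_bias = np.arange(8 * units, dtype=np.float32)  # dummy values, shape (4096,)
lstm_bias = merge_cudnn_bias(cudnn_bias)

print(cudnn_bias.shape, lstm_bias.shape)  # (4096,) (2048,)
```

A full conversion may also need to adjust the kernel matrices, so this sketch should not be read as a complete weight converter.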

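I submitted a PR for this conversion, and it has been merged. You can install the latest Keras from GitHub, which should solve the weight-loading problem.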

Archived:	8 years, 4 months ago
Viewed:	1812 times
Last active:	7 years, 2 months ago