ValueError: Could not find matching function to call loaded from the SavedModel

Question

Tax*_*xel 6 python tensorflow tensorflow-agents

I am trying to load a tf-agents policy that I saved as follows:

try:
    PolicySaver(collect_policy).save(model_dir + 'collect_policy')
except TypeError:
    tf.saved_model.save(collect_policy, model_dir + 'collect_policy')

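A quick explanation of the try/except block: when the policy is first created, I can save it with PolicySaver, but when I load it again for another training run it is a SavedModel and can no longer be saved with PolicySaver.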

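This seems to work fine, but now I want to use this policy for self-play, so I loaded it in my AIPlayer class with self.policy = tf.saved_model.load(policy_path). However, when I try to use it for prediction, it does not work. Here is the (test) code: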

def decide(self, table):
    state = table.getState()
    timestep = ts.restart(np.array([table.getState()], dtype=np.float64))
    prediction = self.policy.action(timestep)
    print(prediction)

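The table passed to the function contains the state of the game, and the ts.restart() call is copied from my custom PyEnvironment, so the time step is constructed in exactly the same way as it would be in the environment. However, I get the following error message for the line prediction = self.policy.action(timestep):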

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
  Positional arguments (2 total):
    * TimeStep(step_type=<tf.Tensor 'time_step:0' shape=() dtype=int32>, reward=<tf.Tensor 'time_step_1:0' shape=() dtype=float32>, discount=<tf.Tensor 'time_step_2:0' shape=() dtype=float32>, observation=<tf.Tensor 'time_step_3:0' shape=(1, 79) dtype=float64>)
    * ()
  Keyword arguments: {}

Expected these arguments to match one of the following 2 option(s):

Option 1:
  Positional arguments (2 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='time_step/step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='time_step/discount'), observation=TensorSpec(shape=(None, 79), dtype=tf.float64, name='time_step/observation'))
    * ()
  Keyword arguments: {}

Option 2:
  Positional arguments (2 total):
    * TimeStep(step_type=TensorSpec(shape=(None,), dtype=tf.int32, name='step_type'), reward=TensorSpec(shape=(None,), dtype=tf.float32, name='reward'), discount=TensorSpec(shape=(None,), dtype=tf.float32, name='discount'), observation=TensorSpec(shape=(None, 79), dtype=tf.float64, name='observation'))
    * ()
  Keyword arguments: {}

What am I doing wrong? Is it really just the tensor names, or is it a shape problem, and how can I change that?

Any ideas on how to debug this further are appreciated.

Answer 1

Tax*_*xel 5

I got it working by constructing the TimeStep manually:

    step_type = tf.convert_to_tensor(
        [0], dtype=tf.int32, name='step_type')
    reward = tf.convert_to_tensor(
        [0], dtype=tf.float32, name='reward')
    discount = tf.convert_to_tensor(
        [1], dtype=tf.float32, name='discount')
    observations = tf.convert_to_tensor(
        [state], dtype=tf.float64, name='observations')
    timestep = ts.TimeStep(step_type, reward, discount, observations)
    prediction = self.policy.action(timestep)

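One way to read the error message is to compare, field by field, the shapes that were passed with the shapes the SavedModel signature expects. The sketch below does that in plain Python using only the shapes printed in the question. The helpers shape_matches and mismatches are hypothetical names made up for illustration, and the rule they apply (same rank, with None accepting any size) is a simplification, not TensorFlow's actual matching code. The manual shapes assume state is a flat vector of 79 values, as the (1, 79) observation in the error suggests.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Shape = Tuple[Optional[int], ...]


def shape_matches(actual: Sequence[int], spec: Shape) -> bool:
    """Return True if a concrete shape satisfies a spec shape.

    A None entry in the spec accepts any size, but the number of
    dimensions (the rank) must be identical. This is an illustrative
    simplification, not TensorFlow's own implementation.
    """
    if len(actual) != len(spec):
        return False
    return all(s is None or s == a for a, s in zip(actual, spec))


def mismatches(got: Dict[str, Sequence[int]], spec: Dict[str, Shape]) -> List[str]:
    """List the field names whose shape does not satisfy the spec."""
    return [name for name in spec if not shape_matches(got[name], spec[name])]


# Expected shapes, copied from "Option 1" / "Option 2" in the error message.
expected = {
    "step_type": (None,),
    "reward": (None,),
    "discount": (None,),
    "observation": (None, 79),
}

# Shapes actually passed, copied from the "Got:" part of the error message.
got_from_restart = {
    "step_type": (),
    "reward": (),
    "discount": (),
    "observation": (1, 79),
}

# Shapes of the manually built TimeStep (every field has batch size 1).
got_manual = {
    "step_type": (1,),
    "reward": (1,),
    "discount": (1,),
    "observation": (1, 79),
}

print(mismatches(got_from_restart, expected))  # ['step_type', 'reward', 'discount']
print(mismatches(got_manual, expected))        # []
```

Read this way, the observation already had a batch dimension, but step_type, reward and discount were scalars of shape () where the signature expects shape (None,). The manual construction wraps each of them in a one-element list, which adds the missing batch dimension. The dtypes were the same in both cases, and the working code names its tensor 'observations' rather than 'observation', which suggests the tensor names were not what made the call fail.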

Archived:	6 years ago
Views:	3037
Last activity:	6 years ago