Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled

Question

Rad*_*dhi 17 kubeflow tensorflow2.0 kubeflow-pipelines

When running a Kubeflow pipeline whose code uses TensorFlow 2.0, the following warning is shown at the end of every epoch:

W tensorflow/core/kernels/data/generator_dataset_op.cc:103] Error occurred when finalizing GeneratorDataset iterator: Cancelled: Operation was cancelled

In addition, after some epochs it stops printing logs and shows this error:

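This step is in Failed state with this message: The node was low on resource: memory. Container main was using 100213872Ki, which exceeds its request of 0. Container wait was using 25056Ki, which exceeds its request of 0.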

Answer 1

小智 5

In my case, batch_size and steps_per_epoch did not match.

For example,

his = Test_model.fit_generator(datagen.flow(trainrancrop_images, trainrancrop_labels, batch_size=batchsize),
                               steps_per_epoch=len(trainrancrop_images)/batchsize,
                               validation_data=(test_images, test_labels),
                               epochs=1,
                               callbacks=[callback])

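The batch_size passed to datagen.flow must correspond to the steps_per_epoch passed to Test_model.fit_generator (in fact, I had been using the wrong value for steps_per_epoch).

I guess this is one of the cases that triggers the error.

So I think the problem appears when the batch size and the number of steps (iterations) do not correspond correctly.

When you get the steps by division, the result is a float, which can be a problem...

Check your code for this issue.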
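
To illustrate the point about division, here is a minimal sketch of one way to get a whole-number step count. It assumes the generator yields ceil(N / batch_size) batches per pass, with a smaller final batch, which is my understanding of how datagen.flow behaves. The helper name steps_for and the sample numbers are hypothetical and not from my original code.

```python
import math


def steps_for(num_samples: int, batch_size: int) -> int:
    """Return the whole number of batches needed to cover num_samples once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Round up so the final, smaller batch is counted as one step.
    return math.ceil(num_samples / batch_size)


# Hypothetical sizes, for illustration only.
num_samples = 1000
batch_size = 32

# Plain division gives a float, which does not describe a real number of batches.
naive_steps = num_samples / batch_size

# Rounded up to an int, the step count matches the batches one pass produces.
steps = steps_for(num_samples, batch_size)

print(naive_steps, steps)
```

With the names from the example above, this would be passed as steps_per_epoch=steps_for(len(trainrancrop_images), batchsize), using the same batchsize that is given to datagen.flow. If you would rather skip the last partial batch, floor division (len(trainrancrop_images) // batchsize) is the other common choice.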

Good luck :)