Extracting tensorflow-serving gRPC response results with .float_val is very slow

Den*_*ido 6 python grpc tensorflow tensorflow-serving

For some reason, extracting the results with .float_val takes a very long time.

Example scenario and its output:

import time

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# Excerpt from inside a class method (hence the references to `self`).
t2 = time.time()
# Raise the gRPC receive limit so large responses are not rejected.
options = [('grpc.max_receive_message_length', 100 * 4000 * 4000)]
channel = grpc.insecure_channel('{host}:{port}'.format(host='localhost', port=str(self.serving_grpc_port)), options=options)
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
request = predict_pb2.PredictRequest()
request.model_spec.name = 'ivi-detector'
request.model_spec.signature_name = 'serving_default'

request.inputs['inputs'].CopyFrom(tf.make_tensor_proto(imgs_array, shape=imgs_array.shape))
res = stub.Predict(request, 100.0)  # second argument is the timeout in seconds

print("Time to detect:")
t3 = time.time(); print("t3:", t3 - t2)

# Time each .float_val access separately.
t11 = time.time()
boxes_float_val = res.outputs['detection_boxes'].float_val
t12 = time.time(); print("t12:", t12 - t11)
classes_float_val = res.outputs['detection_classes'].float_val
t13 = time.time(); print("t13:", t13 - t12)
scores_float_val = res.outputs['detection_scores'].float_val
t14 = time.time(); print("t14:", t14 - t13)

# Reshape the flat value lists into per-image arrays.
boxes = np.reshape(boxes_float_val, [len(imgs_array), self.max_total_detections, 4])
classes = np.reshape(classes_float_val, [len(imgs_array), self.max_total_detections])
scores = np.reshape(scores_float_val, [len(imgs_array), self.max_total_detections])
t15 = time.time(); print("t15:", t15 - t14)


Time to detect:
t3: 1.4687104225158691
t12: 1.9140026569366455
t13: 3.719329833984375e-05
t14: 9.298324584960938e-06
t15: 0.0008063316345214844


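TensorFlow Serving is running an object detection model (faster_rcnn_resnet101) from the TensorFlow Object Detection API. As the timings show, extracting the detected boxes takes longer than the prediction itself.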
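To narrow down which step is responsible, it may help to time each access on its own with a monotonic clock. The following is only a minimal sketch and not part of my original script: the `timed` helper and the label are hypothetical names, and the `sum(range(...))` call merely stands in for a real step (for example looking up `res.outputs['detection_boxes']` by itself, and then reading `.float_val` separately).

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

# Placeholder workload; in the real script this would be a single step,
# such as looking up the output tensor or reading its float_val field.
with timed("placeholder_step", timings):
    total = sum(range(1000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f} s")
```

Wrapping the map lookup and the field access in separate `timed` blocks would show whether the delay sits in the first touch of the response or in `.float_val` itself.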

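The detected boxes currently have shape [batch_size, 100, 4], where 100 is the maximum number of detections. As a workaround, I can lower the maximum number of detections, which significantly reduces the time needed to extract these values, but it still remains unnecessarily high (in my opinion).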
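For a sense of scale, the size of the boxes tensor can be worked out from that shape. This is a rough sketch that assumes a hypothetical batch size of 8 (the actual batch size is not stated here) and 4-byte float32 values:

```python
# Assumption: batch_size is hypothetical; the real value is not given.
batch_size = 8
max_detections = 100    # from the shape [batch_size, 100, 4]
coords_per_box = 4
bytes_per_float32 = 4

num_values = batch_size * max_detections * coords_per_box
num_bytes = num_values * bytes_per_float32

print(f"{num_values} floats, about {num_bytes / 1024:.1f} KiB")
```

Under that assumption the tensor holds 3200 values (12.5 KiB), which is why a delay of nearly two seconds looks out of proportion to the amount of data.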

I am using tensorflow-serving 2.3.0-gpu as a Docker container, together with tensorflow-serving-api==2.3.0.

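It is also worth noting that I tried to reproduce this behavior with a public saved model (trained purely on ImageNet) and the slow .float_val access did not occur, which suggests the problem is specific to my custom-trained model. I have tried exporting the saved model from the .ckpt files in different ways, but the problem persists. If I apply any of those export methods to the downloaded model (which ships with both .ckpt files and saved_model format files), the problem does not occur, so the export method itself does not appear to be the cause.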

So now I suspect that something is wrong or different about the model I trained... but why? Does it even make sense for that to affect .float_val in tensorflow-serving-api?
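One way to look for a difference between the two models might be to compare how each response's output tensors are populated. The sketch below is an illustration only: `describe_tensor` is a hypothetical helper that reads the `TensorProto` fields `dtype`, `tensor_shape`, `float_val` and `tensor_content`, and the stand-in object built with `SimpleNamespace` replaces a real `res.outputs['detection_boxes']` so that the snippet runs without a server.

```python
from types import SimpleNamespace


def describe_tensor(tensor_proto):
    """Summarize which fields of a TensorProto-like object carry the data."""
    return {
        "dtype": tensor_proto.dtype,
        "shape": [dim.size for dim in tensor_proto.tensor_shape.dim],
        "float_val_len": len(tensor_proto.float_val),
        "tensor_content_bytes": len(tensor_proto.tensor_content),
    }


# Stand-in for a real response tensor (values are made up for the demo).
fake_boxes = SimpleNamespace(
    dtype=1,  # DT_FLOAT in TensorFlow's DataType enum
    tensor_shape=SimpleNamespace(
        dim=[SimpleNamespace(size=1), SimpleNamespace(size=2), SimpleNamespace(size=4)]
    ),
    float_val=[0.0] * 8,
    tensor_content=b"",
)

summary = describe_tensor(fake_boxes)
print(summary)
```

Running the same helper on the responses from the custom model and from the public model would show whether they differ in dtype, shape, or in which field holds the values.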

The code I used (where the results come back quickly): https://github.com/denisb411/tfserving-od/blob/master/inference-using-tfserving-docker.ipynb

I don't know how to proceed, because my custom training uses almost the same pipeline.config as the original training, so the training process is no different.

How can I fix this? And what, if anything, does it have to do with .float_val?

Assuming this is a bug, I opened a GitHub issue about this problem some time ago, but it has not received enough attention.
