Retrained inception_v3 model deployed in Cloud ML Engine always outputs the same predictions

hec*_*rga 4 machine-learning computer-vision google-cloud-platform tensorflow google-cloud-ml

I followed the TensorFlow For Poets codelab to do transfer learning with inception_v3. It produces a retrained_graph.pb and a retrained_labels.txt file, which can be used to make predictions locally (running label_image.py).

Then I wanted to deploy this model to Cloud ML Engine so that I could make online predictions. For that, I had to export retrained_graph.pb in the SavedModel format. I managed to do it by following the instructions in this answer from Google's @rhaertel80 and this Python file from the Flowers Cloud ML Engine tutorial. Here is my code:

import tensorflow as tf
from tensorflow.contrib import layers

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

def build_signature(inputs, outputs):
    signature_inputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in inputs.items() }
    signature_outputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in outputs.items() }

    signature_def = signature_def_utils.build_signature_def(
        signature_inputs,
        signature_outputs,
        signature_constants.PREDICT_METHOD_NAME
    )

    return signature_def

class GraphReferences(object):
  def __init__(self):
    self.examples = None
    self.train = None
    self.global_step = None
    self.metric_updates = []
    self.metric_values = []
    self.keys = None
    self.predictions = []
    self.input_jpeg = None

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def build_image_str_tensor(self):
        image_str_tensor = tf.placeholder(tf.string, shape=[None])

        def decode_and_resize(image_str_tensor):
            return image_str_tensor

        image = tf.map_fn(
            decode_and_resize,
            image_str_tensor,
            back_prop=False,
            dtype=tf.string
        )

        return image_str_tensor

    def build_prediction_graph(self, g):
        tensors = GraphReferences()
        tensors.examples = tf.placeholder(tf.string, name='input', shape=(None,))
        tensors.input_jpeg = self.build_image_str_tensor()

        keys_placeholder = tf.placeholder(tf.string, shape=[None])
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }

        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }

        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, name="")

            g = tf.get_default_graph()
            inputs, outputs = self.build_prediction_graph(g)

            signature_def = build_signature(inputs=inputs, outputs=outputs)
            signature_def_map = {
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
            }

            builder = saved_model_builder.SavedModelBuilder(output_dir)
            builder.add_meta_graph_and_variables(
                sess,
                tags=[tag_constants.SERVING],
                signature_def_map=signature_def_map
            )
            builder.save()

model = Model(label_count)
model.export(export_dir)

This code generates a saved_model.pb file, which I then use to create the Cloud ML Engine model. I can get predictions from this model with gcloud ml-engine predict --model my_model_name --json-instances request.json, where the contents of request.json are:

{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }
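For reference, a request line like the one above can be generated with a few lines of Python. This is only a sketch (make_instance is a hypothetical helper, and the jpeg bytes here are fake placeholder data; in practice they would come from reading an image file):

```python
import base64
import json

def make_instance(jpeg_bytes, key='0'):
    # The prediction service marks binary payloads with a {"b64": ...} object;
    # the input alias must end in "_bytes" (as "image_bytes" does) for the
    # service to base64-decode it before feeding the graph.
    return {
        'key': key,
        'image_bytes': {'b64': base64.b64encode(jpeg_bytes).decode('utf-8')}
    }

# In practice: jpeg_bytes = open('some_image.jpg', 'rb').read()
line = json.dumps(make_instance(b'\xff\xd8\xff\xe0fake-jpeg-data'))
print(line)
```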

However, no matter which jpeg I encode in the request, I always get exactly the same wrong predictions:

[Prediction output screenshot]

My guess is that the problem lies in how the CloudML Prediction API passes the base64-encoded image bytes to the input tensor "DecodeJpeg/contents:0" of inception_v3 (the "build_image_str_tensor()" method in the previous code). Any clue on how I can solve this problem and have my locally retrained model serve correct predictions on Cloud ML Engine?

(Just to make it clear, the problem is not in retrained_graph.pb, since it makes correct predictions when I run it locally; nor in request.json, since the same request file worked without problems when following the Flowers Cloud ML Engine tutorial pointed out above.)

rha*_*l80 5

First, a general warning. The TensorFlow for Poets codelab was not written in a way that is very amenable to production serving (partly manifested by the workarounds you have to implement). You would normally export a prediction-specific graph that doesn't contain all of the extra training ops. So while we can try to hack something together that works, extra work may be needed to productionize this graph.

The approach of your code appears to be to import one graph, add some placeholders, and then export the result. That is generally fine. However, in the code shown in the question, you are adding input placeholders without actually connecting them to anything in the imported graph. You end up with a graph containing multiple disconnected subgraphs, something like (pardon the crude diagram):

image_str_tensor [input=image_bytes] -> <nothing>
keys_placeholder [input=key]  -> identity [output=key]
inception_subgraph -> final_graph [output=prediction]

By inception_subgraph I mean all of the ops that you are importing.

So image_bytes is effectively a no-op and is ignored; key gets passed through; and prediction contains the result of running the inception_subgraph; since it isn't using the input you are passing, it returns the same result every time (though I admit I actually expected an error here).

To address this problem, we need to connect the placeholder you have created to the one that already exists in inception_subgraph, to create a graph more or less like this:

image_str_tensor [input=image_bytes] -> inception_subgraph -> final_graph [output=prediction]
keys_placeholder [input=key]  -> identity [output=key]   

Note that image_str_tensor is expected to be a batch of images, as required by the prediction service, but the inception graph's input is actually a single image. For the sake of simplicity, we're going to address this in a hacky way: we'll assume we'll be sending images one at a time. If we ever send more than one image per request, we'll get an error. Also, batch prediction will never work.
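The shape coercion at the heart of this hack can be pictured with numpy, whose squeeze has the same semantics as tf.squeeze (this is only an analogy to illustrate the shapes involved, not the serving code):

```python
import numpy as np

# A "batch" of one jpeg string, shape (1,), as the prediction service sends it.
batch_of_one = np.array([b'jpeg-bytes'])
single = np.squeeze(batch_of_one)
print(single.shape)  # () -- a scalar, which is what the decode op expects

# With two images, the size-2 dimension cannot be squeezed away, so the
# graph would receive shape (2,) and the single-image decode op would fail.
batch_of_two = np.array([b'a', b'b'])
print(np.squeeze(batch_of_two).shape)  # (2,)
```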

The main change you need is the import statement, which connects the placeholder we've added to the existing input in the graph (you'll also see the code for changing the shape of the input):

Putting it all together, we get something like:

import tensorflow as tf


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            # This will be our input that accepts a batch of inputs
            image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
            # Force it to be a single input; will raise an error if we send a batch.
            coerced = tf.squeeze(image_bytes)
            # When we import the graph, we'll connect `coerced` to `DecodeJPGInput:0`
            input_map = {'DecodeJPGInput:0': coerced}

            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, input_map=input_map, name="")

            keys_placeholder = tf.placeholder(tf.string, shape=[None])

            inputs = {'image_bytes': image_bytes, 'key': keys_placeholder}

            keys = tf.identity(keys_placeholder)
            outputs = {
                'key': keys,
                'prediction': tf.get_default_graph().get_tensor_by_name('final_result:0')
            }

            # simple_save lives under tf.saved_model in TF 1.x
            tf.saved_model.simple_save(sess, output_dir, inputs, outputs)

model = Model(label_count)
model.export(export_dir)
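Once the SavedModel is exported, deployment follows the usual gcloud flow, ending with the predict command from the question. This is a sketch; the bucket name gs://my-bucket is a placeholder you would replace with your own:

```shell
# Copy the exported SavedModel directory to Cloud Storage (bucket is a placeholder).
gsutil cp -r ../tf_files/saved7 gs://my-bucket/saved7

# Create the model and a version pointing at the exported SavedModel.
gcloud ml-engine models create my_model_name
gcloud ml-engine versions create v1 \
    --model my_model_name \
    --origin gs://my-bucket/saved7

# Online prediction, as in the question.
gcloud ml-engine predict --model my_model_name --json-instances request.json
```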