tf object detection api - extract a feature vector for each detected bbox

Question

dot*_*nnn 8 object-detection tensorflow object-detection-api tensorflow-slim

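I'm using the Tensorflow Object Detection API and working with the pretrained ssd-mobilenet model. Is there a way to extract the last global pooling layer of the mobilenet as a feature vector for each bbox? I can't find the name of the operation that holds this information.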

I've been able to extract the detection labels and bboxes based on the example on GitHub:

image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
# Each box represents a part of the image where a particular object was detected.
detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
# Each score represents the level of confidence for the corresponding object.
# The score is shown on the result image, together with the class label.
detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
num_detections = detection_graph.get_tensor_by_name('num_detections:0')
# TODO: also add the feature vector output

# Actual detection.
(boxes, scores, classes, num) = sess.run(
    [detection_boxes, detection_scores, detection_classes, num_detections],
    feed_dict={image_tensor: image_np_expanded})


Answer 1

小智 5

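As Steve said, the feature vectors in Faster RCNN in the object detection api appear to be dropped after the SecondStageBoxPredictor. I was able to thread them through the network by modifying core/box_predictor.py and meta_architectures/faster_rcnn_meta_arch.py.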

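The crux is that the non-max suppression code actually has a parameter for additional_fields (see core/post_processing.py:176 on master). You can pass a dictionary of tensors that have the same shape as the boxes and scores in the first two dimensions, and the function will return them filtered in the same way as the boxes and scores. Here is a diff against master of the changes I made: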

https://gist.github.com/donniet/c95d19e00ff9abeb786415b3a9348e62
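To make the additional_fields idea concrete, here is a minimal NumPy sketch of the same principle: whatever indices survive suppression are applied to every extra array as well. It is not the API's implementation (which is batched and multiclass); the function names `iou_one_to_many` and `nms_with_fields`, the threshold, and the toy data are all assumptions made up for illustration.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one [ymin, xmin, ymax, xmax] box and an (N, 4) array."""
    ymin = np.maximum(box[0], boxes[:, 0])
    xmin = np.maximum(box[1], boxes[:, 1])
    ymax = np.minimum(box[2], boxes[:, 2])
    xmax = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms_with_fields(boxes, scores, additional_fields, iou_thresh=0.5):
    """Greedy single-class NMS (hypothetical helper) that filters every
    extra array with the same kept indices as the boxes and scores."""
    order = np.argsort(-scores)  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        rest = order[1:]
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_thresh]  # drop boxes overlapping the winner
    keep = np.array(keep)
    filtered = {k: v[keep] for k, v in additional_fields.items()}
    return boxes[keep], scores[keep], filtered


# Toy data: box 1 overlaps box 0 heavily, box 2 is far away.
boxes = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 0.9, 0.9],
                  [2.0, 2.0, 3.0, 3.0]])
scores = np.array([0.9, 0.8, 0.7])
features = np.arange(12, dtype=np.float32).reshape(3, 4)

kept_boxes, kept_scores, kept = nms_with_fields(
    boxes, scores, {'image_features': features})
print(kept['image_features'])
```

Because the features ride along through the same index selection, each surviving row stays aligned with its box and score, which is the property the diff relies on.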

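I then had to rebuild the network and load the variables from a checkpoint like this, instead of loading the frozen graph (note: I downloaded the faster rcnn checkpoint from here: http://download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_2018_01_28.tar.gz)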

import sys
import os
import numpy as np

from object_detection.builders import model_builder
from object_detection.protos import pipeline_pb2

from google.protobuf import text_format
import tensorflow as tf

# load the pipeline structure from the config file
with open('object_detection/samples/configs/faster_rcnn_resnet101_coco.config', 'r') as content_file:
    content = content_file.read()

# build the model with model_builder
pipeline_proto = pipeline_pb2.TrainEvalPipelineConfig()
text_format.Merge(content, pipeline_proto)
model = model_builder.build(pipeline_proto.model, is_training=False)

# construct a network using the model
image_placeholder = tf.placeholder(shape=(None,None,3), dtype=tf.uint8, name='input')
original_image = tf.expand_dims(image_placeholder, 0)
preprocessed_image, true_image_shapes = model.preprocess(tf.to_float(original_image))
prediction_dict = model.predict(preprocessed_image, true_image_shapes)
detections = model.postprocess(prediction_dict, true_image_shapes)

# create an input network to read a file
filename_placeholder = tf.placeholder(name='file_name', dtype=tf.string)
image_file = tf.read_file(filename_placeholder)
image_data = tf.image.decode_image(image_file)

# load the variables from a checkpoint
init_saver = tf.train.Saver()
sess = tf.Session()
init_saver.restore(sess, 'object_detection/faster_rcnn_resnet101_coco_11_06_2017/model.ckpt')

# get the image data
blob = sess.run(image_data, feed_dict={filename_placeholder:'image.jpeg'})
# process the inference
output = sess.run(detections, feed_dict={image_placeholder:blob})

# print the shape of image_features (this key is only present with the changes from the gist above)
print(output['image_features'].shape)


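Warning: I have not run the tensorflow unit tests against the changes I made, so treat them as a demonstration only. More testing is needed to make sure they don't break something else in the object detection api.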

Answer 2

Ste*_*ley 3

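Admittedly this is not a perfect answer, but I have done a lot of digging into Faster-RCNN with the TF-OD API and have made some progress on this problem. I'll explain what I learned from digging into the Faster-RCNN version, and hopefully you can translate it to SSD. Your best bet is to dig into the graph on TensorBoard and sift through the tensor names in the detection graph.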
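As a complement to browsing TensorBoard, the names can also be sifted in code. The sketch below is a plain substring filter; `find_names` is a hypothetical helper, and the example names are made up and are not real op names from any model. With a loaded TF 1.x graph the real list would come from `[op.name for op in detection_graph.get_operations()]`.

```python
def find_names(names, keywords):
    """Return the names containing any of the keywords (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [n for n in names if any(k in n.lower() for k in lowered)]


# With a loaded TF 1.x graph the names would come from:
#   names = [op.name for op in detection_graph.get_operations()]
# The list below is invented purely for illustration.
example_names = [
    'image_tensor',
    'detection_boxes',
    'SomeScope/AvgPool',
    'SomeScope/feature_map',
]

matches = find_names(example_names, ['pool', 'feature'])
print(matches)
```

Any promising name can then be fetched with `detection_graph.get_tensor_by_name(name + ':0')`, as in the question's code.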

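First, there is not always a simple one-to-one correspondence between features and boxes/scores. That is, there is no single tensor you can pull from the network that provides this, at least not by default.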

Here is code for getting features out of a Faster-RCNN network:

https://gist.github.com/markdtw/02ece6b90e75832bd44787c03a664e8d

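Although this provides what looks like feature vectors, you can see that a few other people have run into trouble using this solution. The fundamental problem is that the feature vectors are extracted before the SecondStagePostprocessor, which performs several operations before the detection_boxes tensor and similar tensors are created.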

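Before the SecondStagePostprocessor, the class scores and boxes are created, and the feature vectors are left behind, never to be seen again. Inside the postprocessor there is a multiclass NMS stage and a sort stage. The end result has MaxProposalsFromSecondStage entries, whereas the feature vectors are populated with shape [MaxProposalsFromFirstStage, NumberOfFeatureVectors]. So there is both a decimation and a sort operation, which makes it hard to pair the final outputs with the feature-vector indices.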

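My current solution is to pull the feature vectors and boxes from before the second stage and do the rest by hand. There is no doubt a better solution than this, but it's hard to follow the graph and find the right tensors for the sort operation.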
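One possible way to do "the rest by hand" is to match each final detection box back to the pre-postprocessing box it overlaps most, and take that box's feature vector. This is only a sketch under assumptions: it supposes both sets of boxes are in the same coordinate frame and that every final box is (nearly) a copy of one of the earlier boxes. The helper names `pairwise_iou` and `features_for_detections` and the toy arrays are hypothetical.

```python
import numpy as np


def pairwise_iou(a, b):
    """IoU matrix between (M, 4) and (N, 4) boxes in [ymin, xmin, ymax, xmax]."""
    top_left = np.maximum(a[:, None, :2], b[None, :, :2])
    bottom_right = np.minimum(a[:, None, 2:], b[None, :, 2:])
    hw = np.clip(bottom_right - top_left, 0, None)
    inter = hw[..., 0] * hw[..., 1]
    area_a = np.prod(a[:, 2:] - a[:, :2], axis=1)
    area_b = np.prod(b[:, 2:] - b[:, :2], axis=1)
    return inter / (area_a[:, None] + area_b[None, :] - inter)


def features_for_detections(final_boxes, raw_boxes, raw_features):
    """For each final box, pick the feature of the best-overlapping raw box."""
    idx = pairwise_iou(final_boxes, raw_boxes).argmax(axis=1)
    return raw_features[idx], idx


# Toy stand-ins for tensors fetched before the postprocessor.
raw_boxes = np.array([[0.0, 0.0, 1.0, 1.0],
                      [0.0, 0.0, 0.9, 0.9],
                      [2.0, 2.0, 3.0, 3.0]])
raw_features = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
# Pretend NMS kept boxes 2 and 0, in that (re-sorted) order.
final_boxes = raw_boxes[[2, 0]]

matched, idx = features_for_detections(final_boxes, raw_boxes, raw_features)
print(idx, matched)
```

Matching on overlap sidesteps having to trace the sort and NMS indices through the graph, at the cost of an O(M x N) comparison and possible ambiguity when several earlier boxes are near-duplicates.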

I hope this helps! Sorry I can't give you an end-to-end solution, but I hope this gets you past your current obstacle.
