Does transfer learning in TensorFlow require a predefined image size?

add*_*lor 2 image-size tensorflow tensorflow-datasets transfer-learning

I plan to use a pre-trained model such as faster_rcnn_resnet101_pets for object detection in a TensorFlow environment, as described here.

I have collected several images for the training and test sets. They all have different sizes. Do I need to resize them to a common size?

faster_rcnn_resnet101_pets uses a ResNet with an input size of 224x224x3.

Does this mean I have to resize all my images before sending them for training, or does TF take care of it automatically?

python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_resnet101_pets.config

In general, is it good practice to have all images the same size?

dan*_*ang 5

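No, you do not need to resize your input images to a fixed shape yourself. The TensorFlow Object Detection API has a preprocessing step that resizes all input images. Below is the function defined for that preprocessing step; it takes an image_resizer_fn, which corresponds to the field named image_resizer in the config file.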

def transform_input_data(tensor_dict,
                         model_preprocess_fn,
                         image_resizer_fn,
                         num_classes,
                         data_augmentation_fn=None,
                         merge_multiple_boxes=False,
                         retain_original_image=False,
                         use_multiclass_scores=False,
                         use_bfloat16=False):
  """A single function that is responsible for all input data transformations.

  Data transformation functions are applied in the following order.
  1. If key fields.InputDataFields.image_additional_channels is present in
     tensor_dict, the additional channels will be merged into
     fields.InputDataFields.image.
  2. data_augmentation_fn (optional): applied on tensor_dict.
  3. model_preprocess_fn: applied only on image tensor in tensor_dict.
  4. image_resizer_fn: applied on original image and instance mask tensor in
     tensor_dict.
  5. one_hot_encoding: applied to classes tensor in tensor_dict.
  6. merge_multiple_boxes (optional): when groundtruth boxes are exactly the
     same they can be merged into a single box with an associated k-hot class
     label.

According to the proto file, you can choose among 4 different image resizers, namely:

  1. keep_aspect_ratio_resizer
  2. fixed_shape_resizer
  3. identity_resizer
  4. conditional_shape_resizer

Here is a sample config file for the model faster_rcnn_resnet101_pets, in which all images are resized with min_dimension = 600 and max_dimension = 1024:

model {
  faster_rcnn {
    num_classes: 37
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
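To make the effect of keep_aspect_ratio_resizer concrete, here is a minimal sketch in plain Python of how the output size could be derived from min_dimension and max_dimension. The function name keep_aspect_ratio_size is hypothetical and not part of the API. The logic is an approximation of the resizer's behavior as I understand it: scale the shorter side up or down to min_dimension, unless that pushes the longer side past max_dimension, in which case scale the longer side to max_dimension instead.

```python
def keep_aspect_ratio_size(height, width, min_dimension=600, max_dimension=1024):
    """Approximate the (height, width) produced by a keep-aspect-ratio resize.

    Hypothetical helper for illustration only; it is not part of the
    TensorFlow Object Detection API.
    """
    # First try: scale so that the shorter side equals min_dimension.
    scale = min_dimension / min(height, width)
    new_size = (round(height * scale), round(width * scale))

    # If the longer side now exceeds max_dimension, scale so that the
    # longer side equals max_dimension instead.
    if max(new_size) > max_dimension:
        scale = max_dimension / max(height, width)
        new_size = (round(height * scale), round(width * scale))

    return new_size


if __name__ == "__main__":
    for h, w in [(480, 640), (500, 2000), (1200, 1600)]:
        print((h, w), "->", keep_aspect_ratio_size(h, w))
```

With the values from the config above, a 480x640 image would come out as 600x800, while a very wide 500x2000 image would be capped by max_dimension and come out as 256x1024. Images of different original sizes therefore still end up with different shapes, but within a bounded range.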

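In practice, the shape of the resized images has a large impact on the trade-off between detection speed and accuracy. Although there is no specific requirement on the input image size, it is better for the smaller dimension of every image to be above some reasonable value so that the convolution operations work properly.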