I am trying to train a network, specifically ssd_mobilenet_v1, using the data_augmentation_options in the .config file, but as soon as I activate the random_adjust_brightness option I quickly get the error message pasted below (I activate the option after step 110000).
I tried reducing the default value:
optional float max_delta=1 [default=0.2];
but the result is the same.
Any idea why? The images are RGB from png files (from the Bosch Small Traffic Lights dataset).
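For reference, this is roughly how the option is enabled in my pipeline config (a sketch; the value shown is just the proto default quoted above, and the rest of train_config is elided):

train_config {
  ...
  data_augmentation_options {
    random_adjust_brightness {
      max_delta: 0.2  # proto default; this is the value I tried lowering
    }
  }
}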
INFO:tensorflow:global step 110011: loss = 22.7990 (0.357 sec/step)
INFO:tensorflow:global step 110012: loss = 47.8811 (0.401 sec/step)
2017-11-16 11:02:29.114785: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
[[Node: CheckNumerics = CheckNumerics[T=DT_FLOAT, message="LossTensor is inf or nan.", _device="/job:localhost/replica:0/task:0/device:CPU:0"](total_loss)]]
2017-11-16 11:02:29.114895: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: LossTensor is inf or nan. : Tensor had NaN values
[[Node: CheckNumerics = CheckNumerics[T=DT_FLOAT, message="LossTensor is inf or nan.", _device="/job:localhost/replica:0/task:0/device:CPU:0"](total_loss)]]
2017-11-16 11:02:29.114969: …

I am trying to use the data augmentation options of the Object Detection API, in particular random_image_scale.
Digging around, I found the function that implements it (pasted below). Am I missing something, or is the box ground truth simply not handled here? I have looked around and have not found anything. If the image is scaled without the ground truth being adjusted accordingly, that would mess up the model being trained, wouldn't it?
Please let me know if I am missing something; otherwise I should avoid using this feature to train my network.
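To make the concern concrete before quoting the function, here is a purely illustrative sketch (the helper and the absolute-pixel box format are my own assumptions, not code from the API): if the boxes lived in absolute pixel coordinates, the same scale factor would have to be applied to them.

import tensorflow as tf

def scale_image_and_boxes(image, boxes_abs, scale):
  """Hypothetical helper: resize `image` by `scale` and rescale absolute-pixel
  boxes [ymin, xmin, ymax, xmax] by the same factor so they stay aligned."""
  height = tf.cast(tf.shape(image)[0], tf.float32)
  width = tf.cast(tf.shape(image)[1], tf.float32)
  new_size = tf.cast(tf.stack([height * scale, width * scale]), tf.int32)
  image = tf.image.resize_images(image, new_size)
  boxes_abs = boxes_abs * scale  # the boxes must follow the image
  return image, boxes_abs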
The file is /object_detection/core/preprocessor.py:
def random_image_scale(image,
                       masks=None,
                       min_scale_ratio=0.5,
                       max_scale_ratio=2.0,
                       seed=None):
  """Scales the image size.

  Args:
    image: rank 3 float32 tensor contains 1 image -> [height, width, channels].
    masks: (optional) rank 3 float32 tensor containing masks with
      size [height, width, num_masks]. The value is set to None if there are no
      masks.
    min_scale_ratio: minimum scaling ratio.
    max_scale_ratio: maximum scaling ratio.
    seed: random seed.

  Returns:
    image: image which is the same rank as input image.
    masks: If masks is not none, resized …

I cannot match the inference times Google reports for the models released in the model zoo. Specifically, I am trying their faster_rcnn_resnet101_coco model, whose reported inference time is 106 ms on a Titan X GPU.
My serving system runs TF 1.4 inside a container built from the Dockerfile published by Google. My client is modeled after the Inception client published by Google.
I am running Ubuntu 14.04, TF 1.4 and one Titan X. My total inference time is about 3x worse than what Google reports, ~330 ms. Building the tensor proto takes ~150 ms and the Predict call takes ~180 ms. My saved_model.pb comes straight from the tar file downloaded from the model zoo. Is there something I am missing? What steps can I take to reduce the inference time?
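For what it is worth, this is roughly how I am timing the client side (a sketch modeled on the old gRPC Inception example client; host, port, model name and the 'inputs' key are assumptions and may differ for your export):

import time
import numpy as np
import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2

channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

image = np.zeros((1, 600, 600, 3), dtype=np.uint8)  # placeholder batch of one image

start = time.time()
request = predict_pb2.PredictRequest()
request.model_spec.name = 'faster_rcnn_resnet101_coco'  # assumed serving name
request.model_spec.signature_name = 'serving_default'
request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto(image, shape=list(image.shape)))
print('tensor proto: %.0f ms' % ((time.time() - start) * 1000))  # ~150 ms for me

start = time.time()
result = stub.Predict(request, 30.0)  # 30 second timeout
print('Predict:      %.0f ms' % ((time.time() - start) * 1000))  # ~180 ms for me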
object-detection tensorflow tensorflow-serving tensorflow-gpu object-detection-api
Training with the TensorFlow Object Detection API works flawlessly, but when I try to evaluate the job with eval.py using the following command,
python3 eval.py --logtostderr --checkpoint_dir=training/ --eval_dir=eval/ --pipeline_config_path=training/faster_rcnn_inception_resnet_v2_atrous_oid.config
I get the following error,
paperspace@psnu680y1:~/models-master/research/object_detection$ python3 eval.py --logtostderr --checkpoint_dir = training/ --eval_dir=eval/ --pipeline_config_path=training/faster_rcnn_inception_resnet_v2_atrous_oid.config
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
Traceback (most recent call last):
File "eval.py", line 133, in <module>
tf.app.run()
File "/home/paperspace/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "eval.py", line 129, …Run Code Online (Sandbox Code Playgroud) 我使用TensorFlow对象检测API运行SSD MobileNetV2,运行以下代码后
(keras-cpu-exp) D:\Pycharm Projects\CPU\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v2_coco.config
I got the error
TypeError: pred must be a Tensor, or Python bool, or 1 or 0. Found: None
(keras-cpu-exp) D:\Pycharm Projects\CPU\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v2_coco.config
C:\Users\Reagan\AppData\Local\Continuum\Anaconda3\envs\keras-cpu-exp\lib\site-packages\h5py\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
WARNING:tensorflow:From C:\Users\Reagan\AppData\Local\Continuum\Anaconda3\envs\keras-cpu-exp\lib\site-packages\tensorflow\contrib\learn\python\learn\datasets\base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Traceback (most recent …

Is there a way to specify hyperparameter optimization, such as Hyperopt, in the Object Detection API config file to fine-tune a model?
object-detection hyperparameters tensorflow object-detection-api
I am trying to use the Faster R-CNN model from the object detection API, from here: https://github.com/tensorflow/models/tree/master/research/object_detection
I am quite confused by the following config parameters (I have read the original files, and I have also tried modifying them and testing):
first_stage_anchor_generator {
  grid_anchor_generator {
    scales: [0.25, 0.5, 1.0, 2.0]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}
I am fairly new to this field, and I would appreciate it if someone could explain these parameters to me.
My question is how I should adjust the above (or other) parameters to account for the fact that I have very small, fixed-size objects to detect in large images.
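For example, this is the kind of modification I have been experimenting with (values are purely illustrative; smaller scales are meant to bias the anchors toward small boxes, assuming the base anchor size is left at its default):

first_stage_anchor_generator {
  grid_anchor_generator {
    # illustrative only: add smaller scales for small objects,
    # keep the 16-pixel feature-map stride from above
    scales: [0.0625, 0.125, 0.25, 0.5]
    aspect_ratios: [0.5, 1.0, 2.0]
    height_stride: 16
    width_stride: 16
  }
}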
Thanks
I am working on an object detection problem with TensorFlow's Object Detection API, specifically a facessd model trained on the Open Images dataset. Can anyone clarify what
anchor_strides must be a list with the same length as self.box_specs
means? I am looking through the source code, but I cannot find where self._box_specs is defined. I assume these are the bounding boxes that eventually get drawn during inference. I have tried resizing the images, but nothing changed.
Every time I run the model, I get the following error and traceback:
Traceback (most recent call last):
File "object_detection/model_main.py", line 109, in <module>
tf.app.run()
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "object_detection/model_main.py", line 105, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 471, in train_and_evaluate
return executor.run()
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 610, in run
return self.run_local()
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/training.py", line 711, in run_local
saving_listeners=saving_listeners)
File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 354, in train
loss = self._train_model(input_fn, hooks, saving_listeners) …

I am using the TensorFlow Object Detection API and I would like to be able to edit the config file dynamically in Python, something like the one below. I thought about using the protocol buffer library in Python, but I don't know how to go about it.
model {
  ssd {
    num_classes: 1
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    feature_extractor {
      type: "ssd_inception_v2"
      depth_multiplier: 1.0
      min_depth: 16
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.99999989895e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.0299999993294
          }
        }
        activation: RELU_6
        batch_norm {
          decay: 0.999700009823
          center: true
          scale: true
          epsilon: 0.0010000000475
          train: true
        }
      }
    ...
    ...
}
Is there a simple/easy way to change the value of a specific field, such as the height in image_resizer -> fixed_shape_resizer, from 300 to 500, and write it back to the file with the modified value without changing anything else?
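The direction I am thinking of is something like the sketch below, using the API's own pipeline_pb2 proto with text_format (untested; the path is a placeholder):

from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

config_path = 'path/to/pipeline.config'  # placeholder

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with open(config_path, 'r') as f:
  text_format.Merge(f.read(), pipeline_config)

# Change only the resizer dimensions, leaving everything else untouched.
resizer = pipeline_config.model.ssd.image_resizer.fixed_shape_resizer
resizer.height = 500
resizer.width = 500

with open(config_path, 'w') as f:
  f.write(text_format.MessageToString(pipeline_config))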
Edit: Although @DmytroPrylipko …
I want to evaluate a custom-trained TensorFlow object detection model on a new test set using Google Cloud.
I obtained the initial checkpoint from: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
I know that the TensorFlow Object Detection API allows me to run training and evaluation simultaneously using:
https://github.com/tensorflow/models/blob/master/research/object_detection/model_main.py
To start such a job, I submit the following ml-engine job:
gcloud ml-engine jobs submit training [JOBNAME] \
    --runtime-version 1.9 \
    --job-dir=gs://path_to_bucket/model-dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    -- \
    --model_dir=gs://path_to_bucket/model_dir \
    --pipeline_config_path=gs://path_to_bucket/data/model.config
However, after successfully training the transferred model, I would like to compute performance metrics such as COCO mAP (http://cocodataset.org/#detection-eval) or PASCAL mAP (http://host.robots.ox.ac.uk/pascal/VOC/pubs/everingham10.pdf) on a new test dataset that has not been used before (neither during training nor during evaluation).
I have seen that there is a possible flag in model_main.py:
flags.DEFINE_string(
    'checkpoint_dir', None, 'Path to directory holding a checkpoint. If '
    '`checkpoint_dir` is provided, this binary operates in eval-only mode, '
    'writing resulting metrics to `model_dir`.') …
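Based on that flag, I assume an eval-only run would look roughly like this (paths are placeholders; it assumes the eval_input_reader in model.config has been pointed at the new test set, and that a run_once flag is available in this version of model_main.py):

python object_detection/model_main.py \
    --pipeline_config_path=gs://path_to_bucket/data/model.config \
    --model_dir=gs://path_to_bucket/eval_on_test_set \
    --checkpoint_dir=gs://path_to_bucket/model_dir \
    --run_once=True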