TensorFlow: how to skip corrupted data

0x2*_*207 6 tensorflow

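I am playing with TensorFlow 1.0. My input data is a large number of JPEG images. Some of them are corrupted for various reasons, and I just want to skip them on input.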

The image-loading part of the graph looks like this:

filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once(filename_list), capacity=1000, num_epochs=1)
whole_file_reader = tf.WholeFileReader()
_, image_binary = whole_file_reader.read(filename_queue)
image_tensor = tf.cast(tf.image.decode_jpeg(image_binary), tf.float32)

The part that runs the model is the usual one:

with sv.managed_session() as sess:
        sess.run(init_local)
        sess.run(init_all)

        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord, sess=sess)

        try:
                while not coord.should_stop() and not sv.should_stop():
                        sess.run(accumulator)
        except tf.errors.OutOfRangeError:
                print('Done training -- epoch limit reached')
                #
        except Exception as e:
                # Report exceptions to the coordinator.
                coord.request_stop(e)
        finally:
                coord.request_stop()

        coord.request_stop()
        coord.join(threads)

When I run this code I see the following, and I cannot figure out how to catch this exception properly.

Traceback (most recent call last):
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1022, in _do_call
    return fn(*args)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1004, in _run_fn
    status, run_metadata)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid JPEG data, size 0
         [[Node: DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=0, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](ReaderReadV2:1)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "calculate_mean.py", line 67, in <module>
    coord.join(threads)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 973, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/training/supervisor.py", line 801, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/training/coordinator.py", line 386, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
    raise value
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/training/queue_runner_impl.py", line 234, in _run
    sess.run(enqueue_op)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 965, in _run
    feed_dict_string, options, run_metadata)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1015, in _do_run
    target_list, options, run_metadata)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1035, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid JPEG data, size 0
         [[Node: DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=0, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](ReaderReadV2:1)]]

Caused by op 'DecodeJpeg', defined at:
  File "calculate_mean.py", line 19, in <module>
    image_tensor = tf.cast(tf.image.decode_jpeg(image_binary), tf.float32)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/ops/gen_image_ops.py", line 345, in decode_jpeg
    dct_method=dct_method, name=name)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2395, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/matwey/venv/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1264, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Invalid JPEG data, size 0
         [[Node: DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=0, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](ReaderReadV2:1)]]

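Unfortunately, the answer given in "Tensorflow skipping non-existing or corrupted files" does not work for me. In my case the exception does seem to be raised, but only at coord.join(threads), by which point it is too late.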

Rob*_*obR 4

Sorry for the very late reply. The answer may be contained in your error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid JPEG data, size 0 [[Node: DecodeJpeg = DecodeJpeg[acceptable_fraction=1, channels=0, dct_method="", fancy_upscaling=true, ratio=1, try_recover_truncated=false, _device="/job:localhost/replica:0/task:0/cpu:0"](ReaderReadV2:1)]]

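JPEG files can be corrupted for any number of reasons. However, you are using the default settings of tf.image.decode_jpeg, which require a perfect decode. Instead, you may want to tolerate some errors by setting the parameters try_recover_truncated=True and acceptable_fraction=0.5 (or some other value). See this link for more information.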
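
One caveat: the error in the question reports "size 0", which suggests an empty file, and relaxed decoder settings are unlikely to help when there is no data at all. A different workaround, not part of the answer above, is to drop obviously broken files before they ever reach the filename queue. The sketch below uses only the Python standard library and checks that a file is non-empty, starts with the JPEG start-of-image marker (FF D8), and ends with the end-of-image marker (FF D9). The helper names looks_like_complete_jpeg and filter_jpeg_paths are hypothetical, and the check is a heuristic: it does not prove that a file will decode, and it rejects otherwise valid files that carry trailing bytes after the end-of-image marker.

```python
import os

JPEG_SOI = b"\xff\xd8"  # start-of-image marker, first two bytes of a JPEG
JPEG_EOI = b"\xff\xd9"  # end-of-image marker, normally the last two bytes


def looks_like_complete_jpeg(path):
    """Heuristic check (hypothetical helper): non-empty, SOI at start, EOI at end."""
    try:
        # Empty or tiny files cannot hold both markers.
        if os.path.getsize(path) < len(JPEG_SOI) + len(JPEG_EOI):
            return False
        with open(path, "rb") as f:
            head = f.read(2)
            f.seek(-2, os.SEEK_END)
            tail = f.read(2)
    except OSError:
        # Missing or unreadable files are treated as broken.
        return False
    return head == JPEG_SOI and tail == JPEG_EOI


def filter_jpeg_paths(paths):
    """Keep only the paths that pass the heuristic, preserving order."""
    return [p for p in paths if looks_like_complete_jpeg(p)]
```

Assuming the file pattern is expanded in Python first (for example with glob.glob), the filtered list can be passed directly to tf.train.string_input_producer in place of the tf.train.match_filenames_once(...) call, so zero-byte and truncated files never reach DecodeJpeg.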