"No such file or directory" error after submitting a training job

Ami*_*ati 5 google-cloud-ml

I ran:

gcloud beta ml jobs submit training ${JOB_NAME} --config config.yaml

About 5 minutes later, the job fails with this error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 232, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 228, in main
    run_training()
  File "/root/.local/lib/python2.7/site-packages/trainer/task.py", line 129, in run_training
    data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py", line 212, in read_data_sets
    with open(local_file, 'rb') as f:
IOError: [Errno 2] No such file or directory: 'gs://my-bucket/mnist/train/train-images.gz'

The odd thing is that, as far as I can tell, the file does exist at that URL.

Ami*_*ati 3

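This error usually indicates that you are using a multi-regional GCS bucket for your output. To avoid it, use a regional GCS bucket instead. Regional buckets provide the stronger consistency guarantees needed to avoid this kind of error.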
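
As a rough sketch of what that setup might look like with gsutil (the bucket name and region below are placeholders I made up, not values from the question, so substitute your own):

```shell
# Hypothetical values for illustration only.
BUCKET_NAME="my-regional-bucket"
REGION="us-central1"

# Create a bucket pinned to a single region via the -l (location) flag.
gsutil mb -l "${REGION}" "gs://${BUCKET_NAME}"

# Print the bucket's metadata, which includes its location,
# to confirm it was created where you expected.
gsutil ls -L -b "gs://${BUCKET_NAME}"
```

A bucket's location cannot be changed after creation, so if your current bucket is multi-regional you would need to create a new one and copy the data over, then point the job's paths at the new bucket.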

For more information on setting up GCS buckets correctly for Cloud ML, see the Cloud ML documentation.