I am using the Airflow CLI's backfill command to run some backfill jobs manually.
airflow backfill mydag -i -s 2018-01-11T16-00-00 -e 2018-01-31T23-00-00 --reset_dagruns --rerun_failed_tasks
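The DAG interval is hourly and it has around 40 tasks, so a backfill like this takes more than a day to finish. I need it to run unattended. However, I noticed that if even a single task fails in one of the runs within the backfill interval, the entire backfill job stops with the following exception, and I have to restart it manually.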
Traceback (most recent call last):
File "/home/ubuntu/airflow/bin/airflow", line 4, in <module>
__import__('pkg_resources').run_script('apache-airflow==1.10.0', 'airflow')
File "/home/ubuntu/airflow/lib/python3.5/site-packages/pkg_resources/__init__.py", line 719, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/pkg_resources/__init__.py", line 1504, in run_script
exec(code, namespace, namespace)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/EGG-INFO/scripts/airflow", line 32, in <module>
args.func(args)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/bin/cli.py", line 217, in backfill
rerun_failed_tasks=args.rerun_failed_tasks,
File "/home/ubuntu/airflow/lib/python3.5/site-packages/apache_airflow-1.10.0-py3.5.egg/airflow/models.py", line 4105, in …

I tried using dict.fromkeys([1,2,3], set()). This creates the dictionary, but when I add a value to any one of the sets, all of the sets are updated!
>>> d = dict.fromkeys([1, 2, 3], set())
>>> d
{1: set(), 2: set(), 3: set()}
>>> d[1].add('a')
>>> d
{1: {'a'}, 2: {'a'}, 3: {'a'}}
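It seems that all three values of the dictionary refer to the same set. I want to initialize all the values of the dictionary to empty sets so that I can later perform some operations on those sets in a loop, based on the keys.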
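A minimal sketch of what appears to be going on, plus one possible way around it. It assumes a dict comprehension is acceptable in place of dict.fromkeys; the names keys, shared, separate and all_same_object are only illustrative and do not come from the question.

```python
keys = [1, 2, 3]

# dict.fromkeys evaluates set() once, so every key maps to the same object.
shared = dict.fromkeys(keys, set())
shared[1].add('a')
all_same_object = all(shared[k] is shared[1] for k in keys)

# A dict comprehension evaluates set() once per key, giving independent sets.
separate = {k: set() for k in keys}
separate[1].add('a')

print(all_same_object)  # True
print(separate)         # {1: {'a'}, 2: set(), 3: set()}
```

The identity check with `is` shows that the shared dictionary holds a single set under three keys, which matches the behaviour seen in the session above. With the comprehension, adding to one set leaves the others empty.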