Tags: linux, postgresql, python-3.x, systemctl, airflow-2.x
New DAGs, or changes to existing DAGs, are not showing up on the Airflow webserver and so are not usable in the application.
For example, say I add a new DAG to the dags folder:

$ airflow dags list

shows the new DAG. Querying the metadata database with

select dag_id from dag;

also shows the new DAG, so it is being picked up and written to the database. However, the dag_code table is not updated: the source code stored in dag_code stays stale. If I run

$ airflow db init

again, the change takes effect and everything works. So my setup is stable and usable only because I am hacking around the problem by re-running $ airflow db init after every change. Since that command does not touch existing data in the database, I can actually work this way. But I am concerned that, because things are not behaving as intended, this may be masking a deeper problem.
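For what it's worth, this is how I have been confirming that dag_code is stale. It assumes the default Airflow 2.x metadata schema (dag_code with fileloc, last_updated and source_code columns); my_new_dag.py is just a placeholder name:

# timestamp of the DB copy the webserver reads from ...
$ psql -d svc_air_analytics -c "SELECT fileloc, last_updated FROM dag_code WHERE fileloc LIKE '%my_new_dag%';"
# ... versus the modification time of the file on disk
$ stat -c '%y %n' /home/svc-air-analytics/airflow/dags/my_new_dag.py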
Any help would be appreciated. My system specs and Airflow setup are listed below.
Relevant parameters from airflow.cfg:
dags_folder = /home/svc-air-analytics/airflow/dags
base_log_folder = /home/svc-air-analytics/airflow/logs
...
# The executor class that airflow should use. Choices include
# SequentialExecutor, LocalExecutor, CeleryExecutor, DaskExecutor, KubernetesExecutor
executor = LocalExecutor
# The SqlAlchemy connection string to the metadata database.
# SqlAlchemy supports many different database engine, more information
# their website
sql_alchemy_conn = postgresql+psycopg2://svc-air-analytics:***@localhost:5432/svc_air_analytics
...
# after how much time (seconds) a new DAGs should be picked up from the filesystem
min_file_process_interval = 0
# How often (in seconds) to scan the DAGs directory for new files. Default to 5 minutes.
dag_dir_list_interval = 10
...
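As a sanity check that these are the values the CLI actually resolves, I run (the config subcommand exists in Airflow 2.x; both options live in the [scheduler] section there):

$ airflow config get-value scheduler dag_dir_list_interval
$ airflow config get-value scheduler min_file_process_interval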
I am running this on an AWS EC2 instance as the user svc-air-analytics. Key locations:
airflow.cfg location: /home/svc-air-analytics/airflow/airflow.cfg
dags location: /home/svc-air-analytics/airflow/dags/
Virtual environment: /home/env_svc_air_analytics
Environment file /etc/sysconfig/airflow (referenced by the scheduler unit below):
AIRFLOW_CONFIG=/home/svc-air-analytics/airflow/airflow.cfg
AIRFLOW_HOME=/home/svc-air-analytics/airflow
export PATH=$PATH:/home/svc-air-analytics/env_svc_air_analytics/bin/
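To see what an Airflow CLI invocation resolves once these variables are loaded, I source the file in a shell and inspect airflow info (this is only a shell-level check; it says nothing about what the systemd services themselves receive):

$ set -a; source /etc/sysconfig/airflow; set +a
$ airflow info   # check the reported airflow_home, dags_folder and sql_alchemy_conn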
/usr/lib/systemd/system/airflow-scheduler.service:

[Unit]
Description=Airflow scheduler daemon
After=network.target postgresql.service
Wants=postgresql.service
[Service]
EnvironmentFile=/etc/sysconfig/airflow
User=svc-air-analytics
Group=airflow
Type=simple
ExecStart=/home/svc-air-analytics/env_svc_air_analytics/bin/airflow scheduler
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
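After touching the unit or a DAG file, I typically reload and follow the scheduler's journal to see whether the file gets reprocessed (plain systemd/journalctl, nothing Airflow-specific):

$ sudo systemctl daemon-reload
$ sudo systemctl restart airflow-scheduler
$ journalctl -u airflow-scheduler -f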
The corresponding airflow-webserver service unit:
[Unit]
Description=Airflow webserver daemon
After=network.target postgresql.service
Wants=postgresql.service
[Service]
Environment="PATH=/home/svc-air-analytics/env_svc_air_analytics/bin:/home/svc-air-analytics/airflow/"
User=svc-air-analytics
Group=airflow
Type=simple
ExecStart=/home/svc-air-analytics/env_svc_air_analytics/bin/python /home/svc-air-analytics/env_svc_air_analytics/bin/airflow webserver --pid /run/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true
[Install]
WantedBy=multi-user.target
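Finally, since the scheduler unit loads /etc/sysconfig/airflow via EnvironmentFile while the webserver unit only sets PATH, this is how I compare what systemd actually hands to each service (standard systemctl properties, as far as I know):

$ systemctl show airflow-scheduler -p EnvironmentFiles -p Environment
$ systemctl show airflow-webserver -p EnvironmentFiles -p Environment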