Airflow fails under systemd because of gunicorn

Lud*_*ses 7 gunicorn systemd airflow

I cannot start the Airflow webserver with systemd, even though it starts and runs fine outside systemd, like this:

export AIRFLOW_HOME=/path/to/my/airflow/home ; airflow webserver -p 8080

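The systemd log leads me to believe the problem comes from gunicorn, even though gunicorn starts without trouble when I run the command above (i.e., the problem occurs only under systemd). I have configured the following systemd files according to the Airflow documentation (running Ubuntu 16).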

/etc/default/airflow

AIRFLOW_HOME=/path/to/my/airflow/home
SCHEDULER_RUNS=5

/lib/systemd/system/airflow-webserver.service

[Unit]
Description=Airflow webserver daemon   
After=network.target

[Service]
EnvironmentFile=/etc/default/airflow
User=ubuntu
Group=ubuntu
Type=simple
ExecStart=/bin/bash -c "export AIRFLOW_HOME=/path/to/my/airflow/home ; airflow webserver -p 8080 "

Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target

/etc/tmpfiles.d/airflow.conf

D /run/airflow 0755 airflow airflow

When I start the service with systemctl, I get the following error.

systemctl start airflow-webserver.service

Jul 15 22:41:27 ip-172-31-19-64 systemd[1]: Started Airflow webserver daemon.
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,555] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/Grammar.txt
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,592] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
Jul 15 22:41:27 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:27,729] {__init__.py:45} INFO - Using executor SequentialExecutor
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   ____________       _____________
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:  ____    |__( )_________  __/__  /________      __
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: [2018-07-15 22:41:28,042] {models.py:189} INFO - Filling up the DagBag from /path/to/my/airflow/home/dags
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: /home/ubuntu/.local/lib/python3.5/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cach
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   .format(x=modname), ExtDeprecationWarning
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Running the Gunicorn Server with:
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Workers: 4 sync
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Host: 0.0.0.0:8080
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Timeout: 120
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Logfiles: - -
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: =================================================================
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: Traceback (most recent call last):
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/local/bin/airflow", line 27, in <module>
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     args.func(args)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/local/lib/python3.5/dist-packages/airflow/bin/cli.py", line 788, in webserver
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     gunicorn_master_proc = subprocess.Popen(run_args)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/lib/python3.5/subprocess.py", line 947, in __init__
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     restore_signals, start_new_session)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:   File "/usr/lib/python3.5/subprocess.py", line 1551, in _execute_child
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]:     raise child_exception_type(errno_num, err_msg)
Jul 15 22:41:28 ip-172-31-19-64 bash[31494]: FileNotFoundError: [Errno 2] No such file or directory: 'gunicorn'
Jul 15 22:41:28 ip-172-31-19-64 systemd[1]: airflow-webserver.service: Main process exited, code=exited, status=1/FAILURE

Is there some configuration I need to do to make gunicorn work with systemd?

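Edit: Following a suggestion that this was a permissions issue, I installed gunicorn with sudo apt-get install gunicorn. On re-running systemctl I got the following error: "Error: No module named airflow.www.gunicorn_config". I assumed this was because the gunicorn I had just installed did not match the gunicorn my ubuntu user uses to run Airflow, so I replaced the gunicorn in /usr/bin/ with the one my ubuntu user uses. This hack is probably not the best way to fix it, but afterwards I was able to run Airflow through systemd successfully.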

小智 5

I had the same problem on Ubuntu 18.04 LTS with Apache Airflow 1.10.1 installed in a virtual environment under /srv/airflow. After a lot of trial and error, I ended up with this working solution.

My airflow-webserver.service file:

[Unit]
Description=Airflow webserver daemon
After=network.target

[Service]
Environment="PATH=/srv/airflow/bin"
Environment="AIRFLOW_HOME=/srv/airflow"
User=airflow
Group=airflow
Type=simple
ExecStart=/srv/airflow/bin/airflow webserver --pid /srv/airflow/webserver.pid
Restart=on-failure
RestartSec=5s
PrivateTmp=true

[Install]
WantedBy=multi-user.target
Run Code Online (Sandbox Code Playgroud)
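A likely reason this unit works where the one in the question did not: the traceback shows subprocess.Popen failing with "No such file or directory: 'gunicorn'", which suggests that airflow webserver launches gunicorn by its bare name and so depends on the service's PATH. The sketch below imitates that lookup. The temporary directory and the stub script are hypothetical stand-ins for a virtualenv's bin/ directory and the real gunicorn, not part of the original setup.

```shell
#!/bin/sh
# Throwaway directory standing in for a virtualenv's bin/ (hypothetical).
fake_bin=$(mktemp -d)
printf '#!/bin/sh\necho "stub gunicorn"\n' > "$fake_bin/gunicorn"
chmod +x "$fake_bin/gunicorn"

# Lookup with a PATH that does not contain the directory: nothing is found,
# which mirrors the FileNotFoundError in the journal output.
without_path=$( (PATH=/nonexistent; command -v gunicorn) || echo "not found" )

# Lookup with the directory on PATH, which is what a line such as
# Environment="PATH=/srv/airflow/bin" arranges for the service.
with_path=$( (PATH="$fake_bin"; command -v gunicorn) || echo "not found" )

echo "without directory on PATH: $without_path"
echo "with directory on PATH:    $with_path"

rm -rf "$fake_bin"
```

Note that Environment="PATH=/srv/airflow/bin" replaces the whole PATH for the service rather than extending it, so anything else the webserver needs to launch by name would also have to live in that directory.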

I did this to install the service:

sudo cp airflow-webserver.service /lib/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable airflow-webserver.service
sudo systemctl start airflow-webserver.service
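After starting the service, it can help to confirm that it stayed up and, if it did not, to look at the most recent journal entries. The following is a minimal sketch, assuming the unit name airflow-webserver.service used above. The guard for a missing systemctl is an addition for illustration only.

```shell
#!/bin/sh
unit=airflow-webserver.service

if ! command -v systemctl >/dev/null 2>&1; then
    # No systemd tooling on this machine; nothing to check.
    state="systemctl not available"
elif systemctl is-active --quiet "$unit"; then
    state="active"
else
    state="not active"
    # Show the last 20 journal lines for the unit to see why it stopped.
    journalctl -u "$unit" -n 20 --no-pager || true
fi

echo "$unit: $state"
```

If the unit is reported as not active, a traceback like the one in the question should appear in the journal lines printed by the script.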