How do I use DBT with AWS Managed Airflow?

nar*_*er1 8 amazon-web-services airflow dbt mwaa

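Hope you're doing well. I wanted to check whether anyone has got dbt up and running in AWS MWAA Airflow.

I have tried this one and this Python package, but both fail for one reason or another (can't find the dbt path, etc.).

Has anyone managed to use MWAA (Airflow 2) and DBT without having to build a Docker image and host it somewhere?

Thank you!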

Yon*_*ron 7

I managed to solve this with the following steps:

  1. Add dbt-core==0.19.1 to your requirements.txt
  2. Add the DBT cli executable to plugins.zip
```python
#!/usr/bin/env python3
# EASY-INSTALL-ENTRY-SCRIPT: 'dbt-core==0.19.1','console_scripts','dbt'
__requires__ = 'dbt-core==0.19.1'
import re
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('dbt-core==0.19.1', 'console_scripts', 'dbt')()
    )
```

From here you have two options:

  1. Set the dbt_bin operator argument to /usr/local/airflow/plugins/dbt
  2. Add /usr/local/airflow/plugins/ to your $PATH by following the docs
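If you are not using an operator that accepts `dbt_bin`, the same idea as option 1 (always call dbt by its absolute path) can be sketched as a small helper that builds a shell command string, for example for a `BashOperator`'s `bash_command`. This is only a sketch: the helper name `build_dbt_command` and the project directory in the usage line are hypothetical, and it assumes the dbt CLI flags `--project-dir` and `--profiles-dir`.

```python
import shlex


def build_dbt_command(subcommand, project_dir, profiles_dir,
                      dbt_bin="/usr/local/airflow/plugins/dbt"):
    """Return a shell-safe dbt command line that uses an absolute dbt path.

    build_dbt_command is a hypothetical helper, not part of Airflow or dbt.
    """
    args = [
        dbt_bin,                       # absolute path, so $PATH is not consulted
        subcommand,                    # e.g. "run" or "test"
        "--project-dir", project_dir,  # where dbt_project.yml lives
        "--profiles-dir", profiles_dir,  # where profiles.yml lives
    ]
    # Quote each argument so paths containing spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


# Hypothetical project location under the DAGs folder.
command = build_dbt_command(
    "run",
    "/usr/local/airflow/dags/dbt_project",
    "/usr/local/airflow/dags/dbt_project",
)
```

The resulting string can be passed to whatever task runs shell commands in your DAG; because the executable is given in full, this route does not depend on the `$PATH` change from option 2.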

Environment variable setter example:

```python
from airflow.plugins_manager import AirflowPlugin
import os

# Runs when Airflow imports the plugin: extend PATH so the dbt executable
# shipped in plugins.zip can be found by name.
os.environ["PATH"] = os.getenv(
    "PATH") + ":/usr/local/airflow/.local/lib/python3.7/site-packages:/usr/local/airflow/plugins/"


class EnvVarPlugin(AirflowPlugin):
    name = 'env_var_plugin'
```
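The setter above concatenates onto `PATH` unconditionally, so it would raise if `PATH` were unset and would append the same entries again on every import. A slightly more defensive variant might look like the sketch below; `append_to_path` is a hypothetical helper name, and whether either edge case actually occurs on MWAA is an assumption, not something verified here.

```python
import os


def append_to_path(directory, env=os.environ):
    """Append directory to PATH exactly once, tolerating an unset PATH.

    append_to_path is a hypothetical helper; env defaults to the real
    process environment but any dict-like mapping works.
    """
    current = env.get("PATH", "")
    # Drop empty segments left by leading, trailing, or doubled separators.
    parts = [p for p in current.split(os.pathsep) if p]
    # Compare without trailing slashes so "/a/b" and "/a/b/" count as equal.
    known = {p.rstrip("/") for p in parts}
    if directory.rstrip("/") not in known:
        parts.append(directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]
```

In the plugin module this would replace the string concatenation with a call such as `append_to_path("/usr/local/airflow/plugins/")`.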

Plugins zip contents:

```
plugins.zip
├── dbt (DBT cli executable)
└── env_var_plugin.py (environment variable setter)
```
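One way to produce an archive with this layout is Python's standard `zipfile` module. The sketch below is an illustration only: `build_plugins_zip` and its arguments are hypothetical names, and recording the executable bit in the archive is a precaution whose effect on MWAA has not been verified here.

```python
import zipfile


def build_plugins_zip(dbt_script, plugin_file, out_path="plugins.zip"):
    """Pack the dbt entry script and the plugin module at the archive root.

    build_plugins_zip is a hypothetical helper; both inputs are local file
    paths, and the archive member names match the layout shown above.
    """
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the script as "dbt" and record Unix mode rwxr-xr-x for it.
        info = zipfile.ZipInfo("dbt")
        info.compress_type = zipfile.ZIP_DEFLATED
        info.external_attr = 0o100755 << 16  # regular file, mode 755
        with open(dbt_script, "rb") as fh:
            zf.writestr(info, fh.read())
        # The plugin module only needs to be importable, so default mode is fine.
        zf.write(plugin_file, arcname="env_var_plugin.py")
    return out_path
```

The resulting plugins.zip is then uploaded to the environment's S3 bucket in the usual way for MWAA plugins.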

  • @Smit Sure, please see the MWAA docs: https://docs.aws.amazon.com/mwaa/latest/userguide/configuring-dag-import-plugins.html (2 upvotes)