Airflow + DockerOperator: unable to pass mounts/volumes using the mounts parameter

Ani*_*Ani 4 docker airflow dockeroperator

I am trying to pass a local directory as a volume to Airflow, which in turn should pass it on to a DockerOperator in a DAG.


My airflow-docker-compose.yaml (in the airflow-common section) looks like this:

  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - ./input:/opt/airflow/input
    - ./output:/opt/airflow/output

In the DAG code, I try to pass the mounts parameter as follows:

eod_price = DockerOperator(
    task_id='run_docker',
    image='alpine',
    api_version='auto',
    command='/bin/touch /output/run_docker_touch.txt',
    auto_remove=True,
    mounts=[
        Mount(source='/opt/airflow/output',
              target='/app_base/output',
              type='volume'),
    ],
    mount_tmp_dir=False,
    docker_url='tcp://docker-proxy:2375',
    network_mode='bridge'
)

With this code, I get the error:

Bad Request ("create /opt/airflow/output: "/opt/airflow/output" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path")

When I change the mount from type='volume' to type='bind':

mounts=[
    Mount(source='/opt/airflow/output',
          target='/app_base/output',
          type='bind'),
],

the error changes to:

Bad Request ("invalid mount config for type "bind": bind source path does not exist: /opt/airflow/output")

I opened a bash shell in docker007-airflow-scheduler-1, docker007-airflow-triggerer-1, docker007-airflow-webserver-1, and docker007-airflow-worker-1, and the /opt/airflow/output directory exists in each of those containers. The docker-compose ps output is:

NAME                            COMMAND                  SERVICE             STATUS              PORTS
docker007-airflow-init-1        "/bin/bash -c 'funct…"   airflow-init        exited (0)
docker007-airflow-scheduler-1   "/usr/bin/dumb-init …"   airflow-scheduler   running (healthy)   8080/tcp
docker007-airflow-triggerer-1   "/usr/bin/dumb-init …"   airflow-triggerer   running (healthy)   8080/tcp
docker007-airflow-webserver-1   "/usr/bin/dumb-init …"   airflow-webserver   running (healthy)   0.0.0.0:8080->8080/tcp
docker007-airflow-worker-1      "/usr/bin/dumb-init …"   airflow-worker      running (healthy)   8080/tcp
docker007-docker-proxy-1        "socat TCP4-LISTEN:2…"   docker-proxy        running             0.0.0.0:2376->2375/tcp
docker007-postgres-1            "docker-entrypoint.s…"   postgres            running (healthy)   5432/tcp
docker007-redis-1               "docker-entrypoint.s…"   redis               running (healthy)   6379/tcp

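My end goal is for the output of a dockerized Python application, which writes to /app_base/output, to be visible in the local output directory (the command above is just a placeholder).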


Ani*_*Ani 5

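OK, I finally figured it out. The mount source is not the directory as mounted inside the Airflow containers; it is the original directory on the host from which the Airflow stack was started.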

The following code works:

pre_step = DockerOperator(
    task_id='pre_step_docker',
    image='alpine',
    api_version='auto',
    command='/bin/touch /output/run_docker_touch.txt',
    auto_remove=True,
    mounts=[
        Mount(source='/Users/me/Documents/Python/GitHub/docker-learn/docker007/output',
              target='/output',
              type='bind'),
    ],
    mount_tmp_dir=False,
    docker_url='tcp://docker-proxy:2375',
    network_mode='bridge'
)

In other words, the volumes mounted in docker-compose are not needed for this. I have commented out the entries I added:

  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
#    - ./input:/opt/airflow/input
#    - ./output:/opt/airflow/output
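The reason the container path fails is that a bind source is resolved by the Docker daemon on the machine where the daemon runs, not inside the Airflow worker that submits the request. As a rough illustration, here is a minimal sketch that translates a path as seen inside the Airflow containers into the matching host path. The helper name `to_host_path` and the `COMPOSE_BINDS` table are hypothetical, and the host prefix is simply the example path used above.

```python
from pathlib import PurePosixPath
from typing import Mapping

# Hypothetical table mirroring the ./output entry from the compose file:
# directory inside the Airflow containers -> directory on the Docker host.
# The host path is the example path from this answer; adjust it for your machine.
COMPOSE_BINDS = {
    "/opt/airflow/output":
        "/Users/me/Documents/Python/GitHub/docker-learn/docker007/output",
}


def to_host_path(container_path: str,
                 binds: Mapping[str, str] = COMPOSE_BINDS) -> str:
    """Return the host path that corresponds to a path inside the container."""
    path = PurePosixPath(container_path)
    for container_dir, host_dir in binds.items():
        try:
            relative = path.relative_to(container_dir)
        except ValueError:
            continue  # container_path is not under this bind; try the next one
        return str(PurePosixPath(host_dir) / relative)
    raise ValueError(f"{container_path} is not under any known bind mount")
```

The returned string is what would go into `Mount(source=...)` with `type='bind'`; the container-side path `/opt/airflow/output` only has meaning inside the Airflow containers.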

However, the hard-coded path looked very ugly to me, so I added new variables to the docker-compose file (under airflow-common --> environment):

APP_INPUT_DIR: ${PWD}/input
APP_OUTPUT_DIR: ${PWD}/output

and updated my code as follows:

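import os

from airflow.providers.docker.operators.docker import DockerOperator
from docker.types import Mount

pre_step = DockerOperator(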
    task_id='pre_step_docker',
    image='alpine',
    api_version='auto',
    command='/bin/touch /output/run_docker_touch.txt',
    auto_remove=True,
    mounts=[
        Mount(source=os.getenv("APP_OUTPUT_DIR"),
              target='/output',
              type='bind'),
    ],
    mount_tmp_dir=False,
    docker_url='tcp://docker-proxy:2375',
    network_mode='bridge'
)
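One caveat with this approach: `os.getenv` returns `None` when the variable is missing, and a relative value would not be a valid bind source either. A minimal sketch of a guard that fails early with a readable message is shown below. The helper name `host_dir_from_env` is hypothetical, and the absolute-path check assumes a POSIX-style host path such as the `${PWD}/output` value above.

```python
import os
import posixpath


def host_dir_from_env(var_name: str) -> str:
    """Read a host directory from the environment, failing fast if unusable."""
    value = os.environ.get(var_name, "").strip()
    if not value:
        raise RuntimeError(
            f"{var_name} is not set; define it under the environment "
            "section of the compose file"
        )
    # Assumption: the Docker host uses POSIX-style paths.
    if not posixpath.isabs(value):
        raise RuntimeError(f"{var_name} must be an absolute host path, got {value!r}")
    return value
```

With this in place, the mount could be written as `Mount(source=host_dir_from_env("APP_OUTPUT_DIR"), target='/output', type='bind')`, so a misconfigured environment surfaces when the DAG file is parsed rather than as a Bad Request from the Docker daemon at task run time.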

Hope this helps someone!