Airflow DockerOperator: connect sock.connect(self.unix_socket) FileNotFoundError: [Errno 2] No such file or directory

9 python operator-keyword docker airflow

I'm trying to use DockerOperator with Airflow on a Mac. I'm running Airflow based on Puckel's image, with a few small modifications.

Dockerfile, built as puckel-airflow-with-docker-inside:

FROM puckel/docker-airflow:latest

USER root
RUN groupadd --gid 999 docker \
&& usermod -aG docker airflow
USER airflow

docker-compose-CeleryExecutor.yml:

version: '2.1'

services:
    redis:
        image: 'redis:5.0.5'

    postgres:
        image: postgres:9.6
        environment:
            - POSTGRES_USER=airflow
            - POSTGRES_PASSWORD=airflow
            - POSTGRES_DB=airflow
    webserver:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - postgres
            - redis
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery
        volumes:
            - ./requirements.txt:/requirements.txt
            - ./dags:/usr/local/airflow/dags
        ports:
            - "8080:8080"
        command: webserver
        healthcheck:
            test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
            interval: 30s
            timeout: 30s
            retries: 3

    flower:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - redis
        environment:
            - EXECUTOR=Celery
        ports:
            - "5555:5555"
        command: flower

    scheduler:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - webserver
        volumes:
            - ./dags:/usr/local/airflow/dags
            - ./requirements.txt:/requirements.txt
        environment:
            - LOAD_EX=n
            - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
            - EXECUTOR=Celery

        command: scheduler

    worker:
        image: puckel-airflow-with-docker-inside:latest
        restart: always
        depends_on:
            - scheduler
        volumes:
          - ./dags:/usr/local/airflow/dags
          - ./requirements.txt:/requirements.txt
        environment:
          - DOCKER_HOST=tcp://socat:2375
          - FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
          - EXECUTOR=Celery
        command: worker
    socat:
        image: bpack/socat
        command: TCP4-LISTEN:2375,fork,reuseaddr UNIX-CONNECT:/var/run/docker.sock
        volumes:
          - /var/run/docker.sock:/var/run/docker.sock
        expose:
          - "2375"

Full error log of the Docker task after triggering the DAG:

*** Log file does not exist: /usr/local/airflow/logs/tutorial/docker_command/2020-04-13T11:20:41.323461+00:00/1.log
*** Fetching from: http://6f57f4c44662:8793/log/tutorial/docker_command/2020-04-13T11:20:41.323461+00:00/1.log

[2020-04-13 11:20:47,627] {{taskinstance.py:655}} INFO - Dependencies all met for <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [queued]>
[2020-04-13 11:20:47,648] {{taskinstance.py:655}} INFO - Dependencies all met for <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [queued]>
[2020-04-13 11:20:47,648] {{taskinstance.py:866}} INFO - 
--------------------------------------------------------------------------------
[2020-04-13 11:20:47,648] {{taskinstance.py:867}} INFO - Starting attempt 1 of 2
[2020-04-13 11:20:47,648] {{taskinstance.py:868}} INFO - 
--------------------------------------------------------------------------------
[2020-04-13 11:20:47,660] {{taskinstance.py:887}} INFO - Executing <Task(DockerOperator): docker_command> on 2020-04-13T11:20:41.323461+00:00
[2020-04-13 11:20:47,663] {{standard_task_runner.py:53}} INFO - Started process 53 to run task
[2020-04-13 11:20:47,729] {{logging_mixin.py:112}} INFO - Running %s on host %s <TaskInstance: tutorial.docker_command 2020-04-13T11:20:41.323461+00:00 [running]> 6f57f4c44662
[2020-04-13 11:20:47,758] {{taskinstance.py:1128}} ERROR - Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/transport/unixconn.py", line 42, in connect
    sock.connect(self.unix_socket)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1252, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1298, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1247, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1026, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/http/client.py", line 966, in send
    self.connect()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/transport/unixconn.py", line 42, in connect
    sock.connect(self.unix_socket)
urllib3.exceptions.ProtocolError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 202, in _retrieve_server_version
    return self.version(api_version=False)["ApiVersion"]
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/daemon.py", line 181, in version
    return self._result(self._get(url), json=True)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/utils/decorators.py", line 46, in inner
    return f(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 225, in _get
    return self.get(url, **self._set_request_timeout(kwargs))
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 546, in get
    return self.request('GET', url, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 966, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/docker_operator.py", line 262, in execute
    tls=tls_config
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 185, in __init__
    self._version = self._retrieve_server_version()
  File "/usr/local/airflow/.local/lib/python3.7/site-packages/docker/api/client.py", line 210, in _retrieve_server_version
    'Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
[2020-04-13 11:20:47,765] {{taskinstance.py:1151}} INFO - Marking task as UP_FOR_RETRY
[2020-04-13 11:20:57,585] {{logging_mixin.py:112}} INFO - [2020-04-13 11:20:57,584] {{local_task_job.py:103}} INFO - Task exited with return code 1

I can't get it to work :/. Could I be adding - /var/run/docker.sock:/var/run/docker.sock the wrong way?

Thanks!
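
A note on reading the traceback: the last frames go through docker/transport/unixconn.py, so the client was trying to open a Unix socket path inside the worker container rather than the tcp://socat:2375 endpoint set in DOCKER_HOST. As far as I know, DockerOperator takes its endpoint from its own docker_url argument (which defaults to unix://var/run/docker.sock), so that is worth checking. The sketch below is not from the original post. It is a standard-library-only probe, with the made-up name probe_unix_socket, that tells a missing socket file (the error in the log) apart from a permission problem.

```python
import socket


def probe_unix_socket(path: str) -> str:
    """Try to connect to a Unix socket and report what went wrong, if anything."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(path)
    except FileNotFoundError:
        # Same failure as in the traceback: nothing exists at this path.
        return "missing"
    except PermissionError:
        # The socket exists, but the current user is not allowed to open it.
        return "permission denied"
    except ConnectionRefusedError:
        # Something exists at the path, but nothing is listening on it.
        return "refused"
    else:
        return "ok"
    finally:
        sock.close()
```

Running probe_unix_socket("/var/run/docker.sock") in a Python shell inside the worker container should show whether the socket was mounted there at all.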

Mor*_*tz 7

For me, the following got it working on my local machine. I took the official docker-compose.yaml from here: https://github.com/apache/airflow/blob/main/docs/apache-airflow/start/docker-compose.yaml

In x-airflow-common:/volumes I added:

- /var/run/docker.sock:/var/run/docker.sock

In x-airflow-common:/user I changed the value to:

user: root

Start Airflow:

docker-compose up airflow-init
docker-compose up

The DAG with the DockerOperator then runs through.


Dmi*_*sky 2

I ran into the same problem on Linux and solved it thanks to "How to mount docker socket as volume in docker container with correct group". Maybe my solution will help you.

My docker.sock has the following permissions:

srw-rw---- 1 root docker docker.sock

Dockerfile:

FROM puckel/docker-airflow:latest
USER root
ARG DOCKER_GROUP_ID
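# Install the Docker SDK for Python (not the Docker engine itself)
RUN pip install 'Docker==4.2.0'
# Create a docker group with the host's GID and add the airflow user to it, so it can access docker.sock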
RUN groupadd -g $DOCKER_GROUP_ID docker && gpasswd -a airflow docker
USER airflow

Build the image with:

docker build --rm --build-arg DOCKER_GROUP_ID=`getent group docker | cut -d: -f3` -t docker-airflow .
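
The getent lookup above assumes the host group is literally named docker. A possible cross-check, which is not part of the original answer, is to read the owning group straight from the socket file using Python's standard library. Here socket_ownership is a hypothetical helper name, and /var/run/docker.sock is assumed to be the socket path on the host.

```python
import os
import stat


def socket_ownership(path: str) -> dict:
    """Return the permission string, owner UID and group GID of a file."""
    info = os.stat(path)
    return {
        "mode": stat.filemode(info.st_mode),  # e.g. "srw-rw----" for a socket
        "uid": info.st_uid,
        "gid": info.st_gid,  # candidate value for DOCKER_GROUP_ID
    }
```

Calling socket_ownership("/var/run/docker.sock") on the host gives the GID that the group created in the Dockerfile needs to match.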

And run the container:

docker run -d -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock -v /path/to/dags/on/your/local/machine/:/usr/local/airflow/dags docker-airflow webserver

  • Is there an equivalent way to do this directly with the docker-compose.yaml from their website (`curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.1.0/docker-compose.yaml'`, see https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html)? I'm using their yaml file rather than puckel's. When I try to run a `DockerOperator`, I get this exact error. I'm on Ubuntu 20.04, and I noticed that my docker.socket is at `/run/docker.socket` when I run `systemctl status docker.socket`. What do you think about this? Thanks! (3 upvotes)