Mai*_*aik 7 python docker airflow airflow-scheduler
I have a folder tree like this: project
I created an Airflow service in a Docker container with the following files:
Dockerfile
# Base image
FROM puckel/docker-airflow:1.10.1
# Impersonate root
USER root
# Logs are sent straight to the I/O stream and not buffered.
ENV PYTHONUNBUFFERED 1
ENV AIRFLOW_HOME=/usr/local/airflow
ENV PYTHONPATH "${PYTHONPATH}:/libraries"
WORKDIR /
# Add docker source files to the docker machine
ADD ./docker_resources ./docker_resources
# Install libraries and dependencies
RUN apt-get update && apt-get install -y vim
RUN pip install --user psycopg2-binary
RUN pip install -r docker_resources/requirements.pip
docker-compose.yml
version: '3'
services:
  postgres:
    image: postgres:9.6
    container_name: "postgres"
    environment:
      - POSTGRES_USER=airflow
      - POSTGRES_PASSWORD=airflow
      - POSTGRES_DB=airflow
    ports:
      - "5432:5432"
  webserver:
    build: .
    restart: always
    depends_on:
      - postgres
    volumes:
      - ./dags:/usr/local/airflow/dags
      - ./libraries:/libraries
      - ./python_scripts:/python_scripts
    ports:
      - "8080:8080"
    command: webserver
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
  scheduler:
    build: .
    restart: always
    depends_on:
      - postgres
    volumes:
      - ./dags:/usr/local/airflow/dags
      - ./logs:/usr/local/airflow/logs
    ports:
      - "8793:8793"
    command: scheduler
    healthcheck:
      test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-scheduler.pid ]"]
      interval: 30s
      timeout: 30s
      retries: 3
My dags folder has a tutorial DAG:
I tried changing bash_command='python /python_scripts/my_script.py',
to:
bash_command='python python_scripts/my_script.py',
bash_command='python ~/../python_scripts/my_script.py',
bash_command='python ~/python_scripts/my_script.py',
They all failed. I tried these because BashOperator runs the command in a tmp folder. If I get into the machine and run ls, I find the file under python_scripts. Running python /python_scripts/my_script.py from /usr/local/airflow even works.
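The effect of the temporary working directory can be reproduced outside Airflow. The sketch below is only an approximation of what BashOperator does: it runs a script through a child interpreter whose working directory is a fresh temporary folder, once with an absolute path and once with a relative one. The helper name run_from_tmp and the stand-in script location are hypothetical; they are not part of Airflow or of my project.

```python
import os
import subprocess
import sys
import tempfile


def run_from_tmp(script_arg):
    """Run `python <script_arg>` with a fresh temporary directory as the
    working directory and return the child's exit code."""
    with tempfile.TemporaryDirectory(prefix="airflowtmp") as tmp_dir:
        result = subprocess.run(
            [sys.executable, script_arg],
            cwd=tmp_dir,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        return result.returncode


# Hypothetical stand-in for /python_scripts/my_script.py.
scripts_dir = tempfile.mkdtemp(prefix="python_scripts")
script_path = os.path.join(scripts_dir, "my_script.py")
with open(script_path, "w") as handle:
    handle.write("print('hello from my_script')\n")

# Same script, addressed relative to a directory the child is not in.
relative_form = os.path.join(os.path.basename(scripts_dir), "my_script.py")

absolute_code = run_from_tmp(script_path)
relative_code = run_from_tmp(relative_form)
print("absolute path exit code:", absolute_code)
print("relative path exit code:", relative_code)
```

The relative form fails because it is resolved against the temporary folder, which matches why only an absolute path can be expected to work here. It does not explain why the absolute path fails inside my task.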
The error is always:
INFO - python: can't open file
I searched, and people solved this with an absolute path, but that does not fix it for me.
Edit
If I add ADD ./ ./ below WORKDIR / in the Dockerfile and remove these volumes from docker-compose.yml:
1. ./libraries:/libraries
2. ./python_scripts:/python_scripts
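The error is then no longer that the file cannot be found; the libraries are not found instead (Import module error). That is an improvement, but it makes no sense, because PYTHONPATH is defined to include the /libraries folder.
Volumes make more sense than the ADD statement, because I need changes to be applied immediately to the code inside Docker.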
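For the import error, one thing that can be checked is whether the PYTHONPATH value actually reaches the interpreter that runs the task. The sketch below is a generic check, not something taken from Airflow: it starts a child interpreter with a given environment and reports whether a folder appears in its sys.path. The helper name libraries_on_path is hypothetical, and a temporary folder stands in for /libraries so the example runs anywhere.

```python
import os
import subprocess
import sys
import tempfile


def libraries_on_path(folder, env):
    """Start a child interpreter with `env` and report whether `folder`
    ends up in its sys.path."""
    probe = "import sys; print({!r} in sys.path)".format(folder)
    output = subprocess.check_output([sys.executable, "-c", probe], env=env)
    return output.decode().strip() == "True"


# Stand-in for the /libraries folder from the Dockerfile.
folder = tempfile.mkdtemp(prefix="libraries")

# Environment without PYTHONPATH, and a copy with it set.
base_env = dict(os.environ)
base_env.pop("PYTHONPATH", None)
with_var = dict(base_env, PYTHONPATH=folder)

found_with_var = libraries_on_path(folder, with_var)
found_without_var = libraries_on_path(folder, base_env)
print("PYTHONPATH set:  ", found_with_var)
print("PYTHONPATH unset:", found_without_var)
```

If the same probe reports False when run from inside the task, the environment of the task differs from the one defined in the Dockerfile. If it reports True, the folder is on the path and the import failure has another cause, such as the folder being empty.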
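Edit 2: The volumes are mounted, but the folders inside the container contain no files, which is why the files cannot be found. With ADD ./ ./ the folders do contain files, because every file in the project folder is added to the image. Even so, it does not work, because the libraries cannot be found either.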
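A quick way to confirm the "mounted but empty" observation is to count the entries in each mount point from inside a container. This is a minimal sketch; the function name describe_mounts is hypothetical, and the paths are the container-side paths from the volumes in docker-compose.yml.

```python
import os


def describe_mounts(paths):
    """Return {path: entry_count}, with None for paths that do not exist."""
    report = {}
    for path in paths:
        if os.path.isdir(path):
            report[path] = len(os.listdir(path))
        else:
            report[path] = None
    return report


# Container-side paths taken from the volumes in docker-compose.yml.
container_paths = ["/usr/local/airflow/dags", "/libraries", "/python_scripts"]
for path, count in describe_mounts(container_paths).items():
    if count is None:
        print(path, "-> missing")
    elif count == 0:
        print(path, "-> present but empty")
    else:
        print(path, "->", count, "entries")
```

It is worth running this in both the webserver and the scheduler containers. In the compose file above only the webserver service mounts ./libraries and ./python_scripts, while the scheduler mounts only ./dags and ./logs. Whether that matters depends on which container executes the task, which I have not verified.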
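In the end I solved this by discarding all my previous work and restarting the Dockerfile from an Ubuntu base image instead of the puckel/docker-airflow image, which is based on python:3.7-slim-buster.
I now do not use any user other than root.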