alp*_*787 6 python django celery libreoffice docker
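I'm truly going crazy and pulling my hair out, because I can't seem to solve this particular problem.
Here is the problem: I have two containers, Django and Celery. A user uploads a Word document, and a Celery worker converts it to PDF and uploads it to an S3 bucket. I am using libreoffice --headless for the conversion. The user sends the file to an API endpoint, the Word document is saved in a folder named original, and the Celery task convert_office_to_pdf.delay is then supposed to convert the file and place it in another folder, converted. Everything works as expected except the Celery function. Here is what the code looks like: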
import json
import os
import subprocess

import websocket


def convert_office_to_pdf(original_file):
    ws = websocket.WebSocket()
    ws.connect('ws://web:8000/ws/converter/public/')
    pure_file_name = os.path.splitext(os.path.basename(original_file))[0]
    # what the command will look like
    print('libreoffice --headless --convert-to pdf original/{} --outdir ./converted'.format(original_file))
    subprocess.call('libreoffice --headless --convert-to pdf original/{} --outdir ./converted'.format(original_file), shell=True)
    ws.send(json.dumps({
        'message': '{}.pdf'.format(pure_file_name),
        'progress': 75}))
    # upload_file_to_s3 is defined elsewhere in the project
    upload_file_to_s3(pure_file_name, 'pdf', ws)
However, the function gets executed and nothing happens. Here is the docker-compose output:
web_1 | [2018/03/22 22:57:52] HTTP GET /converter/ 200 [0.06, 172.17.0.1:32788]
web_1 | [2018/03/22 22:57:52] HTTP GET /static/css/normalize.css 304 [0.02, 172.17.0.1:32788]
web_1 | [2018/03/22 22:57:52] WebSocket HANDSHAKING /ws/converter/public/ [172.17.0.1:32798]
web_1 | [2018/03/22 22:57:52] WebSocket CONNECT /ws/converter/public/ [172.17.0.1:32798]
fileshiffty_data_1 exited with code 0
worker_1 | [2018-03-22 22:58:04,413: INFO/MainProcess] Received task: api.tasks.convert_office_to_pdf[287805aa-3c9c-4212-92d4-cac5872076f2]
worker_1 | [2018-03-22 22:58:04,414: DEBUG/MainProcess] TaskPool: Apply <function _fast_trace_task at 0x7fb72d567e18> (args:('api.tasks.convert_office_to_pdf', '287805aa-3c9c-4212-92d4-cac5872076f2', {'lang': 'py', 'task': 'api.tasks.convert_office_to_pdf', 'id': '287805aa-3c9c-4212-92d4-cac5872076f2', 'eta': None, 'expires': None, 'group': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '287805aa-3c9c-4212-92d4-cac5872076f2', 'parent_id': None, 'argsrepr': "('1521759484.3458297-Doc1.docx',)", 'kwargsrepr': '{}', 'origin': 'gen8@a478d8966021', 'reply_to': 'adf32365-ef93-327e-842f-7eff10fda37a', 'correlation_id': '287805aa-3c9c-4212-92d4-cac5872076f2', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}}, b'[["1521759484.3458297-Doc1.docx"], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8') kwargs:{})
web_1 | [2018/03/22 22:58:04] HTTP PUT /api/v1/fileupload/word/pdf/ 200 [0.07, 172.17.0.1:32788]
worker_1 | [2018-03-22 22:58:04,417: DEBUG/MainProcess] Task accepted: api.tasks.convert_office_to_pdf[287805aa-3c9c-4212-92d4-cac5872076f2] pid:9
web_1 | [2018/03/22 22:58:04] WebSocket HANDSHAKING /ws/converter/public/ [172.17.0.2:58928]
web_1 | [2018/03/22 22:58:04] WebSocket CONNECT /ws/converter/public/ [172.17.0.2:58928]
worker_1 | [2018-03-22 22:58:04,426: WARNING/ForkPoolWorker-2] /data/web/fileshiffty
worker_1 | [2018-03-22 22:58:04,427: WARNING/ForkPoolWorker-2] libreoffice --headless --convert-to pdf original/1521759484.3458297-Doc1.docx --outdir ./converted
web_1 | {"message": "1521759484.3458297-Doc1.pdf", "progress": 50}
web_1 | {"message": "1521759484.3458297-Doc1.pdf", "progress": 75}
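When I upload a file, I can confirm that it is added to the original folder, and the log entry worker_1 | [2018-03-22 22:58:04,427: WARNING/ForkPoolWorker-2] libreoffice --headless --convert-to pdf original/1521759484.3458297-Doc1.docx --outdir ./converted shows the command that subprocess will invoke. However, when I look inside the converted folder, I see nothing. It is completely empty. The strange part is that when I shell into the Docker container and run the EXACT SAME command, the file is converted and placed in the folder. Like this: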
root@4b9da6f71226:/data/web/fileshiffty/api# python3
Python 3.6.4 (default, Mar 14 2018, 17:49:05)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import subprocess
>>> subprocess.call('libreoffice --headless --convert-to pdf original/1521759484.3458297-Doc1.docx --outdir ./converted', shell=True)
convert /data/web/fileshiffty/api/original/1521759484.3458297-Doc1.docx -> /data/web/fileshiffty/api/converted/1521759484.3458297-Doc1.pdf using writer_pdf_Export
0
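Why does the subprocess run when I bash into the container and execute it, but not when it is executed from the file? Can someone please help me?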
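One thing the logs above hint at: the worker prints /data/web/fileshiffty as its working directory, while the interactive session that succeeded was started in /data/web/fileshiffty/api, so a relative path such as original/... may point at different directories in the two cases. A minimal sketch, assuming the original and converted folders sit next to the task module (the helper name resolve_conversion_paths and its base_dir argument are hypothetical, not part of the original code), that resolves both paths against an explicit base directory and fails loudly when the input is missing:

```python
import os


def resolve_conversion_paths(base_dir, original_file):
    """Return (source, outdir) as absolute paths under base_dir.

    Hypothetical helper: assumes the layout base_dir/original and
    base_dir/converted, which is a guess based on the interactive session.
    """
    base_dir = os.path.abspath(base_dir)
    source = os.path.join(base_dir, "original", original_file)
    outdir = os.path.join(base_dir, "converted")
    # Fail early instead of letting the converter silently produce nothing.
    if not os.path.isfile(source):
        raise FileNotFoundError("input document not found: {}".format(source))
    os.makedirs(outdir, exist_ok=True)
    return source, outdir
```

Inside the task module, base_dir could be os.path.dirname(os.path.abspath(__file__)), which does not depend on the directory the worker was started from.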
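Edit: It seems the subprocess command is not being executed. I changed the code to the following to find out what happens after the subprocess command, and even used absolute paths, as shown here: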
def convert_office_to_pdf(original_file):
    ws = websocket.WebSocket()
    ws.connect('ws://web:8000/ws/converter/public/')
    pure_file_name = os.path.splitext(os.path.basename(original_file))[0]
    ws.send(json.dumps({
        'message': '{}.pdf'.format(pure_file_name),
        'progress': 50}))
    print(os.getcwd())
    print('libreoffice --headless --convert-to pdf original/{} --outdir ./converted'.format(original_file))
    command = ['libreoffice', '--headless', '--convert-to', 'pdf', '{}/original/{}'.format(os.getcwd(), original_file), '--outdir', '{}/converted'.format(os.getcwd())]
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    out, err = process.communicate()
    print(out)
    print(err)
    print('------------------------------------------------')
    ws.send(json.dumps({
        'message': '{}.pdf'.format(pure_file_name),
        'progress': 75}))
    upload_file_to_s3(pure_file_name, 'pdf', ws)
I get the following output:
[2018-03-22 23:44:54,668: DEBUG/MainProcess] Task accepted: api.tasks.convert_office_to_pdf[721ed2db-6a74-4fd2-9484-0fca14df7c01] pid:9
web_1 | [2018/03/22 23:44:54] WebSocket HANDSHAKING /ws/converter/public/ [172.17.0.2:60898]
web_1 | [2018/03/22 23:44:54] WebSocket CONNECT /ws/converter/public/ [172.17.0.2:60898]
worker_1 | [2018-03-22 23:44:54,696: WARNING/ForkPoolWorker-2] /data/web/fileshiffty
worker_1 | [2018-03-22 23:44:54,696: WARNING/ForkPoolWorker-2] libreoffice --headless --convert-to pdf original/1521762293.8511283-Doc1.docx --outdir ./converted
web_1 | {"message": "1521762293.8511283-Doc1.pdf", "progress": 50}
worker_1 | [2018-03-22 23:44:55,283: WARNING/ForkPoolWorker-2] b''
worker_1 | [2018-03-22 23:44:55,283: WARNING/ForkPoolWorker-2] None
worker_1 | [2018-03-22 23:44:55,283: WARNING/ForkPoolWorker-2] ------------------------------------------------
web_1 | {"message": "1521762293.8511283-Doc1.pdf", "progress": 75}
print(out) prints only an empty bytes object, and print(err) prints only None.
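Because the Popen call above pipes only stdout, communicate() always returns None for stderr, and the exit status is never inspected, so a failing libreoffice run would look just like this output. A hedged sketch of a helper (run_and_capture is a hypothetical name, not from the original code) that captures both streams and the return code so that a failure shows up in the worker log:

```python
import subprocess
import sys


def run_and_capture(command, timeout=120):
    """Run a command list and return (returncode, stdout, stderr) as text."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,   # without this, stderr is never captured
        universal_newlines=True,  # decode bytes to str (works on Python 3.6)
        timeout=timeout,          # raises TimeoutExpired if the command hangs
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Stand-in command that fails on purpose; in the task this would be
    # the libreoffice argument list.
    code, out, err = run_and_capture(
        [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
    )
    print(code, repr(out), repr(err))
```

Logging the return code and stderr next to stdout makes it possible to tell a missing input file apart from a converter that never started.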
Edit 2: Here is the docker-compose file:
web:
  restart: always
  tty: true
  build: ./web/
  working_dir: /data/web/fileshiffty
  expose:
    - "8000"
  ports:
    - "8000:8000"
  links:
    - postgres:postgres
    - redis:redis
  env_file: env
  volumes:
    - ./web:/data/web
  command: bash -c "python3 manage.py runserver 0.0.0.0:8000"
  # command: /usr/bin/gunicorn fileshiffty.wsgi:application -w 2 -b :8000

nginx:
  restart: always
  build: ./nginx/
  ports:
    - "80:80"
  volumes_from:
    - web
  links:
    - web:web

postgres:
  restart: always
  image: postgres:latest
  volumes_from:
    - data
  volumes:
    - ./postgres/docker-entrypoint-initdb.d:/docker-entrypoint-initdb.d
    - ./backups/postgresql:/backup
  env_file:
    - env
  expose:
    - "5432"

redis:
  restart: always
  image: redis:latest
  expose:
    - "6379"

worker:
  build: ./web/
  working_dir: /data/web/fileshiffty
  command: bash -c "celery -A fileshiffty worker --loglevel=DEBUG"
  volumes:
    - ./web:/data/web
  links:
    - postgres:postgres
    - redis:redis
    - web:web

data:
  restart: always
  image: alpine
  volumes:
    - /var/lib/postgresql
  command: "true"
小智 0
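A few possible causes:
Does this happen only when multiple users call your web API that invokes libreoffice? If so, you need to make sure that every concurrent libreoffice process has its own separate user installation directory. You can set a custom one with libreoffice -env:UserInstallation=file:///tmp/test.
If your model is to start one libreoffice process in advance, so that subsequent libreoffice processes merely forward the request to the already-running worker, which version of LibreOffice are you using? For example, the 6.1 line had a bug where we did not wait for the conversion result; see https://gerrit.libreoffice.org/#/c/66168/ for the fix. (The About dialog shows a version string and an exact git hash. So 6.1.5 already has this fix, but 6.1.4 does not.)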
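The first suggestion can be sketched as follows. This is a minimal illustration under assumptions, not the asker's code: the function names build_convert_command and convert_to_pdf are hypothetical, and each call gets a throwaway profile directory that is passed through the -env:UserInstallation option mentioned above.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(source, outdir, profile_dir):
    """Build the libreoffice argument list with a private user profile."""
    # -env:UserInstallation expects a file:// URI, so convert the path.
    profile_uri = Path(profile_dir).resolve().as_uri()
    return [
        "libreoffice",
        "-env:UserInstallation={}".format(profile_uri),
        "--headless",
        "--convert-to", "pdf",
        str(source),
        "--outdir", str(outdir),
    ]


def convert_to_pdf(source, outdir, timeout=120):
    """Convert one document using a profile directory unique to this call."""
    profile_dir = tempfile.mkdtemp(prefix="lo_profile_")
    try:
        completed = subprocess.run(
            build_convert_command(source, outdir, profile_dir),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout,
        )
        return completed.returncode, completed.stdout, completed.stderr
    finally:
        # Remove the temporary profile whether or not the conversion worked.
        shutil.rmtree(profile_dir, ignore_errors=True)
```

Creating a fresh profile per call costs some startup time, but it keeps concurrent Celery tasks from sharing one LibreOffice profile.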