How to POST a multipart list of JSON/XML files using Python requests

Hug*_*ans 1 python python-requests

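In Python 2.7, I am using requests to communicate with a REST endpoint. I can upload single JSON and XML objects to it. To speed things up, I want to upload multiple JSON objects using multipart.

I have a curl command that shows how to do it, and it works. I need to translate it into a Python requests POST call.

The working curl command: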

curl --anyauth --user admin:admin -X POST --data-binary \@sample-body \
     -i -H "Content-type: multipart/mixed; boundary=BOUNDARY" \
     "http://localhost:8058/v1/resources/sight-ingest?rs:transform=aireco-transform&rs:title=file1.xml&rs:title=file2.xml&rs:title=file3.xml"

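One caveat: I need to send a list of custom parameters, including a list of "title" parameters. That can't be done by passing a dictionary, can it? But we can work around that.

My Python attempt: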

import requests
files = {'file1': ('foo.txt', 'foo\ncontents\n','text/plain'), 
          'file2': ('bar.txt', 'bar contents', 'text/plain'),
          'file3': ('baz.txt', 'baz contents', 'text/plain')}

headers = {'Content-Type': 'multipart/mixed','Content-Disposition': 'attachment','boundary': 'GRENS'}
params={'title':'file1','title':'file2','title':'file2'}
r = requests.Request('POST', 'http://example.com', files=files , headers=headers, params=params)
print r.prepare().url
print r.prepare().headers
print r.prepare().body

This gives me:

http://example.com/?title=file2
{'boundary': 'GRENS', 'Content-Type': 'multipart/mixed', 'Content-Length': '471', 'Content-Disposition': 'attachment'}
--7f18a6c1b09f42009228f600b0af35fd
Content-Disposition: form-data; name="file3"; filename="baz.txt"
Content-Type: text/plain

baz contents
--7f18a6c1b09f42009228f600b0af35fd
Content-Disposition: form-data; name="file2"; filename="bar.txt"
Content-Type: text/plain

bar contents
--7f18a6c1b09f42009228f600b0af35fd
Content-Disposition: form-data; name="file1"; filename="foo.txt"
Content-Type: text/plain

foo
contents

--7f18a6c1b09f42009228f600b0af35fd--

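Questions:

  • The contents of the headers don't seem to be used in the body?
  • Can I set my own boundary? 'GRENS' isn't used in the body?
  • Can I pass a list of title parameters (with the same key), as in the curl example?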

Mar*_*ers 5

Don't use a dictionary; use a list of (key, value) tuples for the query parameters instead:

params = [('title', 'file1'), ('title', 'file2'), ('title', 'file3')]

Otherwise you end up with just one key, because a dictionary can hold only one value per key.
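The difference can be checked without sending anything by preparing the request and reading its URL. The following is a minimal sketch written for Python 3; the helper name `prepared_url` is made up for illustration. It also shows a dictionary whose value is a list, which requests expands into repeated keys in the same way.

```python
import requests

pairs = [('title', 'file1'), ('title', 'file2'), ('title', 'file3')]


def prepared_url(params):
    """Return the URL requests would use, without touching the network."""
    request = requests.Request('POST', 'http://example.com', params=params)
    return request.prepare().url


# A plain dict can hold only one value per key, so two titles are lost.
collapsed_url = prepared_url(dict(pairs))

# A list of tuples keeps every pair, in order.
repeated_url = prepared_url(pairs)

# A dict with a list value is expanded into repeated keys as well.
from_list_value_url = prepared_url({'title': ['file1', 'file2', 'file3']})

print(collapsed_url)
print(repeated_url)
print(from_list_value_url)
```

The list-of-tuples form is the more general one, since it also preserves the relative order of different keys.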

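You should not set the Content-Type header yourself; requests sets it correctly for you when you use the files parameter, and that way the correct boundary is included as well. You really should never set the boundary directly yourself: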

params = [('title', 'file1'), ('title', 'file2'), ('title', 'file3')]
r = requests.post('http://example.com', 
                  files=files, headers=headers, params=params)
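To see why a hand-set Content-Type is a problem, the two cases can be compared on prepared requests. This is a small sketch for Python 3 that sends nothing over the network; the variable names are illustrative only.

```python
import requests

files = {'file1': ('foo.txt', 'foo\ncontents\n', 'text/plain')}
url = 'http://example.com'

# Let requests choose the Content-Type: the boundary it generates is
# announced in the header and used as the delimiter in the body.
auto = requests.Request('POST', url, files=files).prepare()
auto_type = auto.headers['Content-Type']
boundary = auto_type.partition('boundary=')[2]
boundary_in_body = auto.body.startswith(b'--' + boundary.encode('ascii'))

# Set Content-Type by hand: the header is kept exactly as given, so the
# boundary that the body still uses is never announced to the server.
manual = requests.Request('POST', url, files=files,
                          headers={'Content-Type': 'multipart/mixed'}).prepare()
manual_type = manual.headers['Content-Type']

print(auto_type)
print(boundary_in_body)
print(manual_type)
```

In the second case the body is still delimited by a generated boundary, but the header no longer names it, so a server has no reliable way to split the parts.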

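You can set headers for each file part by adding a fourth element, a dictionary of extra headers, to each per-file tuple. In your case, though, you should not try to set the Content-Disposition header yourself; it will be overwritten anyway.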
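As a rough illustration of the fourth tuple element, the sketch below (Python 3) attaches an extra header to one part and inspects the prepared body. The header name `X-Title` and the XML payload are hypothetical, chosen only for the example.

```python
import requests

# Hypothetical per-part headers; 'X-Title' is not a standard header.
extra_headers = {'X-Title': 'file1.xml', 'Content-Disposition': 'attachment'}
files = {'file1': ('file1.xml', '<doc/>', 'application/xml', extra_headers)}

prepared = requests.Request('POST', 'http://example.com', files=files).prepare()
body = prepared.body.decode('utf-8')

# The custom header is emitted for the part, while the Content-Disposition
# supplied above is replaced by the form-data one that requests generates.
print(body)
```

The printed part should carry the `X-Title` line next to a generated `Content-Disposition: form-data; ...` line, not the `attachment` value that was passed in.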

Introspecting the prepared request object gives:

>>> import requests
>>> from pprint import pprint
>>> files = {'file1': ('foo.txt', 'foo\ncontents\n','text/plain'), 
...           'file2': ('bar.txt', 'bar contents', 'text/plain'),
...           'file3': ('baz.txt', 'baz contents', 'text/plain')}
>>> headers = {'Content-Disposition': 'attachment'}
>>> params = [('title', 'file1'), ('title', 'file2'), ('title', 'file3')]
>>> r = requests.Request('POST', 'http://example.com',
...                      files=files, headers=headers, params=params)
>>> prepared = r.prepare()
>>> prepared.url
'http://example.com/?title=file1&title=file2&title=file3'
>>> pprint(dict(prepared.headers))
{'Content-Disposition': 'attachment',
 'Content-Length': '471',
 'Content-Type': 'multipart/form-data; boundary=7312ccd96db94419bf1d97f2c54bbad1'}
>>> print prepared.body
--7312ccd96db94419bf1d97f2c54bbad1
Content-Disposition: form-data; name="file3"; filename="baz.txt"
Content-Type: text/plain

baz contents
--7312ccd96db94419bf1d97f2c54bbad1
Content-Disposition: form-data; name="file2"; filename="bar.txt"
Content-Type: text/plain

bar contents
--7312ccd96db94419bf1d97f2c54bbad1
Content-Disposition: form-data; name="file1"; filename="foo.txt"
Content-Type: text/plain

foo
contents

--7312ccd96db94419bf1d97f2c54bbad1--

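If you absolutely must have multipart/mixed rather than multipart/form-data, you will have to build the POST body yourself and set the headers from it. The bundled urllib3 tools should be able to do this for you: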

from requests.packages.urllib3.fields import RequestField
from requests.packages.urllib3.filepost import encode_multipart_formdata

fields = []    
for name, (filename, contents, mimetype) in files.items():
    rf = RequestField(name=name, data=contents,
                      filename=filename)
    rf.make_multipart(content_disposition='attachment', content_type=mimetype)
    fields.append(rf)

post_body, content_type = encode_multipart_formdata(fields)
content_type = ''.join(('multipart/mixed',) + content_type.partition(';')[1:])

headers = {'Content-Type': content_type}
requests.post('http://example.com', data=post_body, headers=headers, params=params)
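If the server really does need a fixed boundary, such as the BOUNDARY used in the curl command, `encode_multipart_formdata` accepts a `boundary` argument. The sketch below (Python 3, importing urllib3 directly rather than through `requests.packages`) assumes the chosen string never occurs inside any part; the file names and XML contents are placeholders.

```python
from urllib3.fields import RequestField
from urllib3.filepost import encode_multipart_formdata

# Placeholder documents standing in for the real uploads.
files = {'file1': ('file1.xml', '<doc>1</doc>', 'application/xml'),
         'file2': ('file2.xml', '<doc>2</doc>', 'application/xml')}

fields = []
for name, (filename, contents, mimetype) in files.items():
    rf = RequestField(name=name, data=contents, filename=filename)
    rf.make_multipart(content_disposition='attachment', content_type=mimetype)
    fields.append(rf)

# Pass the boundary explicitly instead of letting urllib3 generate one.
post_body, content_type = encode_multipart_formdata(fields, boundary='BOUNDARY')

# Swap only the media type; the boundary parameter is kept as returned.
content_type = content_type.replace('multipart/form-data', 'multipart/mixed', 1)

print(content_type)
print(post_body.decode('utf-8'))
```

A generated boundary is still the safer default, because a fixed string can collide with the payload.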

Or you could use the email package to do the same:

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

body = MIMEMultipart()
for name, (filename, contents, mimetype) in files.items():
    part = MIMEText(contents, _subtype=mimetype.partition('/')[-1], _charset='utf8')
    part.add_header('Content-Disposition', 'attachment', filename=filename)
    body.attach(part)

post_body = body.as_string().partition('\n\n')[-1]
content_type = body['content-type']

headers = {'Content-Type': content_type}
requests.post('http://example.com', data=post_body, headers=headers, params=params)

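But take into account that this method requires you to set a character set (I assumed UTF-8 for JSON and XML), and that it will most likely use Base64 encoding for the contents: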

>>> body = MIMEMultipart()
>>> for name, (filename, contents, mimetype) in files.items():
...     part = MIMEText(contents, _subtype=mimetype.partition('/')[-1], _charset='utf8')
...     part.add_header('Content-Disposition', 'attachment', filename=filename)
...     body.attach(part)
... 
>>> post_body = body.as_string().partition('\n\n')[-1]
>>> content_type = body['content-type']
>>> print post_body
--===============1364782689914852112==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="baz.txt"

YmF6IGNvbnRlbnRz

--===============1364782689914852112==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="bar.txt"

YmFyIGNvbnRlbnRz

--===============1364782689914852112==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="foo.txt"

Zm9vCmNvbnRlbnRzCg==

--===============1364782689914852112==--

>>> print content_type
multipart/mixed; boundary="===============1364782689914852112=="
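If Base64 in the parts is unwanted, one alternative is to assemble the multipart/mixed body by hand from raw bytes. The following is only a sketch using the standard library (Python 3); the function name `build_multipart_mixed`, the sample documents, and the choice to add a `charset=utf-8` parameter to every part are assumptions, and whether a given server accepts this layout would need to be checked.

```python
import uuid


def build_multipart_mixed(parts, boundary=None):
    """Build a multipart/mixed body from (filename, payload_bytes, mimetype).

    Returns (body_bytes, content_type_header_value). Payloads are written
    as-is, without any Content-Transfer-Encoding.
    """
    boundary = boundary or uuid.uuid4().hex
    delimiter = ('--' + boundary).encode('ascii')
    lines = []
    for filename, payload, mimetype in parts:
        if delimiter in payload:
            raise ValueError('boundary occurs inside part %r' % filename)
        lines.append(delimiter)
        lines.append('Content-Type: {}; charset=utf-8'.format(mimetype).encode('ascii'))
        lines.append('Content-Disposition: attachment; filename="{}"'
                     .format(filename).encode('utf-8'))
        lines.append(b'')          # blank line separates headers from payload
        lines.append(payload)
    lines.append(delimiter + b'--')  # closing delimiter
    lines.append(b'')                # trailing CRLF
    return b'\r\n'.join(lines), 'multipart/mixed; boundary=' + boundary


# Hypothetical sample documents, encoded to UTF-8 bytes up front.
parts = [
    ('file1.json', '{"title": "caf\u00e9"}'.encode('utf-8'), 'application/json'),
    ('file2.xml', '<doc>two</doc>'.encode('utf-8'), 'application/xml'),
]
body, content_type = build_multipart_mixed(parts, boundary='BOUNDARY')
print(content_type)
print(body.decode('utf-8'))
```

The result can then be sent in the same way as the other variants, with `data=body`, `headers={'Content-Type': content_type}` and the `params` list of tuples.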