Progress of Python requests post

Rob*_*bie 12 python python-requests

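I am uploading a large file using the Python requests package, and I can't find any way to get data back about the progress of the upload. I have seen a number of progress meters for downloading a file, but these do not work for a file upload.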

The ideal solution would be some sort of callback method, such as:

def progress(percent):
    print(percent)

# Wished-for API: requests.post() has no `callback` parameter
r = requests.post(URL, files={'f': hugeFileHandle}, callback=progress)

Thanks in advance for your help :)

jfs*_*jfs 12

requests doesn't support upload streaming (at the time this answer was written), e.g.:

import os
import sys
import requests  # pip install requests

class upload_in_chunks(object):
    def __init__(self, filename, chunksize=1 << 13):
        self.filename = filename
        self.chunksize = chunksize
        self.totalsize = os.path.getsize(filename)
        self.readsofar = 0

    def __iter__(self):
        with open(self.filename, 'rb') as file:
            while True:
                data = file.read(self.chunksize)
                if not data:
                    sys.stderr.write("\n")
                    break
                self.readsofar += len(data)
                percent = self.readsofar * 1e2 / self.totalsize
                sys.stderr.write("\r{percent:3.0f}%".format(percent=percent))
                yield data

    def __len__(self):
        return self.totalsize

# XXX fails
r = requests.post("http://httpbin.org/post",
                  data=upload_in_chunks(__file__, chunksize=10))

As a side note, if you don't need to report progress, you can use a memory-mapped file to upload a large file.
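A minimal sketch of the memory-mapped approach, for reference. The function name `upload_mmap` and its `post` parameter are made up for this illustration; `post` is only there so the sketch can be tried without a network and defaults to `requests.post`. It assumes the server accepts the file as the raw request body.

```python
import mmap
import os
import tempfile


def upload_mmap(url, path, post=None):
    """POST the file at `path` as the request body through a read-only memory map.

    `post` is a hypothetical hook for this sketch; it defaults to requests.post.
    """
    if post is None:
        import requests  # third-party (pip install requests), imported lazily

        post = requests.post
    with open(path, "rb") as f:
        # Length 0 maps the whole file; mmap raises ValueError for an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return post(url, data=mapped)


# Offline demonstration with a stand-in for requests.post.
def fake_post(url, data):
    return {"url": url, "length": len(data), "body": data.read()}


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello upload")
try:
    result = upload_mmap("http://example.invalid/post", tmp.name, post=fake_post)
finally:
    os.unlink(tmp.name)
```

The map exposes both a length and a `read()` method, so the body can be sent without first copying the whole file into a Python bytes object.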

To work around the lack of streaming support, you can create a file adapter similar to the one from urllib2 POST progress monitoring:

class IterableToFileAdapter(object):
    def __init__(self, iterable):
        self.iterator = iter(iterable)
        self.length = len(iterable)

    def read(self, size=-1): # TBD: add buffer for `len(data) > size` case
        return next(self.iterator, b'')

    def __len__(self):
        return self.length

it = upload_in_chunks(__file__, 10)
r = requests.post("http://httpbin.org/post", data=IterableToFileAdapter(it))

# pretty print
import json
json.dump(r.json(), sys.stdout, indent=4, ensure_ascii=False)  # r.json is a method in current requests
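The `# TBD` note in `IterableToFileAdapter.read` points at a gap: a chunk from the iterator can be longer than the requested `size`. One possible way to fill it is sketched below. The class name `BufferedIterableReader` and its explicit `length` argument are assumptions for this example, not part of the answer above.

```python
class BufferedIterableReader:
    """File-like view over an iterable of bytes chunks.

    read(size) never returns more than `size` bytes; leftovers are buffered.
    """

    def __init__(self, iterable, length):
        self._iterator = iter(iterable)
        self._length = length
        self._buffer = b""

    def read(self, size=-1):
        if size is None or size < 0:
            # Drain whatever is buffered plus everything left in the source.
            data = self._buffer + b"".join(self._iterator)
            self._buffer = b""
            return data
        # Pull chunks until the request can be satisfied or the source ends.
        while len(self._buffer) < size:
            chunk = next(self._iterator, None)
            if chunk is None:
                break
            self._buffer += chunk
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data

    def __len__(self):
        return self._length


# Offline demonstration with uneven chunks.
reader = BufferedIterableReader([b"abc", b"defg", b"h"], length=8)
first = reader.read(2)
second = reader.read(4)
rest = reader.read()
done = reader.read(3)
```

Any progress reporting still lives in the wrapped iterable (for example `upload_in_chunks`), so the reader only has to take care of slicing.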

  • kennethreitz commented on Jan 10, 2013: Done. https://github.com/kennethreitz/requests/issues/952 (2 upvotes)

der*_*och 11

I got the code from here: Simple file upload progressbar in PyQt. I changed it a bit to use BytesIO instead of StringIO.

from io import BytesIO

import requests


class CancelledError(Exception):
    def __init__(self, msg):
        self.msg = msg
        Exception.__init__(self, msg)

    def __str__(self):
        return self.msg

    __repr__ = __str__

class BufferReader(BytesIO):
    def __init__(self, buf=b'',
                 callback=None,
                 cb_args=(),
                 cb_kwargs=None):
        self._callback = callback
        self._cb_args = cb_args
        # Copy so that instances never share one mutable default dict
        self._cb_kwargs = dict(cb_kwargs or {})
        self._progress = 0
        self._len = len(buf)
        BytesIO.__init__(self, buf)

    def __len__(self):
        return self._len

    def read(self, n=-1):
        chunk = BytesIO.read(self, n)
        self._progress += int(len(chunk))
        self._cb_kwargs.update({
            'size'    : self._len,
            'progress': self._progress
        })
        if self._callback:
            try:
                self._callback(*self._cb_args, **self._cb_kwargs)
            except Exception:  # any error raised by the callback cancels the upload
                raise CancelledError('The upload was cancelled.')
        return chunk


def progress(size=None, progress=None):
    # Print "bytes read so far / total bytes"
    print("{0} / {1}".format(progress, size))


# Read the whole file into memory and close the handle right away
with open("file.bin", 'rb') as f:
    files = {"upfile": ("file.bin", f.read())}

(data, ctype) = requests.packages.urllib3.filepost.encode_multipart_formdata(files)

headers = {
    "Content-Type": ctype
}

body = BufferReader(data, progress)
# `url` is the upload endpoint and must be defined beforehand
requests.post(url, data=body, headers=headers)

The trick is to generate the data and the header from the file list manually, using encode_multipart_formdata() from urllib3.
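To make "the data and the header" concrete, here is a rough stdlib-only sketch of what such an encoder produces for a single file. It is not urllib3's implementation; the function name `encode_single_file` is made up for this example, and it does no escaping of quotes in field or file names.

```python
import uuid


def encode_single_file(field, filename, payload,
                       content_type="application/octet-stream"):
    """Build a multipart/form-data body holding one file part.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        "--{b}\r\n"
        'Content-Disposition: form-data; name="{n}"; filename="{f}"\r\n'
        "Content-Type: {t}\r\n"
        "\r\n"
    ).format(b=boundary, n=field, f=filename, t=content_type)
    # The closing delimiter carries two extra dashes after the boundary.
    tail = "\r\n--{b}--\r\n".format(b=boundary)
    body = head.encode("utf-8") + payload + tail.encode("utf-8")
    return body, "multipart/form-data; boundary={b}".format(b=boundary)


body, ctype = encode_single_file("upfile", "file.bin", b"\x00\x01binary data")
```

Because the body is plain bytes whose length is known up front, it can be wrapped in a reader such as `BufferReader`, and every `read()` then maps to a progress update.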


Anonymous 10

I recommend using a toolkit named requests-toolbelt, which makes it very easy to monitor the bytes uploaded:

from requests_toolbelt import MultipartEncoder, MultipartEncoderMonitor
import requests

def my_callback(monitor):
    # Called as the body is read; bytes_read is the running total
    print(monitor.bytes_read)

e = MultipartEncoder(
    fields={'field0': 'value', 'field1': 'value',
            'field2': ('filename', open('file.py', 'rb'), 'text/plain')}
    )
m = MultipartEncoderMonitor(e, my_callback)

r = requests.post('http://httpbin.org/post', data=m,
                  headers={'Content-Type': m.content_type})

You may want to read this to show a progress bar.
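As a starting point for a progress display, a callback can turn `monitor.bytes_read` into a percentage. The sketch below assumes the monitor also exposes the total encoded size as `monitor.len`; the helper name `make_percent_callback` is made up for this example, and a stand-in object replaces the real monitor so it runs without requests-toolbelt installed.

```python
from types import SimpleNamespace


def make_percent_callback(report):
    """Return a monitor callback that reports each whole percent only once."""
    last = -1

    def callback(monitor):
        nonlocal last
        # bytes_read counts the encoded multipart body, not only the file bytes.
        total = monitor.len
        percent = 100 if total == 0 else monitor.bytes_read * 100 // total
        if percent != last:
            last = percent
            report(percent)

    return callback


# Offline demonstration with a stand-in object instead of a real monitor.
seen = []
cb = make_percent_callback(seen.append)
fake_monitor = SimpleNamespace(bytes_read=0, len=200)
for read_so_far in (50, 50, 100, 200):
    fake_monitor.bytes_read = read_so_far
    cb(fake_monitor)
```

With the real library, the callback would be passed as `MultipartEncoderMonitor(e, make_percent_callback(print))`.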


Anonymous 7

I know this is an old question, but I couldn't find a simple answer anywhere else, so hopefully this will help somebody else:

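import requests
from tqdm import tqdm  # the tqdm module itself is not callable

# file_name and url must be defined beforehand
with open(file_name, 'rb') as f:
    # readlines() loads the whole file into memory; progress is counted in lines
    r = requests.post(url, data=tqdm(f.readlines()))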
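If holding the whole file in memory is a concern, a variation is to read fixed-size chunks and report each chunk's size. This is only a sketch: the helper `iter_chunks` and its `on_chunk` parameter are hypothetical names, and the demonstration uses an in-memory buffer and a list in place of a real file and progress bar.

```python
import io


def iter_chunks(fileobj, chunk_size=8192, on_chunk=None):
    """Yield successive chunks from a binary file object.

    `on_chunk`, if given, is called with the size of each chunk before it is yielded.
    """
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        if on_chunk is not None:
            on_chunk(len(chunk))
        yield chunk


# Offline demonstration: 20 bytes read 8 at a time.
sizes = []
chunks = list(iter_chunks(io.BytesIO(b"x" * 20), chunk_size=8, on_chunk=sizes.append))
```

With tqdm, `on_chunk=bar.update` on a bar created with `total=os.path.getsize(file_name)` would advance the bar in bytes. Note that requests sends a generator body using chunked transfer encoding, which not every server accepts.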


Gle*_*son 6

This solution uses requests_toolbelt and tqdm, both well-maintained and popular libraries.

from pathlib import Path
from tqdm import tqdm

import requests
from requests_toolbelt import MultipartEncoder, MultipartEncoderMonitor

def upload_file(upload_url, fields, filepath):

    path = Path(filepath)
    total_size = path.stat().st_size
    filename = path.name

    with tqdm(
        desc=filename,
        total=total_size,
        unit="B",
        unit_scale=True,
        unit_divisor=1024,
    ) as bar:
        with open(filepath, "rb") as f:
            # Send the real file name rather than the literal string "filename"
            fields["file"] = (filename, f)
            e = MultipartEncoder(fields=fields)
            m = MultipartEncoderMonitor(
                e, lambda monitor: bar.update(monitor.bytes_read - bar.n)
            )
            headers = {"Content-Type": m.content_type}
            requests.post(upload_url, data=m, headers=headers)

Usage example

upload_url = 'https://uploadurl'
fields = {
  "field1": value1, 
  "field2": value2
}
filepath = '97a6fce8_owners_2018_Van Zandt.csv'

upload_file(upload_url, fields, filepath)
