Gam*_*fs2 26 python python-requests
I need to download a fairly large (~200MB) file. I've figured out how to download and save the file. It would be nice to have a progress bar showing how much has been downloaded. I found ProgressBar, but I'm not sure how to combine the two.

Here is the code I tried, but it doesn't work:
bar = progressbar.ProgressBar(max_value=progressbar.UnknownLength)
with closing(download_file()) as r:
    for i in range(20):
        bar.update(i)
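For reference, the core pattern the answers below all build on (read the response in fixed-size chunks, update a progress counter per chunk) can be sketched with only the standard library. This is an illustrative sketch, not one of the answers' code: `copy_with_progress` is a hypothetical name, and `io.BytesIO` stands in for the HTTP response stream and the output file.

```python
import io
import sys

def copy_with_progress(src, dst, total, block_size=1024):
    """Copy src to dst in block_size chunks, printing percent done."""
    done = 0
    while True:
        chunk = src.read(block_size)
        if not chunk:
            break
        dst.write(chunk)
        done += len(chunk)
        sys.stdout.write("\rdownloaded %d/%d bytes (%.0f%%)"
                         % (done, total, 100 * done / total))
    sys.stdout.write("\n")
    return done

payload = b"\x00" * 10_240      # pretend 10 KiB download
src = io.BytesIO(payload)       # stands in for the response stream
dst = io.BytesIO()              # stands in for the output file
n = copy_with_progress(src, dst, total=len(payload))
```

The real answers replace the `while` loop with `requests`' `iter_content()` and the percentage print with a progress-bar library's `update()` call, but the shape of the loop is the same.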
leo*_*ovp 62
I suggest you try tqdm[1]; it is very easy to use. Example code for downloading with the requests library[2]:
from tqdm import tqdm
import requests

url = "http://www.ovh.net/files/10Mb.dat"  # big file test

# Streaming, so we can iterate over the response.
r = requests.get(url, stream=True)

# Total size in bytes.
total_size = int(r.headers.get('content-length', 0))
block_size = 1024  # 1 Kibibyte
t = tqdm(total=total_size, unit='iB', unit_scale=True)
with open('test.dat', 'wb') as f:
    for data in r.iter_content(block_size):
        t.update(len(data))
        f.write(data)
t.close()
if total_size != 0 and t.n != total_size:
    print("ERROR, something went wrong")
[1]:https://github.com/tqdm/tqdm
[2]:http://docs.python-requests.org/en/master/
Mik*_*ike 31
The tqdm package now includes a function designed for exactly this kind of situation: wrapattr. You just wrap the object's read (or write) attribute and tqdm handles the rest; no fiddling with block sizes or anything like that. Here is a simple download function that puts it all together with requests:
def download(url, filename):
    import functools
    import pathlib
    import shutil
    import requests
    from tqdm.auto import tqdm

    r = requests.get(url, stream=True, allow_redirects=True)
    if r.status_code != 200:
        r.raise_for_status()  # Will only raise for 4xx codes, so...
        raise RuntimeError(f"Request to {url} returned status code {r.status_code}")
    file_size = int(r.headers.get('Content-Length', 0))

    path = pathlib.Path(filename).expanduser().resolve()
    path.parent.mkdir(parents=True, exist_ok=True)

    desc = "(Unknown total file size)" if file_size == 0 else ""
    r.raw.read = functools.partial(r.raw.read, decode_content=True)  # Decompress if needed
    with tqdm.wrapattr(r.raw, "read", total=file_size, desc=desc) as r_raw:
        with path.open("wb") as f:
            shutil.copyfileobj(r_raw, f)

    return path
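To see why no block-size bookkeeping is needed here, this is a rough stdlib-only picture of the idea behind wrapattr (note: `ReadProgress` is a hypothetical stand-in written for this sketch, not tqdm's actual implementation): the stream's read() is proxied by a wrapper that tallies bytes, so `shutil.copyfileobj` drives the progress display as a side effect of copying.

```python
import io
import shutil

class ReadProgress:
    """Hypothetical stand-in for tqdm.wrapattr: proxies read() and
    counts the bytes that flow through it."""
    def __init__(self, raw):
        self._raw = raw
        self.n = 0  # bytes seen so far (tqdm would render this as a bar)

    def read(self, size=-1):
        data = self._raw.read(size)
        self.n += len(data)
        return data

payload = b"abc" * 5000
src = ReadProgress(io.BytesIO(payload))  # wrap the "response" stream
dst = io.BytesIO()
shutil.copyfileobj(src, dst)             # copying updates src.n for free
print(src.n)                             # 15000
```

copyfileobj decides its own chunk size internally, which is why the wrapped version needs no explicit block_size at all.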
You can also use the Python library enlighten. It is powerful, provides colored progress bars, and works correctly on Linux and Windows.

Below is the code plus a live screencast. The code can be run on repl.it.
import math
import requests, enlighten

url = 'https://upload.wikimedia.org/wikipedia/commons/a/ae/Arthur_Streeton_-_Fire%27s_on_-_Google_Art_Project.jpg?download'
fname = 'image.jpg'

# Should be one global variable
MANAGER = enlighten.get_manager()

r = requests.get(url, stream=True)
assert r.status_code == 200, r.status_code
dlen = int(r.headers.get('Content-Length', '0')) or None

with MANAGER.counter(color='green', total=dlen and math.ceil(dlen / 2 ** 20),
                     unit='MiB', leave=False) as ctr, \
        open(fname, 'wb', buffering=2 ** 24) as f:
    for chunk in r.iter_content(chunk_size=2 ** 20):
        print(chunk[-16:].hex().upper())
        f.write(chunk)
        ctr.update()
Output (+ ascii-video)
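The counter total in the snippet above relies on a small idiom: `dlen` becomes None when the Content-Length header is missing or zero, and `dlen and math.ceil(...)` propagates that None so enlighten shows an open-ended counter; otherwise the total is the size rounded up to whole MiB. Sketched in isolation (`mib_ticks` is an illustrative name, not part of enlighten):

```python
import math

def mib_ticks(content_length):
    """Total for a MiB-granularity counter; None means size unknown."""
    dlen = int(content_length or '0') or None  # missing/zero header -> None
    return dlen and math.ceil(dlen / 2 ** 20)

print(mib_ticks('5600000'))  # 6: 5.34 MiB rounds up to 6 ticks
print(mib_ticks(None))       # None: no Content-Length header
```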
There seems to be a disconnect between the examples on the Progress Bar Usage page and what the code actually requires.

In the example below, note the use of maxval instead of max_value, and the use of .start() to initialize the bar. This has been pointed out in an issue.

The n_chunk parameter denotes how many 1024-byte blocks to stream at once while looping through the request iterator.
import requests
import time
import numpy as np
import progressbar

url = "http://wikipedia.com/"

def download_file(url, n_chunk=1):
    r = requests.get(url, stream=True)
    # Estimates the number of bar updates
    block_size = 1024
    file_size = int(r.headers.get('Content-Length', None))
    num_bars = np.ceil(file_size / (n_chunk * block_size))
    bar = progressbar.ProgressBar(maxval=num_bars).start()
    with open('test.html', 'wb') as f:
        for i, chunk in enumerate(r.iter_content(chunk_size=n_chunk * block_size)):
            f.write(chunk)
            bar.update(i + 1)
            # Add a little sleep so you can see the bar progress
            time.sleep(0.05)
    return

download_file(url)
Edit: addressed comments about code clarity.

Edit 2: fixed the logic so the bar reports 100% on completion. Credit to leovp's answer for the 1 KiB block size.
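The num_bars estimate above is just a ceiling division of the file size by the bytes consumed per iteration, so numpy is not strictly needed for it. A stdlib equivalent, assuming the same block_size and n_chunk meanings as the code above (`bar_count` is an illustrative name):

```python
import math

def bar_count(file_size, n_chunk=1, block_size=1024):
    """Number of iter_content() iterations, i.e. progress-bar updates."""
    return math.ceil(file_size / (n_chunk * block_size))

print(bar_count(10_000))             # 10: ceil(10000 / 1024)
print(bar_count(10_000, n_chunk=4))  # 3:  ceil(10000 / 4096)
```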
Views: 21951