Increase connection pool size

Mic*_*ael 7 google-cloud-python

We are running the following code to upload to GCP buckets in parallel. Judging by the warnings we see, we seem to be exhausting all the connections in the pool very quickly. Is there any way to configure the connection pool that the library uses?

def upload_string_to_bucket(content: str):
    blob = bucket.blob(cloud_path)
    blob.upload_from_string(content)

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(upload_string_to_bucket, content_list)
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
(the same warning is repeated for every discarded connection)
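For context, the warning comes from urllib3's per-host connection pool, which `requests` creates with a default size of 10; extra connections are simply discarded with the warning shown above. The general `requests`-level mechanism for enlarging the pool is mounting a larger `HTTPAdapter` on the session. A minimal sketch of that mechanism (the pool sizes here are arbitrary; reaching the session inside the google-cloud-storage client involves private attributes such as `client._http`, which is an internal detail and may change between versions):

```python
import requests
from requests.adapters import HTTPAdapter

# urllib3 keeps a pool of reusable connections per host (default 10).
# Mounting an HTTPAdapter with a larger pool on the "https://" prefix
# raises that limit for every HTTPS host this session talks to.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=64, pool_maxsize=64)
session.mount("https://", adapter)
```

With enough pool slots for every worker thread, connections are reused instead of discarded, and the warning disappears.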

小智 1

I ran into a similar problem when downloading blobs in parallel.

This article may shed some light: https://laike9m.com/blog/requests-secret-pool_connections-and-pool_maxsize,89/

Personally, I don't think increasing the connection pool is the best solution; I prefer chunking the "downloads" by pool_maxsize.

from typing import Iterable
import concurrent.futures

def chunker(it: Iterable, chunk_size: int):
    """Yield successive lists of at most chunk_size items from it."""
    chunk = []
    for index, item in enumerate(it):
        chunk.append(item)
        if not (index + 1) % chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk

for chunk in chunker(content_list, 10):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(upload_string_to_bucket, chunk)

Ideally, of course, we would start each new download as soon as a slot frees up, rather than waiting for a whole chunk to finish.
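One way to approximate that without chunking is to cap the executor's worker count at the pool size, so at most that many connections are ever in flight and each item starts as soon as a worker frees up. A minimal sketch, with a hypothetical `process` function standing in for `upload_string_to_bucket`:

```python
import concurrent.futures

def process(item: str) -> str:
    # Placeholder for the real upload; uppercases the input for demonstration.
    return item.upper()

items = ["alpha", "beta", "gamma"]

# urllib3's default per-host pool holds 10 connections, so with at most
# 10 worker threads no connection is ever discarded, and the executor
# hands each thread the next item the moment it finishes the last one.
with concurrent.futures.ThreadPoolExecutor(max_workers=10) as executor:
    results = list(executor.map(process, items))
```

This keeps all workers busy continuously, whereas chunking makes every chunk wait for its slowest member.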