Mic*_*ael 7 google-cloud-python
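We are running the following code to upload to GCP Buckets in parallel. Judging by the warnings we see, we seem to be quickly exhausting all the connections in the pool. Is there a way to configure the connection pool the library uses?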
import concurrent.futures

def upload_string_to_bucket(content: str):
    # bucket and cloud_path are defined elsewhere in our code
    blob = bucket.blob(cloud_path)
    blob.upload_from_string(content)

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.map(upload_string_to_bucket, content_list)
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: www.googleapis.com
小智 1
I ran into a similar problem when downloading blobs in parallel.
This article may be informative: https://laike9m.com/blog/requests-secret-pool_connections-and-pool_maxsize,89/
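For reference, the two parameters the article discusses are arguments of `requests.adapters.HTTPAdapter`. The following is a minimal sketch, assuming the `requests` package is installed. The value 32 is arbitrary and chosen only for illustration, and how (or whether) such an adapter can be attached to the session used by google-cloud-storage depends on the library version and is not shown here.

```python
import requests
from requests.adapters import HTTPAdapter

# Illustrative value only: roughly the number of threads sharing the session.
POOL_SIZE = 32

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=POOL_SIZE,  # number of per-host pools to cache
    pool_maxsize=POOL_SIZE,      # connections kept per host pool (requests default: 10)
    pool_block=False,            # False: extra connections are opened, then discarded
)
# Route every https:// request made through this session via the adapter.
session.mount("https://", adapter)
```

With `pool_block=False`, exceeding `pool_maxsize` does not fail requests; the surplus connections are simply thrown away after use, which is what the "Connection pool is full, discarding connection" warning reports.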
Personally, I don't think increasing the connection pool is the best solution; I prefer to chunk the "downloads" by pool_maxsize.
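import concurrent.futures
from typing import Iterable

def chunker(it: Iterable, chunk_size: int):
    # Yield successive lists of at most chunk_size items
    chunk = []
    for index, item in enumerate(it):
        chunk.append(item)
        if not (index + 1) % chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

# 10 is the default pool_maxsize in requests; each batch finishes before the next starts
for chunk in chunker(content_list, 10):
    with concurrent.futures.ThreadPoolExecutor() as executor:
        executor.map(upload_string_to_bucket, chunk)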
Of course, we could instead spawn each download as soon as the previous one is done; that is entirely up to us.
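One way to get that behaviour without batching is to cap the executor's worker count, so that a new task starts as soon as a worker becomes free. The following is a minimal sketch using only the standard library. `fake_upload` is a hypothetical stand-in for `upload_string_to_bucket`, and the cap of 10 assumes the default `pool_maxsize` of requests.

```python
import concurrent.futures
import threading
import time

MAX_WORKERS = 10  # assumed to match the connection pool size

_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of simultaneous calls observed


def fake_upload(content: str) -> int:
    """Hypothetical stand-in for upload_string_to_bucket; returns len(content)."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # simulate network latency
    with _lock:
        _active -= 1
    return len(content)


content_list = [f"item-{i}" for i in range(50)]

with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
    # list() consumes the iterator so exceptions raised in workers surface here
    results = list(executor.map(fake_upload, content_list))
```

Since at most `MAX_WORKERS` calls run at once, the number of connections in use at any moment should stay within the pool size, and no thread sits idle waiting for a whole batch to finish.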