S3 connection timeout when using boto3

Pyt*_*ice 15 python amazon-s3 amazon-web-services boto3

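I'm using boto3 to work with S3. If my application can't reach S3 because of a network problem, the connection hangs until it eventually times out. I'd like to set a lower connection timeout. I came across this botocore PR that allows setting timeouts: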

$ sudo iptables -A OUTPUT -p tcp --dport 443 -j DROP

from botocore.client import Config
import boto3

config = Config(connect_timeout=5, read_timeout=5)

s3 = boto3.client('s3', config=config)

s3.head_bucket(Bucket='my-s3-bucket') 

This raises a ConnectTimeout, but it still takes a long time before the error is reported:

ConnectTimeout: HTTPSConnectionPool(host='my-s3-bucket.s3.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPSConnection object at 0x2ad5dd0>, 'Connection to my-s3-bucket.s3.amazonaws.com timed out. (connect timeout=5)'))

Adjusting the connect and read timeouts has no effect on how quickly the call fails.
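To put a number on "a long time", it can help to time the failing call directly. The sketch below is one way to do that. `time_call` and `fake_timeout` are hypothetical names, not part of boto3, and `fake_timeout` only simulates a slow failure so the example runs without network access.

```python
import time


def time_call(fn):
    """Run fn() and return (error, elapsed_seconds); error is None on success."""
    start = time.monotonic()
    try:
        fn()
        error = None
    except Exception as exc:  # broad on purpose: we only measure how long failure takes
        error = exc
    return error, time.monotonic() - start


def fake_timeout():
    # Stand-in for a real S3 call; it sleeps briefly and then fails.
    time.sleep(0.1)
    raise TimeoutError("simulated connect timeout")


error, elapsed = time_call(fake_timeout)
print(type(error).__name__, round(elapsed, 1))
```

With a real client, `time_call(lambda: s3.head_bucket(Bucket='my-s3-bucket'))` would show whether the delay is close to one `connect_timeout` or to several of them.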

llu*_*ude 18

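You are probably being bitten by boto3's default behavior of retrying the connection several times, with exponential backoff between attempts. I've had good results with the following: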

from botocore.client import Config
import boto3

config = Config(connect_timeout=5, retries={'max_attempts': 0})
s3 = boto3.client('s3', config=config)

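  • Adding the error text so that people can search for it: botocore.exceptions.ClientError: An error occurred (InternalError) when calling the GetObject operation (reached max retries: 4): We encountered an internal error. Please try again. (4 upvotes)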
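A rough calculation shows why retries dominate the delay. The sketch below assumes a simplified model, not botocore's exact schedule: every attempt waits the full connect timeout, and the pause before retry k is at most `backoff_base * 2 ** (k - 1)` seconds. `worst_case_seconds` and `backoff_base` are hypothetical names introduced for illustration.

```python
def worst_case_seconds(connect_timeout, max_retries, backoff_base=1.0):
    """Rough upper bound on the time before a connect failure is reported.

    Assumed model (not botocore's exact schedule): each attempt waits the
    full connect timeout, and the pause before retry k is at most
    backoff_base * 2 ** (k - 1) seconds.
    """
    attempts = max_retries + 1  # the initial request plus each retry
    waiting = attempts * connect_timeout
    backoff = sum(backoff_base * 2 ** (k - 1) for k in range(1, max_retries + 1))
    return waiting + backoff


print(worst_case_seconds(5, 4))  # four retries, as in the "max retries: 4" error text
print(worst_case_seconds(5, 0))  # retries disabled, as in the answer above
```

Under these assumptions, a 5-second connect timeout with four retries can take several times longer to fail than the same timeout with `max_attempts` set to 0.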

EM *_*Bee 2

Did you ever solve this? I suspect you need credentials for the boto connection.

Here is how I connect with boto3:

import boto3
from botocore.exceptions import ClientError
import re
from io import BytesIO
import gzip
import dateutil.parser as dparser
from datetime import datetime
import tarfile
import requests
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

## Needed glue stuff
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)

## currently this will run for everything that is in the staging directory of omniture

# set needed parms
myProfileName = 'MyDataLake'
dhiBucket = 'data-lake'
# create boto3 session
try:
    # placeholder credentials for illustration; region_name is given exactly once
    session = boto3.Session(aws_access_key_id='aaaaaaaaaaaa', aws_secret_access_key='abcdefghijklmnopqrstuvwxyz', aws_session_token=None, region_name='us-east-1', botocore_session=None)
    s3 = session.resource('s3')  # create the S3 resource
except Exception as conne:
    print("Unable to connect:  " + str(conne))
    # report the failure to the error capture endpoint
    errtxt = requests.post("https://errorcapturesite", data={'message': 'Unable to connect to : ' + myProfileName, 'notify': True, 'color': 'red'})
    print(errtxt.text)
    sys.exit(1)

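  • By the way, storing "aws_access_key_id" and "aws_secret_access_key" in your code is a very bad idea. I would definitely recommend storing your credentials as environment variables or in a local "~/.aws/credentials" file. For details, see http://boto3.readthedocs.io/en/latest/guide/configuration.html (22 upvotes)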
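The advice in this comment can be sketched as follows. When no keys are passed explicitly, boto3 looks for `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in the environment or for a profile in `~/.aws/credentials`. `missing_aws_env` and `make_s3_resource` are hypothetical helper names, and the timeout and retry values are simply the ones used earlier in this thread. Treat it as a minimal sketch, not the answerer's actual setup.

```python
import os

REQUIRED_ENV = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_env(env=None):
    """Return the names of the credential variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV if not env.get(name)]


def make_s3_resource(region_name="us-east-1"):
    """Build an S3 resource without hard-coded keys.

    boto3 is imported here so that missing_aws_env stays usable without it.
    """
    import boto3
    from botocore.client import Config

    config = Config(connect_timeout=5, read_timeout=5, retries={"max_attempts": 0})
    # No keys are passed: boto3 resolves them from the environment or ~/.aws/credentials.
    session = boto3.Session(region_name=region_name)
    return session.resource("s3", config=config)


if __name__ == "__main__":
    missing = missing_aws_env()
    if missing:
        print("Not set in the environment:", ", ".join(missing))
```

An empty result from `missing_aws_env()` only says the environment variables are present. Credentials may still come from `~/.aws/credentials` when they are absent.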