tra*_*ant | Tags: python, amazon-s3, aws-lambda
I'm using the following code to read a large (>5 GB) file from S3 into a Lambda:
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    response = s3.get_object(
        Bucket="my-bucket",
        Key="my-key"
    )
    text_bytes = response['Body'].read()
    ...
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
But I get the following error:
"errorMessage": "signed integer is greater than maximum"
"errorType": "OverflowError"
"stackTrace": [
" File \"/var/task/lambda_function.py\", line 13, in lambda_handler\n text_bytes = response['Body'].read()\n"
" File \"/var/runtime/botocore/response.py\", line 77, in read\n chunk = self._raw_stream.read(amt)\n"
" File \"/var/runtime/urllib3/response.py\", line 515, in read\n data = self._fp.read() if not fp_closed else b\"\"\n"
" File \"/var/lang/lib/python3.8/http/client.py\", line 472, in read\n s = self._safe_read(self.length)\n"
" File \"/var/lang/lib/python3.8/http/client.py\", line 613, in _safe_read\n data = self.fp.read(amt)\n"
" File \"/var/lang/lib/python3.8/socket.py\", line 669, in readinto\n return self._sock.recv_into(b)\n"
" File \"/var/lang/lib/python3.8/ssl.py\", line 1241, in recv_into\n return self.read(nbytes, buffer)\n"
" File \"/var/lang/lib/python3.8/ssl.py\", line 1099, in read\n return self._sslobj.read(len, buffer)\n"
]
I'm using Python 3.8, and I found this Python 3.8/3.9 issue that may be the cause: https://bugs.python.org/issue42853
Is there a way to work around it?
As mentioned in the bug you linked, the core problem is that Python 3.8 cannot read more than 1 GB at a time over SSL. You can read the file in chunks, using a variant of the workaround suggested in the bug report.
import json
import boto3

s3 = boto3.client('s3')

def lambda_handler(event, context):
    response = s3.get_object(
        Bucket="-example-bucket-",
        Key="path/to/key.dat"
    )
    # Pre-allocate a buffer for the whole object, then fill it in
    # 64 MiB chunks so no single SSL read exceeds the 1 GB limit.
    buf = bytearray(response['ContentLength'])
    view = memoryview(buf)
    pos = 0
    while True:
        chunk = response['Body'].read(67108864)  # 64 MiB
        if len(chunk) == 0:
            break
        view[pos:pos + len(chunk)] = chunk
        pos += len(chunk)
    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
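The chunked-copy loop above can also be factored into a small, stream-agnostic helper, which makes the workaround easy to exercise without S3. This is a sketch of my own (the function name and default chunk size are not from the original answer); it works on `response['Body']` or any other file-like object:

```python
import io

def read_all_chunked(stream, total_length, chunk_size=64 * 1024 * 1024):
    """Read total_length bytes from a file-like object in chunks,
    so no single read() call exceeds Python 3.8's 1 GB SSL limit."""
    buf = bytearray(total_length)
    view = memoryview(buf)
    pos = 0
    while pos < total_length:
        chunk = stream.read(min(chunk_size, total_length - pos))
        if not chunk:
            break  # stream ended early
        view[pos:pos + len(chunk)] = chunk
        pos += len(chunk)
    return bytes(buf[:pos])

# Same pattern, demonstrated on an in-memory stream:
data = read_all_chunked(io.BytesIO(b"x" * 1000), 1000, chunk_size=64)
```

In the Lambda you would call it as `read_all_chunked(response['Body'], response['ContentLength'])`.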
However, even in the best case you will still spend a minute or more of every Lambda invocation just reading the data from S3. It would be better to store the file on EFS and read it from the Lambda there, or to use another solution such as ECS, so you avoid re-reading the file from a remote data source on every run.
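With an EFS access point mounted on the function, the file becomes ordinary local I/O; reading it in chunks also sidesteps the >1 GB single-read bug. A minimal sketch, assuming a hypothetical mount path `/mnt/data` configured on the function:

```python
def read_from_efs(path, chunk_size=64 * 1024 * 1024):
    """Read a file from a local (e.g. EFS-mounted) path in chunks,
    so no single read exceeds 1 GB."""
    chunks = []
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    return b''.join(chunks)

# In the handler (path is hypothetical):
# data = read_from_efs('/mnt/data/key.dat')
```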