Writing a file to S3 using Lambda in Python with AWS

Question

tsk*_*les 4 python amazon-s3 amazon-web-services aws-lambda

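In AWS, I am trying to save a file to S3 from Python using a Lambda function. This works on my local computer, but I cannot get it to work in Lambda. I have been working on this for most of the day and would appreciate any help. Thank you.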

import boto3
import requests


def pdfToTable(PDFfilename, apiKey, fileExt, bucket, key):

    # parsing a PDF using an API
    fileData = (PDFfilename, open(PDFfilename, "rb"))
    files = {"f": fileData}
    postUrl = "https://pdftables.com/api?key={0}&format={1}".format(apiKey, fileExt)
    response = requests.post(postUrl, files=files)
    response.raise_for_status()

    # this code is probably the problem!
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('transportation.manifests.parsed')
    with open('/tmp/output2.csv', 'rb') as data:
        data.write(response.content)
        key = 'csv/' + key
        bucket.upload_fileobj(data, key)

    # FYI, on my own computer, this saves the file
    with open('output.csv', "wb") as f:
        f.write(response.content)

In S3, there is a bucket named transportation.manifests.parsed that contains the folder csv, where the file should be saved.

The type of response.content is bytes.

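In AWS, the error with the current setup above is [Errno 2] No such file or directory: '/tmp/output2.csv': FileNotFoundError. My actual goal is to save the file to the csv folder under a unique name, so /tmp/output2.csv may not be the best approach. Any guidance?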

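I have also tried using wb and w instead of rb, to no avail. The error with wb is Input <_io.BufferedWriter name='/tmp/output2.csv'> of type: <class '_io.BufferedWriter'> is not supported. The documentation suggests using 'rb', but I do not understand why that would be the case.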
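
One plausible reading of the two errors: opening with 'rb' fails because nothing has created /tmp/output2.csv yet, while a handle opened with 'wb' cannot be read from, and upload_fileobj needs a readable binary file object. A minimal sketch, assuming the file is first written and then reopened for reading. save_then_upload is a hypothetical helper, and s3_client stands for a boto3 S3 client passed in by the caller:

```python
import os
import tempfile


def save_then_upload(s3_client, content, bucket_name, key):
    """Write bytes to a temporary file, then reopen it for reading and upload it."""
    # mkstemp creates a uniquely named file in the system temp directory.
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        # Step 1: write mode creates the file and stores the bytes.
        with os.fdopen(fd, "wb") as out:
            out.write(content)
        # Step 2: read mode gives upload_fileobj something it can read from.
        with open(path, "rb") as data:
            s3_client.upload_fileobj(data, bucket_name, key)
    finally:
        # Clean up the scratch file whether or not the upload succeeded.
        os.remove(path)
    return key
```

It could be called as save_then_upload(boto3.client('s3'), response.content, 'transportation.manifests.parsed', 'csv/' + key).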

I have also tried s3_client.put_object(Key=key, Body=response.content, Bucket=bucket), but received An error occurred (404) when calling the HeadObject operation: Not Found.

Answer 1

abi*_*son 6

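Assuming Python 3.6. The way I usually do this is to wrap the bytes content in a BytesIO wrapper to create a file-like object. And, per the boto3 docs, you can use the transfer manager for a managed transfer: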

from io import BytesIO
import boto3
s3 = boto3.client('s3')

# response is the requests response obtained in the question's code
fileobj = BytesIO(response.content)

s3.upload_fileobj(fileobj, 'mybucket', 'mykey')

If that doesn't work, I'd double-check that all IAM permissions are correct.
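
Since the question also asks for a unique name inside the csv folder, the BytesIO approach above could be sketched roughly as follows. The helper names make_unique_key and upload_bytes and the UUID-based naming scheme are assumptions for illustration, not part of the original answer. The client is passed in as an argument so the function can be exercised without AWS credentials:

```python
import uuid
from io import BytesIO


def make_unique_key(prefix="csv/", ext=".csv"):
    """Build a key like 'csv/<32 hex chars>.csv' (naming scheme is an assumption)."""
    return "{0}{1}{2}".format(prefix, uuid.uuid4().hex, ext)


def upload_bytes(s3_client, content, bucket_name, key):
    """Upload in-memory bytes without writing to the local filesystem."""
    # BytesIO exposes the bytes through a readable, file-like interface.
    fileobj = BytesIO(content)
    s3_client.upload_fileobj(fileobj, bucket_name, key)
    return key
```

In the Lambda handler this might be used as upload_bytes(boto3.client('s3'), response.content, 'transportation.manifests.parsed', make_unique_key()). Because nothing is written to /tmp, the file-mode errors from the question do not come into play.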

Archived: 8 years, 2 months ago
Views: 23371
Last recorded: 7 years, 8 months ago