Ren*_*evi 3 python amazon-s3 boto amazon-web-services
Is it possible to loop through the files/keys in an Amazon S3 bucket, read their contents with Python, and count the number of lines?
For example:
1. My bucket: "my-bucket-name"
2. File/Key : "test.txt"
I need to loop through the file "test.txt" and count the number of lines in the raw file.
Sample code:
import boto
conn = boto.connect_s3()
for bucket in conn.get_all_buckets():
    if bucket.name == "my-bucket-name":
        for file in bucket.list():
            # need to count the number of lines in each file and print to a log
            pass
With boto3 you can do the following:
import boto3

# create the S3 resource
s3 = boto3.resource('s3')
# get the object
obj = s3.Object('bucket_name', 'key')
# read the file contents into memory (returned as bytes)
file_contents = obj.get()["Body"].read()
# count the newline characters to get the number of lines
print(file_contents.count(b'\n'))
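Reading the whole object into memory can be costly for large files, and counting newline characters misses a final line that has no trailing newline. Below is a minimal sketch that reads the body in chunks instead. It assumes the body returned by obj.get()["Body"] is consumed through its read(amt) method. The names count_lines and count_lines_in_s3_object and the default chunk size are illustrative and not part of boto3.

```python
import io


def count_lines(stream, chunk_size=1024 * 1024):
    """Count lines in a binary file-like object without loading it whole.

    `stream` only needs a read(amt) method that returns bytes, which is
    what the body returned by obj.get()["Body"] provides.
    """
    newlines = 0
    last_byte = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        newlines += chunk.count(b"\n")
        last_byte = chunk[-1:]
    # A non-empty stream whose last line lacks a trailing newline
    # still contains that final line.
    if last_byte and last_byte != b"\n":
        newlines += 1
    return newlines


def count_lines_in_s3_object(bucket_name, key):
    # Imported here so count_lines can be used without boto3 installed.
    import boto3

    obj = boto3.resource("s3").Object(bucket_name, key)
    return count_lines(obj.get()["Body"])


# Local check with an in-memory stream standing in for the S3 body.
print(count_lines(io.BytesIO(b"first\nsecond\nthird")))  # prints 3
```

Whether an unterminated last line should count is a design choice; drop the final adjustment to match a plain newline count.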
If you want to do this for all objects in a bucket, you can use the following snippet:
bucket = s3.Bucket('bucket_name')
for obj in bucket.objects.all():
    file_contents = obj.get()["Body"].read()
    print(file_contents.count(b'\n'))
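Since the question asks for the counts to go to a log, the loop can be wrapped so that each key is logged with its count. This is only a sketch: the helper name log_line_counts and the logger name are hypothetical, and it assumes each item exposes a key attribute and a get() method returning a dict with a readable "Body", as the items of bucket.objects.all() do.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("s3-line-count")  # hypothetical logger name


def log_line_counts(objects):
    """Log and return the newline count for each object.

    `objects` is any iterable of items with a `.key` attribute and a
    `.get()` method returning {"Body": <readable>}.
    """
    counts = {}
    for obj in objects:
        body = obj.get()["Body"].read()  # whole object in memory, as bytes
        counts[obj.key] = body.count(b"\n")
        log.info("%s: %d lines", obj.key, counts[obj.key])
    return counts
```

With boto3 this would be called as log_line_counts(s3.Bucket('bucket_name').objects.all()), using the same s3 resource as in the snippets here.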
For more functionality, see the boto3 documentation reference: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#object
Update (using boto 2):
import boto

s3 = boto.connect_s3()  # establish connection
bucket = s3.get_bucket('bucket_name')  # get bucket
for key in bucket.list(prefix='key'):  # list objects under a given prefix
    file_contents = key.get_contents_as_string()  # get file contents
    print(file_contents.count(b'\n'))  # count newline characters to get the number of lines