What is the simplest way to get a list of all items in an S3 bucket using Java?
List<S3ObjectSummary> s3objects = s3.listObjects(bucketName,prefix).getObjectSummaries();
This example returns only the first 1000 items.
Ron*_* D. 92
This may be a workaround, but it solved my problem:
ObjectListing listing = s3.listObjects(bucketName, prefix);
List<S3ObjectSummary> summaries = listing.getObjectSummaries();
while (listing.isTruncated()) {
    listing = s3.listNextBatchOfObjects(listing);
    summaries.addAll(listing.getObjectSummaries());
}
小智 18
This is straight from the AWS documentation:
AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());
ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
        .withBucketName(bucketName)
        .withPrefix("m");
ObjectListing objectListing;
do {
    objectListing = s3client.listObjects(listObjectsRequest);
    for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
        System.out.println(" - " + objectSummary.getKey() + " " +
                "(size = " + objectSummary.getSize() + ")");
    }
    listObjectsRequest.setMarker(objectListing.getNextMarker());
} while (objectListing.isTruncated());
小智 10
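I am working through a large collection of objects generated by our system; we changed the format of the stored data and needed to check each file, determine which ones were in the old format, and convert them. There are other ways to do this, but this one relates to your question.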
ObjectListing list = amazonS3Client.listObjects(contentBucketName, contentKeyPrefix);
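while (true) {
    List<S3ObjectSummary> summaries = list.getObjectSummaries();
    for (S3ObjectSummary summary : summaries) {
        String summaryKey = summary.getKey();
        /* Retrieve object */
        /* Process it */
    }
    // Stop only after the last (non-truncated) batch has been processed;
    // checking isTruncated() on a freshly fetched batch would skip that batch.
    if (!list.isTruncated()) {
        break;
    }
    list = amazonS3Client.listNextBatchOfObjects(list);
}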
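One detail worth checking in any loop of this shape is that the final, non-truncated batch gets processed as well. Here is a minimal self-contained sketch of that control flow. It does not use the AWS SDK: `Page`, `listPage`, and `collectAll` are hypothetical names, and an in-memory list plays the role of the bucket, with an integer offset standing in for the marker.

```java
import java.util.ArrayList;
import java.util.List;

public class PagedListingSketch {

    /** One page of results; a hypothetical stand-in for the SDK's ObjectListing. */
    static final class Page {
        final List<String> keys;
        final boolean truncated;  // true when more pages follow
        final int nextOffset;     // where the next page starts (plays the role of the marker)

        Page(List<String> keys, boolean truncated, int nextOffset) {
            this.keys = keys;
            this.truncated = truncated;
            this.nextOffset = nextOffset;
        }
    }

    /** Hypothetical in-memory "bucket" that serves at most pageSize keys per call. */
    static Page listPage(List<String> allKeys, int offset, int pageSize) {
        int end = Math.min(offset + pageSize, allKeys.size());
        List<String> keys = new ArrayList<>(allKeys.subList(offset, end));
        return new Page(keys, end < allKeys.size(), end);
    }

    /** Visits every page, including the final non-truncated one. */
    public static List<String> collectAll(List<String> allKeys, int pageSize) {
        List<String> seen = new ArrayList<>();
        Page page = listPage(allKeys, 0, pageSize);
        while (true) {
            seen.addAll(page.keys);  // process the current page first
            if (!page.truncated) {
                break;               // the last page has been handled, so stop
            }
            page = listPage(allKeys, page.nextOffset, pageSize);
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            keys.add("key-" + i);
        }
        // Three pages of 10, 10 and 5 keys
        System.out.println(collectAll(keys, 10).size()); // prints 25
    }
}
```

The point of the sketch is the order of operations: process, then test for truncation, then fetch. Fetching the next page before testing it for truncation would exit the loop with the last page still unprocessed.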
小智 8
Listing keys using the AWS SDK for Java
http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingObjectKeysUsingJava.html
import java.io.IOException;
import com.amazonaws.AmazonClientException;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;
public class ListKeys {
    private static String bucketName = "***bucket name***";
    public static void main(String[] args) throws IOException {
        AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());
        try {
            System.out.println("Listing objects");
            final ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName);
            ListObjectsV2Result result;
            do {
                result = s3client.listObjectsV2(req);
                for (S3ObjectSummary objectSummary : result.getObjectSummaries()) {
                    System.out.println(" - " + objectSummary.getKey() + " " +
                            "(size = " + objectSummary.getSize() + ")");
                }
                System.out.println("Next Continuation Token : " + result.getNextContinuationToken());
                // Resume the next request where this page ended
                req.setContinuationToken(result.getNextContinuationToken());
            } while (result.isTruncated());
        } catch (AmazonServiceException ase) {
            System.out.println("Caught an AmazonServiceException, " +
                    "which means your request made it " +
                    "to Amazon S3, but was rejected with an error response " +
                    "for some reason.");
            System.out.println("Error Message: " + ase.getMessage());
            System.out.println("HTTP Status Code: " + ase.getStatusCode());
            System.out.println("AWS Error Code: " + ase.getErrorCode());
            System.out.println("Error Type: " + ase.getErrorType());
            System.out.println("Request ID: " + ase.getRequestId());
        } catch (AmazonClientException ace) {
            System.out.println("Caught an AmazonClientException, " +
                    "which means the client encountered " +
                    "an internal error while trying to communicate" +
                    " with S3, " +
                    "such as not being able to access the network.");
            System.out.println("Error Message: " + ace.getMessage());
        }
    }
}
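For those reading this in 2018 and beyond: there are two newer APIs that spare you the pagination hassle, one in the AWS SDK for Java 1.x and one in 2.x.
In SDK 1.x, there is a newer API that lets you iterate over the objects in an S3 bucket without dealing with pagination: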
AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();
S3Objects.inBucket(s3, "the-bucket").forEach((S3ObjectSummary objectSummary) -> {
    // TODO: Consume `objectSummary` the way you need
    System.out.println(objectSummary.getKey());
});
This iteration is lazy:
The list of S3ObjectSummarys will be fetched lazily, a page at a time, as they are needed. The size of the page can be controlled with the withBatchSize(int) method.
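To make "fetched lazily, a page at a time" concrete, here is a small self-contained sketch of the idea. It is not the SDK's implementation: `LazyPagesSketch` and `getFetchCount` are hypothetical names, an in-memory list stands in for the bucket, and the `batchSize` constructor argument plays the role that withBatchSize(int) plays in the real API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyPagesSketch implements Iterable<String> {
    private final List<String> source;  // hypothetical stand-in for the remote bucket
    private final int batchSize;        // maximum number of keys per simulated request
    private int fetchCount = 0;         // number of simulated page fetches so far

    public LazyPagesSketch(List<String> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    public int getFetchCount() {
        return fetchCount;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int nextOffset = 0;
            private Iterator<String> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Fetch the next page only once the current one is exhausted
                while (!current.hasNext() && nextOffset < source.size()) {
                    int end = Math.min(nextOffset + batchSize, source.size());
                    current = new ArrayList<>(source.subList(nextOffset, end)).iterator();
                    nextOffset = end;
                    fetchCount++;
                }
                return current.hasNext();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            keys.add("key-" + i);
        }
        LazyPagesSketch objects = new LazyPagesSketch(keys, 3);
        Iterator<String> it = objects.iterator();
        it.next();                                    // triggers the first fetch only
        System.out.println(objects.getFetchCount());  // prints 1
        while (it.hasNext()) {
            it.next();
        }
        System.out.println(objects.getFetchCount());  // prints 3
    }
}
```

With 7 keys and a batch size of 3, reading the first key costs one fetch and reading all of them costs three, which is the behavior the quoted documentation describes: pages are requested only when the consumer actually reaches them.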
The API has changed, so here is the SDK 2.x version:
S3Client client = S3Client.builder().region(Region.US_EAST_1).build();
ListObjectsV2Request request = ListObjectsV2Request.builder().bucket("the-bucket").prefix("the-prefix").build();
ListObjectsV2Iterable response = client.listObjectsV2Paginator(request);
for (ListObjectsV2Response page : response) {
    page.contents().forEach((S3Object object) -> {
        // TODO: Consume `object` the way you need
        System.out.println(object.key());
    });
}
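An instance of ListObjectsV2Iterable is returned when the paginator operation is called. At that point no service calls have been made yet, so there is no guarantee that the request is valid. As you iterate, the SDK lazily loads response pages by making service calls until there are no pages left or your iteration stops. If there are errors in your request, you will see the failures only after you start iterating through the iterable.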
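The practical consequence is that a try/catch around the paginator call alone will not catch a bad request; the iteration has to be inside it. A minimal sketch of that behavior follows. It does not use the SDK: `lazily` is a hypothetical helper that defers a page-fetching call until iteration, and the always-failing supplier stands in for an invalid request.

```java
import java.util.List;
import java.util.function.Supplier;

public class DeferredFailureSketch {

    /** Wraps a page-fetching call; nothing runs until iteration starts. */
    public static <T> Iterable<T> lazily(Supplier<List<T>> fetchPage) {
        return () -> fetchPage.get().iterator();
    }

    public static void main(String[] args) {
        // Hypothetical "invalid request": the fetch always fails
        Supplier<List<String>> failingFetch = () -> {
            throw new IllegalStateException("bad request");
        };

        // Creating the iterable succeeds because no call has been made yet
        Iterable<String> response = lazily(failingFetch);
        System.out.println("iterable created without error");

        try {
            for (String key : response) {
                System.out.println(key);
            }
        } catch (IllegalStateException e) {
            // The failure only shows up here, once iteration begins
            System.out.println("failure surfaced during iteration: " + e.getMessage());
        }
    }
}
```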
As a slightly more concise solution for listing S3 objects that may be truncated:
ListObjectsRequest request = new ListObjectsRequest().withBucketName(bucketName);
ObjectListing listing = null;
while ((listing == null) || (request.getMarker() != null)) {
    listing = s3Client.listObjects(request);
    // do stuff with listing
    request.setMarker(listing.getNextMarker());
}
小智 5
Gray, your solution is odd, but you seem like a nice guy.
AmazonS3Client s3Client = new AmazonS3Client(new BasicAWSCredentials( ....
ObjectListing images = s3Client.listObjects(bucketName);
List<S3ObjectSummary> list = images.getObjectSummaries();
for (S3ObjectSummary image : list) {
    S3Object obj = s3Client.getObject(bucketName, image.getKey());
    writeToFile(obj.getObjectContent());
}