How do I download a GZip file from S3?

ylu*_*.ca 7 java gzip amazon-s3 amazon-web-services

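I have looked at "AWS S3 Java SDK - Download file help" and "Working with Zip and GZip files in Java".

While they show how to download files from S3 and how to process GZipped files, respectively, neither covers a GZipped file that lives in S3. How would I do this?

Currently I have: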

try {
    AmazonS3 s3Client = new AmazonS3Client(
            new ProfileCredentialsProvider());
    String URL = downloadURL.getPrimitiveJavaObject(arg0[0].get());
    S3Object fileObj = s3Client.getObject(getBucket(URL), getFile(URL));
    BufferedReader fileIn = new BufferedReader(new InputStreamReader(
            fileObj.getObjectContent()));
    String fileContent = "";
    String line = fileIn.readLine();
    while (line != null){
        fileContent += line + "\n";
        line = fileIn.readLine();
    }
    fileObj.close();
    return fileContent;
} catch (IOException e) {
    e.printStackTrace();
    return "ERROR IOEXCEPTION";
}

Clearly I am not handling the compressed nature of the file, and my output is:

????sU?3204?50?5010?20?24??L,(???O?V?M-.NLOU?R?U?????<s??<#?^?.w?X?%w?????????}C=?%?J3??.???????S?????ZQ?T?e??#sr?cdN#?:&?
S?B?J????P?<??

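However, I cannot use the example from the second question linked above, because the file is not local; it has to be downloaded from S3.

What should I do?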

ylu*_*.ca 7

I solved this by using a Scanner instead of an InputStream.

The Scanner takes a GZIPInputStream and reads the decompressed file line by line:

fileObj = s3Client.getObject(new GetObjectRequest(oSummary.getBucketName(), oSummary.getKey()));
fileIn = new Scanner(new GZIPInputStream(fileObj.getObjectContent()));
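
For anyone who wants to try the idea without an S3 bucket, here is a minimal self-contained sketch. It wraps the stream in a GZIPInputStream and reads lines through a BufferedReader with an explicit charset. The class name GzipLineReader, its two helper methods, and the assumption that the content is UTF-8 text are illustrative and not from the question; a ByteArrayInputStream stands in for fileObj.getObjectContent().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the name is illustrative only.
class GzipLineReader {

    // Decompresses a gzip stream and returns its text, assuming UTF-8 content.
    static String readAll(InputStream compressed) throws IOException {
        StringBuilder content = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line).append('\n');
            }
        }
        return content.toString();
    }

    // Demo helper: gzips a string in memory so no S3 access is needed.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for fileObj.getObjectContent().
        InputStream fake = new ByteArrayInputStream(gzip("first\nsecond\n"));
        System.out.print(readAll(fake));
    }
}
```

With the SDK in place, readAll(fileObj.getObjectContent()) would take the place of the stand-in stream. Closing the reader also closes the wrapped stream, so the try-with-resources block releases the underlying connection.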


Ahm*_*rdi 6

You have to use GZIPInputStream to read a GZIP file:

    // Build the S3 client and fetch the object (getBucket/getFile are the asker's helpers).
    AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
            .withCredentials(new ProfileCredentialsProvider())
            .build();
    String URL = downloadURL.getPrimitiveJavaObject(arg0[0].get());
    S3Object fileObj = s3Client.getObject(getBucket(URL), getFile(URL));

    byte[] buffer = new byte[1024];
    int n;
    // Decompress the S3 stream while reading and recompress it into temp.gz while writing.
    // try-with-resources flushes and closes both streams, even if an exception is thrown.
    try (BufferedInputStream bufferedInputStream = new BufferedInputStream(
                 new GZIPInputStream(fileObj.getObjectContent()));
         GZIPOutputStream gzipOutputStream = new GZIPOutputStream(
                 new FileOutputStream("temp.gz"))) {
        while ((n = bufferedInputStream.read(buffer)) != -1) {
            // Write only the n bytes actually read, not the whole buffer.
            gzipOutputStream.write(buffer, 0, n);
        }
    }

Try this approach to download a GZip file from S3.
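
If the goal is only a local copy of the file, the decompress-then-recompress round trip in the snippet above may not be needed. Here is a minimal sketch of two simpler options: copy the bytes unchanged to keep the .gz as it is, or wrap the stream in a GZIPInputStream to store the decompressed content. The class GzipDownload and its method names are hypothetical, and an in-memory stream stands in for fileObj.getObjectContent().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the name is illustrative only.
class GzipDownload {

    // Saves the stream byte for byte, so the file on disk is still gzip-compressed.
    static void saveCompressed(InputStream s3Content, Path target) throws IOException {
        try (InputStream in = s3Content) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Decompresses while copying, so the file on disk holds the original content.
    static void saveDecompressed(InputStream s3Content, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(s3Content)) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzip payload in memory as a stand-in for the S3 object.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write("hello from S3\n".getBytes(StandardCharsets.UTF_8));
        }
        byte[] gz = bytes.toByteArray();

        Path raw = Files.createTempFile("object", ".gz");
        Path plain = Files.createTempFile("object", ".txt");
        saveCompressed(new ByteArrayInputStream(gz), raw);
        saveDecompressed(new ByteArrayInputStream(gz), plain);

        System.out.print(new String(Files.readAllBytes(plain), StandardCharsets.UTF_8));
    }
}
```

Both methods stream the data through Files.copy, so the whole object never has to fit in memory at once.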