Ale*_*sky 11 java amazon-s3 amazon-web-services
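I have a helper routine that tries to download from S3 using multiple threads. Every so often (on about 1% of requests) I get a log message about a NoHttpResponseException, which after a while leads to a SocketTimeoutException when reading from the S3ObjectInputStream.
Am I doing something wrong, or is it just my router/internet connection? Or is this to be expected from S3? I have not noticed problems anywhere else.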
public void
fastRead(final String key, Path path) throws StorageException
{
    final int pieceSize = 1<<20;
    final int threadCount = 8;

    try (FileChannel channel = (FileChannel) Files.newByteChannel( path, WRITE, CREATE, TRUNCATE_EXISTING ))
    {
        final long size = s3.getObjectMetadata(bucket, key).getContentLength();
        final long pieceCount = (size - 1) / pieceSize + 1;

        ThreadPool pool = new ThreadPool (threadCount);
        final AtomicInteger progress = new AtomicInteger();

        // Offsets are long so that objects larger than 2 GiB do not overflow the loop index
        for(long i = 0; i < size; i += pieceSize)
        {
            final long start = i;
            final long end = Math.min(i + pieceSize, size);

            pool.submit(() ->
            {
                boolean retry;
                do
                {
                    retry = false;
                    try
                    {
                        GetObjectRequest request = new GetObjectRequest(bucket, key);
                        request.setRange(start, end - 1);
                        S3Object piece = s3.getObject(request);
                        ByteBuffer buffer = ByteBuffer.allocate ((int)(end - start));
                        try(InputStream stream = piece.getObjectContent())
                        {
                            IOUtils.readFully( stream, buffer.array() );
                        }
                        channel.write( buffer, start );
                        double percent = (double) progress.incrementAndGet() / pieceCount * 100.0;
                        System.err.printf("%.1f%%\n", percent);
                    }
                    catch(java.net.SocketTimeoutException | java.net.SocketException e)
                    {
                        System.err.println("Read timed out. Retrying...");
                        retry = true;
                    }
                }
                while (retry);
            });
        }

        pool.<IOException>await();
    }
    catch(AmazonClientException | IOException | InterruptedException e)
    {
        throw new StorageException (e);
    }
}
2014-05-28 08:49:58 INFO com.amazonaws.http.AmazonHttpClient executeHelper Unable to execute HTTP request: The target server failed to respond
org.apache.http.NoHttpResponseException: The target server failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:95)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:713)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:518)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:385)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:233)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3569)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1130)
at com.syncwords.files.S3Storage.lambda$fastRead$0(S3Storage.java:123)
at com.syncwords.files.S3Storage$$Lambda$3/1397088232.run(Unknown Source)
at net.almson.util.ThreadPool.lambda$submit$8(ThreadPool.java:61)
at net.almson.util.ThreadPool$$Lambda$4/1980698753.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Ale*_*sky 14
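UPDATE: There have been updates to the AWS SDK in response to the issues I created on GitHub. I am not sure how the situation has changed. The second part of this answer (criticising getObject) is likely (hopefully?) wrong.
S3 is designed to fail, and it fails often.
Fortunately, the AWS SDK for Java has built-in facilities for retrying requests. Unfortunately, they do not cover the case of SocketExceptions while downloading S3 objects (they do work when uploading and doing other operations). So code similar to that in the question is necessary (see below).
When the mechanism works as intended, you will still see messages in your log. You may choose to hide them by filtering out INFO log events from com.amazonaws.http.AmazonHttpClient. (The AWS SDK uses Apache Commons Logging.)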
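One way to do that filtering, as a minimal sketch: it assumes Commons Logging ends up routed to java.util.logging, which the format of the log line in the question suggests but does not prove. If Log4j or another backend is on the classpath, the equivalent setting has to go into that backend's configuration instead. The class and method names here are made up for illustration.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietAwsHttpLogging {
    // java.util.logging holds loggers weakly, so keep a strong reference;
    // otherwise the configured level can be lost if the logger is collected.
    private static final Logger AWS_HTTP_LOGGER =
            Logger.getLogger("com.amazonaws.http.AmazonHttpClient");

    /** Raises the threshold so that INFO-level retry notices are dropped. */
    public static void silenceRetryNotices() {
        AWS_HTTP_LOGGER.setLevel(Level.WARNING);
    }

    public static void main(String[] args) {
        silenceRetryNotices();
        System.out.println(AWS_HTTP_LOGGER.isLoggable(Level.INFO));    // false
        System.out.println(AWS_HTTP_LOGGER.isLoggable(Level.WARNING)); // true
    }
}
```

Warnings and errors from the HTTP client still get through, so a request that finally runs out of retries remains visible.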
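Depending on your network connection and the health of Amazon's servers, the retry mechanism may fail. As pointed out by lvlv, the way to configure the relevant parameters is through ClientConfiguration. The parameter I suggest changing is the number of retries, which is 3 by default. Other things you may try are increasing or decreasing the connection and socket timeouts (default 50 seconds, which is not only long enough, it is probably too long given that you are going to time out frequently no matter what) and using TCP KeepAlive (off by default).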
ClientConfiguration cc = new ClientConfiguration()
    .withMaxErrorRetry (10)
    .withConnectionTimeout (10_000)
    .withSocketTimeout (10_000)
    .withTcpKeepAlive (true);
AmazonS3 s3Client = new AmazonS3Client (credentials, cc);
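It is even possible to override the retry mechanism by setting a RetryPolicy (again, in your ClientConfiguration). Its most interesting element is the RetryCondition, which by default:
checks for various conditions in the following order:
- Retry on AmazonClientException exceptions caused by IOException;
- Retry on AmazonServiceException exceptions that are either 500 internal server errors, 503 service unavailable errors, service throttling errors or clock skew errors.
See the SDKDefaultRetryCondition javadoc and source.
What the built-in mechanism (which is used across the whole AWS SDK) does not handle is reading S3 object data.
AmazonS3Client uses its own retry mechanism if you call AmazonS3.getObject (GetObjectRequest getObjectRequest, File destinationFile). That mechanism lives in ServiceUtils.retryableDownloadS3ObjectToFile (source), which uses sub-optimal, hard-wired retry behavior (it will only retry once, and never on a SocketException!). All of the code in ServiceUtils seems poorly engineered (issue).
I use code similar to this: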
public void
read(String key, Path path) throws StorageException
{
    GetObjectRequest request = new GetObjectRequest (bucket, key);

    for (int retries = 5; retries > 0; retries--)
    try (S3Object s3Object = s3.getObject (request))
    {
        if (s3Object == null)
            return; // occurs if we set GetObjectRequest constraints that aren't satisfied

        try (OutputStream outputStream = Files.newOutputStream (path, WRITE, CREATE, TRUNCATE_EXISTING))
        {
            byte[] buffer = new byte [16_384];
            int bytesRead;
            while ((bytesRead = s3Object.getObjectContent().read (buffer)) > -1) {
                outputStream.write (buffer, 0, bytesRead);
            }
        }
        catch (SocketException | SocketTimeoutException e)
        {
            // We retry exceptions that happen during the actual download
            // Errors that happen earlier are retried by AmazonHttpClient
            try { Thread.sleep (1000); } catch (InterruptedException i) { throw new StorageException (i); }
            log.log (Level.INFO, "Retrying...", e);
            continue;
        }
        catch (IOException e)
        {
            // There must have been a filesystem problem
            // We call `abort` to save bandwidth
            s3Object.getObjectContent().abort();
            throw new StorageException (e);
        }

        return; // Success
    }
    catch (AmazonClientException | IOException e)
    {
        // Either we couldn't connect to S3
        // or AmazonHttpClient ran out of retries
        // or s3Object.close() threw an exception
        throw new StorageException (e);
    }

    throw new StorageException ("Ran out of retries.");
}
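The retry loop can also be pulled out into a small reusable helper. The following is only a sketch with no AWS dependency: `SocketRetry`, `IoTask` and `withRetries` are hypothetical names, and the doubling delay between attempts is an assumption made for illustration (the code in this answer sleeps a fixed second). Like the loop in the answer, it retries only SocketException and SocketTimeoutException and lets every other IOException propagate immediately.

```java
import java.io.IOException;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class SocketRetry {
    /** A unit of work that may fail with an I/O error (hypothetical helper type). */
    @FunctionalInterface
    public interface IoTask<T> {
        T run() throws IOException;
    }

    /**
     * Runs the task, retrying only on socket-level failures.
     * The delay doubles after each failed attempt; once maxAttempts
     * is reached, the last socket exception is rethrown.
     */
    public static <T> T withRetries(IoTask<T> task, int maxAttempts, long initialDelayMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return task.run();
            } catch (SocketException | SocketTimeoutException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated download that times out twice before succeeding.
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new SocketTimeoutException("Read timed out");
            }
            return "done";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts"); // done after 3 attempts
    }
}
```

In a download, the task body would have to redo the whole request (call getObject again and rewrite the file or piece from its start), because a stream that has timed out cannot be resumed.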
I had similar problems before. I found that every time you finish with an S3Object, you need to close() it to release some resources back to the pool, according to the official example from AWS S3:
AmazonS3 s3Client = new AmazonS3Client(new ProfileCredentialsProvider());
S3Object object = s3Client.getObject(
new GetObjectRequest(bucketName, key));
InputStream objectData = object.getObjectContent();
// Process the objectData stream.
objectData.close();
Thanks for adding the link. By the way, I think increasing the max connections, retries and timeouts of ClientConfiguration (by default the max connections is 50) might also help solve the problem, like this:
AmazonS3 s3Client = new AmazonS3Client(aws_credential,
    new ClientConfiguration().withMaxConnections(100)
        .withConnectionTimeout(120 * 1000)
        .withMaxErrorRetry(15));