I have some files that are uploaded to S3 and processed by a Redshift task. After that task completes, these files need to be merged. Currently I delete the files and upload the merged file again, which consumes a lot of bandwidth. Is there any way to merge the files directly on S3?
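One way to avoid re-uploading is S3's multipart upload copy API, which assembles a new object from existing objects entirely server-side, so the data never leaves S3. Below is a minimal sketch using the AWS SDK for Java v1; the class and method names are mine, and it assumes every source object except the last is at least 5 MB (S3's minimum part size for multipart copies):

```java
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.*;
import java.util.ArrayList;
import java.util.List;

public class S3Merge {
    // Merge existing S3 objects into one target object without downloading them.
    // bucket, targetKey and sourceKeys are placeholders for your own values.
    public static void merge(AmazonS3Client s3, String bucket,
                             String targetKey, List<String> sourceKeys) {
        InitiateMultipartUploadResult init = s3.initiateMultipartUpload(
                new InitiateMultipartUploadRequest(bucket, targetKey));

        List<PartETag> etags = new ArrayList<PartETag>();
        int partNumber = 1;
        for (String sourceKey : sourceKeys) {
            // Server-side copy of one source object into one part of the target.
            CopyPartRequest copy = new CopyPartRequest()
                    .withSourceBucketName(bucket)
                    .withSourceKey(sourceKey)
                    .withDestinationBucketName(bucket)
                    .withDestinationKey(targetKey)
                    .withUploadId(init.getUploadId())
                    .withPartNumber(partNumber++);
            etags.add(s3.copyPart(copy).getPartETag());
        }

        s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
                bucket, targetKey, init.getUploadId(), etags));
    }
}
```

Once the merged object is verified, the source objects can be deleted; no object data crosses your network either way.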
I am using Apache Camel for routing.
I am using the AWS-S3 consumer to periodically poll a location on S3 for files. After polling for a while, it fails with the exception below:
Will try again at next poll. Caused by:[com.amazonaws.AmazonClientException - Unable to execute HTTP request:
Timeout waiting for connection from pool]
com.amazonaws.AmazonClientException: Unable to execute HTTP request:Timeout waiting for connection from pool
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:376) ~[aws-java-sdk-1.5.5.jar:na]
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:202) ~[aws-java-sdk-1.5.5.jar:na]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3037) ~[aws-java-sdk-1.5.5.jar:na]
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3008) ~[aws-java-sdk-1.5.5.jar:na]
at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:531) ~[aws-java-sdk-1.5.5.jar:na]
at org.apache.camel.component.aws.s3.S3Consumer.poll(S3Consumer.java:69) ~[camel-aws-2.12.0.jar:2.12.0]
at org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:187) [camel-core-2.12.0.jar:2.12.0]
at org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:114) [camel-core-2.12.0.jar:2.12.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_60]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_60]
As I understand it, the cause is that the consumer exhausts the available connections in the pool, because it uses a new connection on every poll. What I need to know is how to release the resources after each poll, and why the component does not do this itself.
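One possible workaround, sketched below under the assumption (not confirmed by the stack trace) that the leak comes from polled S3 object streams never being closed: make sure each exchange's content stream is fully consumed and then closed, since an unclosed stream keeps its HTTP connection checked out of the pool. The processor class name is mine:

```java
import com.amazonaws.services.s3.model.S3Object;
import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class CloseS3StreamProcessor implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        S3Object s3Object = exchange.getIn().getBody(S3Object.class);
        if (s3Object != null) {
            try {
                // ... consume the object content here ...
            } finally {
                // Closing the stream returns the underlying HTTP
                // connection to the client's connection pool.
                s3Object.getObjectContent().close();
            }
        }
    }
}
```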
Camel version: 2.12
Edit: I modified the consumer to use a custom S3 client with specific values for connection timeout, max connections, max error retry, and socket timeout, but it did not help. The result is the same.
S3 client configuration:
ClientConfiguration clientConfiguration = new ClientConfiguration();
clientConfiguration.setMaxConnections(50);
clientConfiguration.setConnectionTimeout(6000);
clientConfiguration.setMaxErrorRetry(3);
clientConfiguration.setSocketTimeout(30000);
main.bind("s3Client", new AmazonS3Client(awsCredentials, clientConfiguration));
The AmazonS3Client object named "s3Client" is bound to the Camel context and supplied to the Camel-based …
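For reference, a bound client is typically picked up from the registry via the endpoint URI's amazonS3Client option. A minimal sketch, where the bucket name and delay are placeholders and #s3Client refers to the client bound above:

```java
import org.apache.camel.builder.RouteBuilder;

public class S3PollRoute extends RouteBuilder {
    @Override
    public void configure() {
        // "my-bucket" and delay are placeholders; #s3Client looks up the
        // AmazonS3Client bound in the registry under that name.
        from("aws-s3://my-bucket?amazonS3Client=#s3Client&delay=60000")
            .to("direct:process");
    }
}
```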
I have a route with a "to" clause. I have used a try-catch block to redirect the exchange to a route when an exception occurs. The exception I am running into is not related to the maximum number of parallel connections the client allows. It seems that once the exception is resolved, each retry of the exchange continues processing from where it left off. How can I end the route when an exception occurs?
Here is my code:
from("direct:hourlyFeedParts")
    .routeId("appnexus hourly downloader")
    .doTry()
        .process(AppNexusProcessor.getDownloadProcessor())
        .process(AppNexusProcessor.getNamingProcessor())
        .id("Appnexus Feed Downloader")
        .log("Downloading file ${file:name}")
        .to("{{appnexus.partsDestination}}")
        .log("Downloaded file ${file:name} to local")
    .doCatch(Exception.class)
        .to("direct:hourlyFeedParts")
    .end()
    .bean(AppNexusProcessor.class, "updateIdempotentList")
    .choice()
        .when(simple("${property.CamelSplitComplete} == true"))
            .split(beanExpression(AppNexusProcessor.class, "getAggregatorProcessor"))
            .to("direct:S3PreProcessor")
    .endChoice()
    .end();
I thought that perhaps I should be using endParent() afterwards:
.doCatch(Exception.class)
    .to("direct:hourlyFeedParts")
.endParent()
Is this the correct approach? I cannot work out the exact usage of endParent() from the documentation.
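A sketch of one alternative, assuming Camel 2.12's Java DSL: instead of endParent(), routing of the current exchange can be terminated inside doCatch with .stop(), so that nothing after the try/catch block runs for a failed exchange. The route below is a trimmed stand-in for the original, not a drop-in replacement:

```java
from("direct:hourlyFeedParts")
    .doTry()
        .to("{{appnexus.partsDestination}}")
    .doCatch(Exception.class)
        .log("Download failed: ${exception.message}")
        .stop()   // stop routing this exchange; steps after end() are skipped
    .end()
    .bean(AppNexusProcessor.class, "updateIdempotentList");
```

With .stop(), successful exchanges continue to the bean step as before, while failed ones end at the catch branch instead of resuming further down the route.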