rev*_*rse 5 spring spring-batch
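I have a simple Spring Batch job that reads 1 million records from a file and prints them to the console.
Now I want to deploy this batch on N servers, say N = 5.
How do I make sure that no two server instances read the same records?
That is, how do I split the records in the file appropriately (1 million / 5) to get the best result?
Please help with a code example. Thanks.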
As Michael suggested, you can split the file with a system command and then use MultiResourcePartitioner to process the split files in parallel. This is how I did it:
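@Bean
public Partitioner partitioner() {
    MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
    ClassLoader cl = this.getClass().getClassLoader();
    ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver(cl);
    try {
        // One partition is created per matching file; each context stores the file URL under "fileName"
        Resource[] resources = resolver.getResources("file:" + filePath + "/*.csv");
        partitioner.setResources(resources);
    } catch (IOException e) {
        // getResources declares IOException; rethrow unchecked (needs java.io.IOException, java.io.UncheckedIOException)
        throw new UncheckedIOException(e);
    }
    // No explicit partition(...) call is needed here; the partitioned step invokes it
    return partitioner;
}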
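@Bean
public TaskExecutor taskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    // With the default unbounded queue the pool never grows past its core size,
    // so set the core size as well to actually get 4 worker threads
    taskExecutor.setCorePoolSize(4);
    taskExecutor.setMaxPoolSize(4);
    taskExecutor.afterPropertiesSet();
    return taskExecutor;
}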
@Bean
@Qualifier("masterStep")
public Step masterStep() {
    return stepBuilderFactory.get("masterStep")
            // Worker step name plus the partitioner that builds one execution context per file
            .partitioner("processData", partitioner())
            // The worker step that runs once per partition
            .step(processData())
            .taskExecutor(taskExecutor())
            .listener(pcStressStepListener)
            .build();
}
@Bean
@Qualifier("processData")
public Step processData() {
    return stepBuilderFactory.get("processData")
            .<pojo, pojo> chunk(5000)
            .reader(reader)
            .processor(processor())
            .writer(writer)
            .build();
}
@Bean(name="reader")
@StepScope
public FlatFileItemReader<pojo> reader(@Value("#{stepExecutionContext['fileName']}") String filename)
        throws MalformedURLException {
    FlatFileItemReader<pojo> reader = new FlatFileItemReader<>();
    // "fileName" is the URL that MultiResourcePartitioner stored for this partition
    reader.setResource(new UrlResource(filename));
    reader.setLineMapper(new DefaultLineMapper<pojo>() {
        {
            setLineTokenizer(new DelimitedLineTokenizer() {
                {
                    // FILE_HEADER: placeholder for the String[] of CSV column names
                    setNames(FILE_HEADER);
                }
            });
            setFieldSetMapper(new BeanWrapperFieldSetMapper<pojo>() {
                {
                    setTargetType(pojo.class);
                }
            });
        }
    });
    return reader;
}
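The configuration above expects the input to be split into several *.csv files already. If you would rather do the split in Java than with a system command, here is a minimal sketch using only the JDK. CsvSplitter, its split method and the part-NNN.csv naming are hypothetical names made up for this illustration and are not part of Spring Batch. It assumes one record per line, no header row and UTF-8 encoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper (not a Spring Batch class): splits a line-oriented file into parts. */
public final class CsvSplitter {

    private CsvSplitter() {
    }

    /**
     * Splits input into at most `parts` files inside outputDir, each holding
     * ceil(totalLines / parts) lines except possibly the last one.
     * Assumes one record per line and no header row.
     */
    public static List<Path> split(Path input, Path outputDir, int parts) throws IOException {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        Files.createDirectories(outputDir);

        // First pass: count the lines so the parts can be sized evenly.
        long total;
        try (Stream<String> lines = Files.lines(input, StandardCharsets.UTF_8)) {
            total = lines.count();
        }
        long perPart = Math.max(1, (total + parts - 1) / parts);

        // Second pass: copy lines, starting a new part file every perPart lines.
        List<Path> created = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            try {
                long written = 0;
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null || written == perPart) {
                        if (writer != null) {
                            writer.close();
                        }
                        Path part = outputDir.resolve(String.format("part-%03d.csv", created.size()));
                        created.add(part);
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        written = 0;
                    }
                    writer.write(line);
                    writer.newLine();
                    written++;
                }
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return created;
    }
}
```

With 1,000,000 lines and parts = 5, each part file gets 200,000 lines. You would run this once before launching the job and point filePath at the output directory. For the N-server case from the question, each instance would additionally need its own disjoint subset of the part files (for example a separate directory per server); that is an assumption about the deployment, not something the configuration above handles.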