Spring Batch skipping records

spj*_*the 7 java spring spring-batch

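I have a Spring Batch (3.0.4.RELEASE) job configured to index some records in Solr. I don't get any exceptions and the job reports that it completed successfully, but sometimes it processes only about half of the records in the database. I have configured a JdbcCursorItemReader and a custom writer, as shown below.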

<batch:job-repository id="jobRepository"
                      data-source="dataSource"
                      transaction-manager="transactionManager"
                      isolation-level-for-create="DEFAULT"
                      table-prefix="BATCH_"
                      max-varchar-length="1000"/>

<bean id="myItemReader" class="org.springframework.batch.item.database.JdbcCursorItemReader" scope="step">
    <property name="dataSource" ref="dataSource"/>
    <property name="sql" value="SELECT id FROM my_item ORDER BY id asc"/>
    <property name="rowMapper" ref="myItemIdRowMapper"/>
</bean>

<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor">
    <property name="concurrencyLimit" value="4"/>
</bean>

<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
    <property name="taskExecutor" ref="taskExecutor"/>
</bean>

<batch:job id="myItemIndexJob" restartable="true" job-repository="jobRepository">
    <batch:listeners>
        <batch:listener ref="executionListener"/>
    </batch:listeners>
    <batch:step id="myItemSolrIndexStep" allow-start-if-complete="true">
        <batch:tasklet>
            <batch:chunk reader="myItemReader" writer="myItemSolrWriter" commit-interval="50"/>
        </batch:tasklet>
    </batch:step>
</batch:job>


The row mapper just returns the id from the query:

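import java.sql.ResultSet;
import java.sql.SQLException;

import org.springframework.jdbc.core.RowMapper;

// Maps each row of the reader's query to its integer id.
public final class MyItemRowMapper implements RowMapper<Integer> {

    @Override
    public Integer mapRow(ResultSet rs, int rowNum) throws SQLException {
        return rs.getInt("id");
    }

}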


The writer submits the ids to the Solr indexing service. The writer does not modify MyItem.

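import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class MyItemSolrWriter implements ItemWriter<Integer> {

    // `service` is the Solr indexing service; its declaration is not shown in this post.
    @Override
    public void write(List<? extends Integer> myItemIds) throws Exception {
        // Pass a copy of the chunk's ids to the indexing service.
        service.index(Lists.newArrayList(myItemIds));
    }
}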


Here is the step_execution database entry for the job run. It shows that nothing was skipped and that only 279086 rows were read:

| step_name           | start_time              | end_time                | status    | read_count | filter_count | write_count | read_skip_count | write_skip_count | exit_message |
| myItemSolrIndexStep | 2016-05-12 10:07:01.994 | 2016-05-12 10:09:07.303 | COMPLETED | 279086     | 0            | 279086      | 0               | 0                |              |


However, the step_execution_context database entry indicates that the reader read 573937 rows:

{"map":[{"entry":[{"string":"JdbcCursorItemReader.read.count","int":573937},{"string":["batch.taskletType","org.springframework.batch.core.step.item.ChunkOrientedTasklet"]},{"string":["batch.stepType","org.springframework.batch.core.step.tasklet.TaskletStep"]}]}]}

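To make the mismatch between the two tables concrete, here is a minimal sketch that pulls the reader's count out of the serialized context and compares it with read_count from step_execution. The class name `ReadCountCheck` and its methods are hypothetical helpers written for this post, not part of Spring Batch, and the regex assumes the context is serialized exactly in the shape shown above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: compares the reader's own count with the step's read_count.
class ReadCountCheck {

    // Matches the entry {"string":"JdbcCursorItemReader.read.count","int":N}.
    private static final Pattern READ_COUNT = Pattern.compile(
            "\"string\"\\s*:\\s*\"JdbcCursorItemReader\\.read\\.count\"\\s*,\\s*\"int\"\\s*:\\s*(\\d+)");

    /** Returns the count stored in the serialized context, or -1 if the entry is absent. */
    static long contextReadCount(String serializedContext) {
        Matcher m = READ_COUNT.matcher(serializedContext);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Rows counted by the reader that do not appear in the step's read_count. */
    static long missing(long contextCount, long stepReadCount) {
        return contextCount - stepReadCount;
    }

    public static void main(String[] args) {
        // Shortened copy of the step_execution_context entry from this run.
        String ctx = "{\"map\":[{\"entry\":[{\"string\":\"JdbcCursorItemReader.read.count\",\"int\":573937}]}]}";
        long fromContext = contextReadCount(ctx);
        long fromStep = 279086L; // read_count from the step_execution row

        System.out.println("missing = " + missing(fromContext, fromStep));
        System.out.printf("fraction seen by the step = %.3f%n", (double) fromStep / fromContext);
    }
}
```

For this run the difference is 294851 rows, so the step accounted for a little under half of what the reader's own counter reports.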

Subsequent runs process all 573937 records, so this happens only intermittently. Any ideas about what could be causing it, or ways to debug the problem?

I can provide more details if necessary.
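One debugging aid I could add is a small tally that the writer calls for every chunk, so that the totals can be compared with the step_execution row afterwards. The sketch below is only an assumption about how that might look: `ChunkTally` is a hypothetical class, not something in the current job, and the out-of-order check relies on the query's `ORDER BY id asc` and on chunks arriving one after another.

```java
import java.util.List;

// Hypothetical debugging helper: tallies the ids handed to the writer.
class ChunkTally {
    private long total;       // ids seen so far
    private long chunks;      // chunks seen so far
    private long outOfOrder;  // ids not greater than the id seen just before them
    private Integer lastId;   // most recent id, null before the first chunk

    /** Records one chunk; synchronized in case chunks ever arrive from several threads. */
    public synchronized void record(List<? extends Integer> ids) {
        chunks++;
        for (Integer id : ids) {
            if (lastId != null && id <= lastId) {
                outOfOrder++; // unexpected for a single cursor ordered by id
            }
            lastId = id;
            total++;
        }
    }

    public synchronized long total() { return total; }
    public synchronized long chunks() { return chunks; }
    public synchronized long outOfOrder() { return outOfOrder; }
    public synchronized Integer lastId() { return lastId; }
}
```

Calling `record(myItemIds)` at the top of `write` and logging `total()` and `lastId()` when the step ends would show whether the writer stopped at a low id (the reader ended early) or reached the highest id with gaps along the way (rows dropped in the middle).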

Archived:	10 years ago
Views:	939
Last activity:	10 years ago