Why does Spring Batch use 1 database connection for each thread?

sha*_*son 2 java multithreading spring-batch spring-boot hikaricp

Why does Spring Batch use 1 database connection for each thread?

Stack:

  • Java 8
  • Spring Boot 1.5
  • Spring Batch 3.0.7
  • HikariCP 2.7.6

Datasource config:

  • batchdb (postgres)
  • readdb (oracle)
  • writedb (postgres)

Each datasource uses its own HikariCP pool, and each pool has the default of 10 connections.

Spring Batch config: ThreadExecutor-1:

core-pool-size: 10
max-pool-size: 10
throttle-limit: 10

Job-1 config / ThreadPoolTaskExecutor: (pool size and throttle limit are set via application.yml)

@Bean
public Step job1Step() {
    return stepBuilderFactory.get("job1Step")
            .<ReadModel, WriteModel>chunk(chunkSize)
            .reader(itemReader())
            .processor(compositeProcessor())
            .writer(itemWriter())
            .faultTolerant()
            .taskExecutor(job1TaskExecutor())
            .throttleLimit(throttleLimit)
            .build();
}

@Bean
public ThreadPoolTaskExecutor job1TaskExecutor() {
    ThreadPoolTaskExecutor pool = new ThreadPoolTaskExecutor();
    pool.setCorePoolSize(poolSize);
    pool.setMaxPoolSize(maxPoolSize);
    pool.setWaitForTasksToCompleteOnShutdown(false);
    return pool;
}

@Bean
@StepScope
public Job1ItemReader job1ItemReader() {
    return new Job1ItemReader(readdb, pageSize);
}

Abbreviated code for Job1-ItemReader:

public class Job1ItemReader extends JdbcPagingItemReader<ReadModel> {
...
}

ThreadExecutor-2:

core-pool-size: 5
max-pool-size: 5
throttle-limit: 5

Job-2 config / ThreadPoolTaskExecutor:

@Bean
public Step job2Step() throws Exception {
    return stepBuilderFactory.get("job2Step")
            .<ReadModel2, WriteModel2>chunk(chunkSize)
            .reader(job2ItemReader())
            .processor(job2CompositeProcessor())
            .writer(job2ItemWriter())
            .faultTolerant()
            .taskExecutor(job2TaskExecutor())
            .throttleLimit(throttleLimit)
            .build();
}

@Bean
public ThreadPoolTaskExecutor job2TaskExecutor() {
    ThreadPoolTaskExecutor pool = new ThreadPoolTaskExecutor();
    pool.setCorePoolSize(corePoolSize);
    pool.setMaxPoolSize(maxPoolSize);
    pool.setQueueCapacity(queueCapacity);
    pool.setWaitForTasksToCompleteOnShutdown(false);
    return pool;
}

@Bean
@StepScope
public Job2ItemReader job2ItemReader() {
    return new Job2ItemReader(readdb, pageSize);    
}

Abbreviated code for Job2-ItemReader:

public class Job2ItemReader extends JdbcPagingItemReader<ReadModel2> {
...
}

  • There are 2 jobs
  • Job-1 is long running (multiple days)
  • Job-2 usually finishes within an hour or two and runs daily on a schedule
  • The jobs are in the same "application" and run on the same JVM
  • Each job has its own ThreadPoolTaskExecutor defined

When Job-1 is running and Job-2 starts, Job-2 is unable to get a connection to readdb. Job-2's batch reader throws the following error.

Caused by: org.springframework.jdbc.support.MetaDataAccessException: Could not get Connection for extracting meta data; nested exception is org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:339)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:366)
at org.springframework.batch.support.DatabaseType.fromMetaData(DatabaseType.java:97)
at org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean.getObject(SqlPagingQueryProviderFactoryBean.java:158)
... 30 common frames omitted
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:80)
at org.springframework.jdbc.support.JdbcUtils.extractDatabaseMetaData(JdbcUtils.java:326)
... 33 common frames omitted
Caused by: java.sql.SQLTransientConnectionException: HikariPool-3 - Connection is not available, request timed out after 30000ms.
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:666)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:182)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:147)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:123)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)

(Redacted to protect the innocent)
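The numbers in the configuration above can be put side by side. The following is only a rough sketch: it assumes, as a worst case, that every worker thread of both steps holds one readdb connection at the same time, which the question does not confirm. The class name `PoolDemand` and the method `worstCaseDemand` are hypothetical names made up for illustration.

```java
// Hypothetical helper, not part of Spring Batch or HikariCP.
class PoolDemand {

    // Worst-case number of connections held at once,
    // ASSUMING one connection per worker thread.
    static int worstCaseDemand(int... throttleLimits) {
        int total = 0;
        for (int limit : throttleLimits) {
            total += limit;
        }
        return total;
    }

    public static void main(String[] args) {
        int poolSize = 10;      // pool size stated in the question
        int job1Threads = 10;   // throttle-limit of ThreadExecutor-1
        int job2Threads = 5;    // throttle-limit of ThreadExecutor-2

        int demand = worstCaseDemand(job1Threads, job2Threads);
        System.out.println("worst-case demand = " + demand);                    // 15
        System.out.println("shortfall = " + Math.max(0, demand - poolSize));    // 5
    }
}
```

Under that assumption, Job-1 alone can occupy all 10 readdb connections, so any request from Job-2 (including the metadata lookup shown in the stack trace) would wait until the 30000 ms timeout expires.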


Mic*_*lla 5

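The reason Spring Batch uses one database connection per thread (in some cases it can actually use more) is transactions. Spring transactions are bound to a thread, and just about everything in Spring Batch happens within a transaction. So when you have a job with a single thread, you only use a few connections at most. However, if you have a multithreaded step, each thread needs at least one connection to handle its transaction.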
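The effect can be illustrated without Spring at all. The sketch below is a simplified simulation, not Spring Batch or HikariCP code: a `java.util.concurrent.Semaphore` stands in for the connection pool, and each worker thread holds one permit for as long as its simulated transaction is open. The class name `ThreadBoundConnections` and the method `extraRequestSucceeds` are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Simulation only: a Semaphore plays the role of the connection pool.
class ThreadBoundConnections {

    // Returns true if one extra request gets a "connection" within timeoutMs
    // while workerThreads threads each hold one for their open "transaction".
    static boolean extraRequestSucceeds(int poolSize, int workerThreads, long timeoutMs) {
        Semaphore pool = new Semaphore(poolSize);
        // Only as many workers as there are permits can hold one at a time.
        CountDownLatch allHolding = new CountDownLatch(Math.min(workerThreads, poolSize));
        CountDownLatch release = new CountDownLatch(1);
        ExecutorService workers = Executors.newFixedThreadPool(workerThreads);
        try {
            for (int i = 0; i < workerThreads; i++) {
                workers.execute(() -> {
                    try {
                        pool.acquire();            // transaction begins: connection bound to this thread
                        try {
                            allHolding.countDown();
                            release.await();       // chunk is being processed
                        } finally {
                            pool.release();        // transaction ends: connection returned
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allHolding.await();
            // The extra request, e.g. a second job asking for a connection.
            boolean acquired = pool.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired) {
                pool.release();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while simulating", e);
        } finally {
            release.countDown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool=10, workers=10: " + extraRequestSucceeds(10, 10, 200));
        System.out.println("pool=11, workers=10: " + extraRequestSucceeds(11, 10, 200));
    }
}
```

With 10 permits and 10 workers the extra request times out (prints `false`); with one spare permit it succeeds (prints `true`). In a real application the equivalent remedy would presumably be to give the shared pool more connections than the combined number of threads that can hold one, or to lower the throttle limits, but the right values depend on details not shown in the question.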