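My main job only performs read operations, and the other one does some writing, but on the MyISAM engine, which ignores transactions, so I do not need transaction support. How can I configure Spring Batch to have its own datasource for the JobRepository, separate from the datasource that holds the business data? The initial configuration with a single datasource looks like this: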
@Configuration
public class StandaloneInfrastructureConfiguration {

    @Autowired
    Environment env;

    @Bean
    public LocalContainerEntityManagerFactoryBean entityManagerFactory() {
        LocalContainerEntityManagerFactoryBean em = new LocalContainerEntityManagerFactoryBean();
        em.setDataSource(dataSource());
        em.setPackagesToScan(new String[] { "org.podcastpedia.batch.*" });
        JpaVendorAdapter vendorAdapter = new HibernateJpaVendorAdapter();
        em.setJpaVendorAdapter(vendorAdapter);
        em.setJpaProperties(additionalJpaProperties());
        return em;
    }

    Properties additionalJpaProperties() {
        Properties properties = new Properties();
        properties.setProperty("hibernate.hbm2ddl.auto", "none");
        properties.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQL5Dialect");
        properties.setProperty("hibernate.show_sql", "true");
        return properties;
    }

    @Bean
    public DataSource dataSource(){
        return DataSourceBuilder.create()
                .url(env.getProperty("db.url"))
                .driverClassName(env.getProperty("db.driver"))
                .username(env.getProperty("db.username"))
                .password(env.getProperty("db.password"))
                .build();
    }

    @Bean
    public PlatformTransactionManager transactionManager(EntityManagerFactory emf){
        JpaTransactionManager transactionManager = …

I am writing a Spring Batch job in which I read a chunk of data, process it, and then want to pass that data to two writers. One writer simply updates the database, while the second writes to a CSV file.
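I plan to write my own custom writer, inject the two ItemWriters into customItemWriter, and call the write method of both item writers from the write method of customItemWriter. Is this approach correct? Is there an existing ItemWriter implementation that meets my requirement?
Thanks in advance.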
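
The delegation idea described above can be sketched in plain Java. This is only an illustration under stated assumptions: `SimpleWriter` and `FanOutWriter` are hypothetical stand-ins defined here so the sketch runs without Spring, and the two lambdas merely pretend to be a database writer and a CSV writer. Spring Batch itself ships `org.springframework.batch.item.support.CompositeItemWriter`, which accepts a list of delegate writers through `setDelegates` and follows the same pattern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CompositeWriterSketch {

    // Hypothetical stand-in for Spring Batch's ItemWriter, so the sketch runs without Spring.
    interface SimpleWriter<T> {
        void write(List<? extends T> items);
    }

    // Forwards each chunk to every delegate, in the order they were registered.
    static class FanOutWriter<T> implements SimpleWriter<T> {
        private final List<SimpleWriter<T>> delegates;

        FanOutWriter(List<SimpleWriter<T>> delegates) {
            this.delegates = delegates;
        }

        @Override
        public void write(List<? extends T> items) {
            for (SimpleWriter<T> delegate : delegates) {
                delegate.write(items);
            }
        }
    }

    // Writes one chunk through the fan-out writer and returns what each delegate received.
    static List<List<String>> demo() {
        List<String> databaseRows = new ArrayList<>();
        List<String> csvLines = new ArrayList<>();

        // Pretend database update: just remember the items.
        SimpleWriter<String> databaseWriter = items -> databaseRows.addAll(items);
        // Pretend CSV output: append a second column to each item.
        SimpleWriter<String> csvWriter = items -> {
            for (String item : items) {
                csvLines.add(item + ",ok");
            }
        };

        List<SimpleWriter<String>> delegates = new ArrayList<>();
        delegates.add(databaseWriter);
        delegates.add(csvWriter);

        new FanOutWriter<>(delegates).write(Arrays.asList("a", "b"));
        return Arrays.asList(databaseRows, csvLines);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because both delegates see the same chunk inside the same `write` call, a failure in either one surfaces to the caller before the chunk is considered written.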
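
I am new to Spring Batch. Can anyone explain the difference between Step, Tasklet and Chunk in Spring Batch? I also have another question: what should we do if we want to run some steps in parallel in Spring Batch?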
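
As a rough illustration of the two processing styles named in the question, the sketch below contrasts a tasklet-style callback with a chunk-oriented loop in plain Java. All names here (`SimpleTasklet`, `runTasklet`, `runChunks`) are hypothetical and are not Spring Batch API. In Spring Batch a Step is one phase of a Job, a Tasklet is the callback a step executes (it returns a `RepeatStatus`), and chunk-oriented processing is provided as one Tasklet implementation that reads, processes and writes items in groups of `commit-interval`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class ChunkLoopSketch {

    // Hypothetical stand-in for a tasklet: a single callback that is invoked
    // repeatedly until it reports that it has finished.
    interface SimpleTasklet {
        boolean executeAndReportFinished();
    }

    // Runs the tasklet to completion and returns how many times it was invoked.
    static int runTasklet(SimpleTasklet tasklet) {
        int calls = 1;
        while (!tasklet.executeAndReportFinished()) {
            calls++;
        }
        return calls;
    }

    // Chunk-oriented processing: read and process up to commitInterval items,
    // write them together, and repeat until the reader returns null.
    static <I, O> int runChunks(Supplier<I> reader, Function<I, O> processor,
                                Consumer<List<O>> writer, int commitInterval) {
        int chunksWritten = 0;
        boolean exhausted = false;
        while (!exhausted) {
            List<O> chunk = new ArrayList<>();
            while (chunk.size() < commitInterval) {
                I item = reader.get();
                if (item == null) {          // null signals the end of the input
                    exhausted = true;
                    break;
                }
                chunk.add(processor.apply(item));
            }
            if (!chunk.isEmpty()) {
                writer.accept(chunk);        // one write per chunk
                chunksWritten++;
            }
        }
        return chunksWritten;
    }

    // Feeds five numbers through the chunk loop with a commit interval of two.
    static List<List<String>> demo() {
        Iterator<Integer> source = Arrays.asList(1, 2, 3, 4, 5).iterator();
        List<List<String>> written = new ArrayList<>();
        Supplier<Integer> reader = () -> source.hasNext() ? source.next() : null;
        Function<Integer, String> processor = n -> "item-" + n;
        Consumer<List<String>> writer = written::add;
        runChunks(reader, processor, writer, 2);
        return written;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The tasklet style suits one-off work such as cleaning a directory, while the chunk loop suits item streams where each group of items should be written together.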

I am using Spring Batch version 2.2.4.RELEASE. I tried to write a simple example with stateful ItemReader, ItemProcessor and ItemWriter beans.
public class StatefulItemReader implements ItemReader<String> {

    private List<String> list;

    @BeforeStep
    public void initializeState(StepExecution stepExecution) {
        this.list = new ArrayList<>();
    }

    @AfterStep
    public ExitStatus exploitState(StepExecution stepExecution) {
        System.out.println("******************************");
        System.out.println(" READING RESULTS : " + list.size());
        return stepExecution.getExitStatus();
    }

    @Override
    public String read() throws Exception {
        this.list.add("some stateful reading information");
        if (list.size() < 10) {
            return "value " + list.size();
        }
        return null;
    }
}
In my integration test, I declare my beans in an inner static Java configuration class, as shown below:
@ContextConfiguration
@RunWith(SpringJUnit4ClassRunner.class)
public class SingletonScopedTest {

    @Configuration
    @EnableBatchProcessing
    static class TestConfig {
        @Autowired …

I wrote a Spring Batch application using Spring Boot. When I run it on my local system from the command line with a classpath, it works fine. However, when I try to run it on a Linux server, it gives me the following exception:
Unable to start web server; nested exception is
org.springframework.context.ApplicationContextException:
Unable to start ServletWebServerApplicationContext due to missing ServletWebServerFactory bean.
Here is how I run it:
java -cp jarFileName.jar; lib\* -Dlogging.level.org.springframework=DEBUG -Dspring.profiles.active=dev -Dspring.batch.job.names=abcBatchJob com.aa.bb.StartSpringBatch > somelogs.log
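
One detail worth checking when moving such a command between machines: the command above is written with Windows conventions (`;` between classpath entries and `\` inside paths), whereas Unix-like systems use `:` and `/`. Whether this is related to the exception shown is an assumption that cannot be confirmed from the question. The small sketch below, with the hypothetical helper `joinClasspath`, builds a classpath string from the platform's own separators.

```java
import java.io.File;

class ClasspathSketch {

    // Joins classpath entries with the separator of the platform the JVM runs on
    // (";" on Windows, ":" on Unix-like systems).
    static String joinClasspath(String... entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // File.separator is "\" on Windows and "/" on Unix-like systems.
        System.out.println(joinClasspath("jarFileName.jar", "lib" + File.separator + "*"));
    }
}
```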
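
I know that the Spring Batch framework processes data in chunks. However, I am wondering why we need a batch framework when the same chunking functionality can be implemented in plain Java.
Can anyone tell me whether there are more reasons to go for the Spring Batch framework?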
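
One thing a hand-written chunk loop has to solve on its own is restartability: remembering how far a run got so that a rerun does not start from the beginning. The sketch below is a minimal plain-Java illustration of that idea, not Spring Batch code. The `checkpoint` map and the `next.index` key are hypothetical and stand in for the job metadata that Spring Batch persists through its JobRepository.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RestartSketch {

    // Writes items chunk by chunk and saves the next unread index after each
    // completed chunk. Throws when a chunk contains failAtIndex (use -1 for
    // "never fail") to simulate a crash before that chunk is committed.
    static void run(List<String> items, int chunkSize, Map<String, Integer> checkpoint,
                    int failAtIndex, List<String> sink) {
        int start = checkpoint.getOrDefault("next.index", 0);
        for (int from = start; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            if (failAtIndex >= from && failAtIndex < to) {
                throw new IllegalStateException("simulated failure at item " + failAtIndex);
            }
            sink.addAll(items.subList(from, to)); // "commit" the chunk
            checkpoint.put("next.index", to);     // record progress only after the commit
        }
    }

    // Runs once with a failure in the second chunk, then restarts from the checkpoint.
    static List<String> demo() {
        List<String> items = Arrays.asList("a", "b", "c", "d", "e");
        Map<String, Integer> checkpoint = new HashMap<>(); // stands in for persisted metadata
        List<String> sink = new ArrayList<>();
        try {
            run(items, 2, checkpoint, 3, sink);   // fails while handling the chunk (c, d)
        } catch (IllegalStateException expected) {
            // the first chunk (a, b) stays committed and the checkpoint points at index 2
        }
        run(items, 2, checkpoint, -1, sink);      // the restart resumes at index 2
        return sink;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Restart is only one of the concerns involved; skip and retry policies, transaction boundaries per chunk, and run history are further pieces that would otherwise have to be written by hand.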
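
I have been experimenting with Apache Spark. My question is specifically about triggering Spark jobs. Here I had posted a question on understanding Spark jobs. After getting my hands dirty with jobs, I moved on to my requirement.
I have a REST endpoint where I expose an API to trigger jobs; I used Spring 4.0 for the REST implementation. Going forward, I would like to implement Job as a Service in Spring, submitting the job programmatically: when the endpoint is triggered, I trigger the job with the given parameters. I now have a few design options.
Similar to the job written below, I would maintain several jobs that are called by an abstract class, JobScheduler.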
/* Can this code be abstracted from the application and written
   as a separate job? My understanding is that the application
   code itself has to have the addJars embedded, which
   sparkContext takes care of internally. */
SparkConf sparkConf = new SparkConf().setAppName("MyApp").setJars(
        new String[] { "/path/to/jar/submit/cluster" })
        .setMaster("/url/of/master/node");
sparkConf.setSparkHome("/path/to/spark/");
sparkConf.set("spark.scheduler.mode", "FAIR");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
sc.setLocalProperty("spark.scheduler.pool", "test");
// Application with algorithm, transformations
Extend the above to have multiple versions of the job handled by a service.
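Or use Spark Job Server to do this.
First of all, I would like to know which solution is best in this case, in terms of both execution and scaling.
Note: I am using a standalone cluster from Spark. Kindly help.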
rest job-scheduling spring-batch apache-spark spring-data-hadoop
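
We are trying to convert a Spring Batch job from XML configuration to Java configuration. We are using Spring 4.0.1.RELEASE and Spring Batch 2.2.1.RELEASE.
After converting one job, the following warning started to appear in the log file:
15 Apr 2014 09:59:26.335 [Thread-2] WARN o.s.b.f.s.DisposableBeanAdapter - Invocation of destroy method 'close' failed on bean with name 'fileReader': org.springframework.batch.item.ItemStreamException: Error while closing item reader
The full stack trace is: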
org.springframework.batch.item.ItemStreamException: Error while closing item reader
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.close(AbstractItemCountingItemStreamItemReader.java:131) ~[spring-batch-infrastructure-2.2.1.RELEASE.jar:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.6.0_25]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) ~[na:1.6.0_25]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) ~[na:1.6.0_25]
at java.lang.reflect.Method.invoke(Method.java:597) ~[na:1.6.0_25]
at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:349) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:272) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:540) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:516) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:824) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:485) [spring-beans-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:921) [spring-context-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:895) [spring-context-4.0.1.RELEASE.jar:4.0.1.RELEASE]
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:809) [spring-context-4.0.1.RELEASE.jar:4.0.1.RELEASE]
Caused by: java.lang.IllegalStateException: EntityManager is closed
at org.hibernate.ejb.EntityManagerImpl.close(EntityManagerImpl.java:132) ~[hibernate-entitymanager-4.2.5.Final.jar:4.2.5.Final]
at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) ~[na:na]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) ~[na:1.6.0_25]
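at java.lang.reflect.Method.invoke(Method.java:597) ~[na:1.6.0_25] …

I have a requirement in which one tasklet stores all the files of a directory in an ArrayList. The size of the list is stored in the job execution context. Later, this count is accessed from another tasklet in a different step. How can this be achieved? I tried to store it in the job execution context, but at runtime it throws an unmodifiable collection exception.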
public RepeatStatus execute(StepContribution arg0, ChunkContext arg1)
        throws Exception {
    StepContext stepContext = arg1.getStepContext();
    StepExecution stepExecution = stepContext.getStepExecution();
    JobExecution jobExecution = stepExecution.getJobExecution();
    ExecutionContext jobContext = jobExecution.getExecutionContext();
    jobContext.put("FILE_COUNT", 150000);
    // Assumed completion: the posted snippet ends before the return statement.
    return RepeatStatus.FINISHED;
}
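I also tried storing the StepExecution reference in a method annotated with @BeforeStep, but that still did not work. Please let me know how to share data between two tasklets.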
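
An unmodifiable-collection exception of this kind typically comes from writing to a read-only map view rather than to the underlying context. In Spring Batch, `StepContext.getJobExecutionContext()` returns such a read-only `Map`, while `JobExecution.getExecutionContext()` returns the writable `ExecutionContext`; whether that distinction is what caused the error here cannot be determined from the snippet shown. The plain-Java sketch below reproduces the behaviour with a hypothetical `JobContext` class that is not part of Spring Batch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SharedContextSketch {

    // Hypothetical stand-in for a job-scoped context shared by all steps of one job run.
    static class JobContext {
        private final Map<String, Object> values = new HashMap<>();

        void put(String key, Object value) {
            values.put(key, value);
        }

        // Read-only view: reads see the live data, writes are rejected.
        Map<String, Object> readOnlyView() {
            return Collections.unmodifiableMap(values);
        }
    }

    // Returns true if writing through the read-only view is rejected.
    static boolean writeThroughViewFails(JobContext context) {
        try {
            context.readOnlyView().put("FILE_COUNT", 1);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // The first "tasklet" writes to the real context, the second one reads the value back.
    static Object demo() {
        JobContext context = new JobContext();
        context.put("FILE_COUNT", 150000);
        return context.readOnlyView().get("FILE_COUNT");
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(writeThroughViewFails(new JobContext()));
    }
}
```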
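
I am using Spring Batch 2.2.4 with Quartz to run some jobs at a certain time.
The problem is that the jobs always run once right after the application starts, and then run according to the scheduled time. I want to stop that first run and let them run only at the scheduled time.
My cron expression is "0 0 0 * * ?". I also tried "0 0 0 1/1 * ? *", but the job still executes once when the application starts.
How can I stop the first execution at application startup?
Here is the job context file: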
<batch:job id="exceptionLogJob">
    <batch:step id="exceptionLogReadWriteStep">
        <batch:tasklet>
            <batch:chunk reader="exceptionLogReader" writer="exceptionLogWriter"
                commit-interval="1000" />
        </batch:tasklet>
    </batch:step>
</batch:job>

<!-- ======================================================= -->
<!-- READER -->
<!-- ======================================================= -->
<bean id="exceptionLogReader"
    class="org.springframework.batch.item.database.JdbcCursorItemReader">
    <property name="dataSource" ref="dataSource" />
    <property name="sql" value="SELECT a.*,a.rowid FROM SF_EXCEPTION_LOG a WHERE DATETIME > SYSDATE - 1" />
    <property name="rowMapper" ref="ExceptionLogRowMapper" />
</bean>

<!-- ======================================================= -->
<!-- Writer -->
<!-- ======================================================= -->
<bean id="exceptionLogWriter"
    class="com.mobily.sf.batchprocessor.exceptionlog.ExceptionLogWriter" />

<bean id="jobDetailExceptionLog" class="org.springframework.scheduling.quartz.JobDetailBean">
    <property name="jobClass"
        value="com.sf.batchprocessor.commons.JobLauncherDetails" /> …