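Hello, colleagues. I have developed an application based on SparkLauncher that runs an executable jar containing 5 operations. Each operation depends on specific variables. I have a main Hadoop cluster, spark2.3.0-hadoop2.6.5, and the application works well there. Part of my working code: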
private void runSparkJob(String pathToJar, final LocalDate startDate, final LocalDate endDate) {
    if (executionInProgress.get()) {
        LOGGER.warn("Execution already in progress");
        return;
    }
    Process sparkProcess = null;
    try {
        LOGGER.info("Create SparkLauncher. SparkHome: [{}]. JarPath: [{}].", sparkHome, vmJarPath);
        executionInProgress.set(true);
        sparkProcess = new SparkLauncher()
                .setAppName(activeOperationProfile)
                .setSparkHome(sparkHome) // sparkHome folder on the main cluster
                .setAppResource(pathToJar) // jar with 5 operations
                .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS,
                        String.format("-Drunner.operation-profile=%1$s -Doperation.startDate=%2$s -Doperation.endDate=%3$s",
                                activeOperationProfile, startDate, endDate))
                .setConf(SparkLauncher.DRIVER_MEMORY, "12G")
                .redirectToLog(LOGGER.getName())
                .setMaster("yarn")
                .launch();
        sparkProcess.waitFor();
        int exitCode = sparkProcess.exitValue();
        if (exitCode != …

Good day, colleagues. I use GitLab CI in production. I have several stages: 1) build the artifact 2) deploy to an external server 3) deploy to Artifactory using the JFrog CLI
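I have a problem caching the local Maven repository. My runner downloads all dependencies in the first step (build) and does the same again in the last step (deploy to Artifactory). In addition, my runner removes all data from the .m2 folder before the last stage: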
Removing .m2/antlr/
Removing .m2/aopalliance/
Removing .m2/asm/
Removing .m2/avalon-framework/
Removing .m2/backport-util-concurrent/
Removing .m2/ch/
Removing .m2/classworlds/
Removing .m2/com/
Removing .m2/commons-beanutils/
Removing .m2/commons-chain/
Removing .m2/commons-cli/
Removing .m2/commons-codec/
Removing .m2/commons-collections/
Removing .m2/commons-digester/
Removing .m2/commons-io/
Removing .m2/commons-lang/
Removing .m2/commons-logging/
Removing .m2/commons-validator/
Removing .m2/dom4j/
Removing .m2/io/
Removing .m2/javax/
Removing .m2/junit/
Removing .m2/log4j/
Removing .m2/logkit/
Removing .m2/net/
Removing .m2/org/
Removing .m2/oro/
Removing .m2/sslext/
Removing .m2/xml-apis/
Removing .m2/xmlunit/
Removing jfrog
Removing target/
My gitlab-ci YAML (without the second step):
stages:
- …
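
For reference, here is a minimal sketch of how a local Maven repository is commonly cached between GitLab CI jobs. It is not the configuration from the question, which is truncated above. It assumes the repository lives in a `.m2` folder inside the project directory, as the "Removing .m2/..." lines suggest, and that one cache per branch is acceptable. The "Removing" lines are likely produced by the runner cleaning untracked files when it prepares the working copy, so the dependencies only survive into a later stage if a cache (or artifact) restores them.

```yaml
# Sketch only: the cache key and the repository location are assumptions,
# not values taken from the original pipeline.
variables:
  # Point Maven at a repository inside the project directory;
  # GitLab can only cache paths located under $CI_PROJECT_DIR.
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2"

cache:
  # One cache per branch, shared by all jobs of that branch.
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .m2/
```

With a global `cache` like this, each job restores `.m2/` before its script runs and uploads it again afterwards. Whether later stages see the cache also depends on the runners sharing cache storage, which this sketch does not cover.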