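I am using the Maven Shade plugin to build an uber jar so that I can submit it as a job to a Google Dataproc cluster. Google has installed Apache Spark 2.0.2 and Apache Hadoop 2.7.3 on its clusters.
Apache Spark 2.0.2 uses version 14.0.1 of com.google.guava and Apache Hadoop 2.7.3 uses 11.0.2; both should already be on the classpath.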
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!--
        <artifactSet>
          <includes>
            <include>com.google.guava:guava:jar:19.0</include>
          </includes>
        </artifactSet>
        -->
        <artifactSet>
          <excludes>
            <exclude>com.google.guava:guava:*</exclude>
          </excludes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>
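For comparison with the include/exclude variants above, a configuration that is sometimes used for this kind of Guava version conflict is relocation: the plugin bundles Guava and rewrites its packages, along with references to them in the bundled classes, under a private prefix so they cannot collide with the copies already on the cluster classpath. The fragment below is only a sketch and has not been verified against this Dataproc setup. It assumes Guava is bundled in the uber jar (not excluded through `artifactSet`), and the `shaded.guava` prefix is a hypothetical name chosen for illustration.

```xml
<configuration>
  <relocations>
    <relocation>
      <!-- Guava's classes live under this package -->
      <pattern>com.google.common</pattern>
      <!-- "shaded.guava" is a hypothetical prefix chosen for this sketch -->
      <shadedPattern>shaded.guava.com.google.common</shadedPattern>
    </relocation>
  </relocations>
</configuration>
```

This `<configuration>` element would take the place of the one inside the `<execution>` above. Whether it resolves the error depends on which Guava version the bundled connector actually needs.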
When I include the Guava 16.0.1 jar in the Shade plugin, I get this exception:
Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {10.148.0.3}:9042
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:163)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:82)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:121)
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:322)
at com.datastax.spark.connector.cql.Schema$.tableFromCassandra(Schema.scala:342)
at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.tableDef(CassandraTableRowReaderProvider.scala:50)
at …
Tags: hadoop, apache-spark, spark-cassandra-connector, google-cloud-dataproc