Kar*_*way 5 guava emr datastax apache-spark
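I'm running a Spark job on EMR and using the DataStax connector to connect to a Cassandra cluster. I'm running into a problem with the Guava jar; details are below. I'm using the following Cassandra versions: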
cqlsh 5.0.1 | Cassandra 3.0.1 | CQL spec 3.3.1
The Spark job runs on EMR 4.4 with the following Maven dependencies:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
<version>1.5.0</version>
</dependency>
When I submit the Spark job, I get the following error:
java.lang.ExceptionInInitializerError
at com.datastax.spark.connector.cql.DefaultConnectionFactory$.clusterBuilder(CassandraConnectionFactory.scala:35)
at com.datastax.spark.connector.cql.DefaultConnectionFactory$.createCluster(CassandraConnectionFactory.scala:87)
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:153)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:148)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$2.apply(CassandraConnector.scala:148)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:81)
at ampush.event.process.core.CassandraServiceManagerImpl.getAdMetaInfo(CassandraServiceManagerImpl.java:158)
at ampush.event.config.metric.processor.ScheduledEventAggregator$4.call(ScheduledEventAggregator.java:308)
at ampush.event.config.metric.processor.ScheduledEventAggregator$4.call(ScheduledEventAggregator.java:290)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:222)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:222)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:902)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1850)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: Detected Guava issue #1635 which indicates that a version of Guava less than 16.01 is in use. This introduces codec resolution issues and potentially other incompatibility issues in the driver. Please upgrade to Guava 16.01 or later.
at com.datastax.driver.core.SanityChecks.checkGuava(SanityChecks.java:62)
at com.datastax.driver.core.SanityChecks.check(SanityChecks.java:36)
at com.datastax.driver.core.Cluster.<clinit>(Cluster.java:67)
... 23 more
Please let me know how to manage the Guava version here.
Thanks
Another solution: go to the spark/jars directory, rename guava-14.0.1.jar, and then copy guava-19.0.jar into that directory.
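To confirm which Guava jar the JVM actually picks up after a swap like this, a small check along the following lines may help. This is only a sketch: the class name `GuavaLocator` and the method `locate` are hypothetical names made up for illustration, and it relies only on standard JDK reflection calls. It loads a class by name without initializing it and reports the location it was loaded from.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Spark or the connector): reports where a class was loaded from.
class GuavaLocator {

    // Returns the jar/directory URL that supplied the class, or a short marker string.
    static String locate(String className) {
        try {
            // initialize=false: look the class up without running its static initializers
            Class<?> c = Class.forName(className, false, GuavaLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : src.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to a Guava class; other class names can be passed as arguments.
        String[] names = args.length > 0 ? args : new String[] {"com.google.common.base.Optional"};
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Calling `locate` from inside the job (for example in the same `foreachPartition` code that opens the Cassandra session) would show the jar seen by the executors, which may differ from what the driver sees.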
小智 5
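I ran into the same problem and solved it by using the Maven Shade plugin to shade the Guava version pulled in by the Cassandra connector.
I needed to explicitly exclude the Optional, Present, and Absent classes, because I hit problems with Spark trying to cast from the non-shaded Guava Present type to the shaded Optional type. I'm not sure whether this will cause any problems later, but it seems to work for me for now.
You can add this to the <plugins> section of your pom.xml: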
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>
shade
</goal>
</goals>
</execution>
</executions>
<configuration>
<minimizeJar>true</minimizeJar>
<shadedArtifactAttached>true</shadedArtifactAttached>
<shadedClassifierName>fat</shadedClassifierName>
<relocations>
<relocation>
<pattern>com.google</pattern>
<shadedPattern>shaded.guava</shadedPattern>
<includes>
<include>com.google.**</include>
</includes>
<excludes>
<exclude>com.google.common.base.Optional</exclude>
<exclude>com.google.common.base.Absent</exclude>
<exclude>com.google.common.base.Present</exclude>
</excludes>
</relocation>
</relocations>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</plugin>
Just add something like the following to the <dependencies> block of your POM:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>19.0</version>
</dependency>
(or any version you prefer, 16.0.1 or later)