Posts by dee*_*t99

How to resolve a Guava dependency conflict when submitting an uber jar to Google Dataproc

I am using the Maven Shade plugin to build an uber jar that I submit as a job to a Google Dataproc cluster. Google has installed Apache Spark 2.0.2 and Apache Hadoop 2.7.3 on its clusters.

Apache Spark 2.0.2 uses com.google.guava 14.0.1 and Apache Hadoop 2.7.3 uses 11.0.2, so both versions should already be on the classpath.

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.0.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <!--
                <artifactSet>
                    <includes>
                        <include>com.google.guava:guava:jar:19.0</include>
                    </includes>
                </artifactSet>
                -->
                <artifactSet>
                    <excludes>
                        <exclude>com.google.guava:guava:*</exclude>
                    </excludes>
                </artifactSet>
            </configuration>
        </execution>
    </executions>
</plugin>

When I include the Guava 16.0.1 jar through the Shade plugin, I get this exception:

Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {10.148.0.3}:9042
at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:163)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.CassandraConnector$$anonfun$3.apply(CassandraConnector.scala:149)
at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:31)
at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:56)
at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:82)
at com.datastax.spark.connector.cql.CassandraConnector.withSessionDo(CassandraConnector.scala:110)
at com.datastax.spark.connector.cql.CassandraConnector.withClusterDo(CassandraConnector.scala:121)
at com.datastax.spark.connector.cql.Schema$.fromCassandra(Schema.scala:322)
at com.datastax.spark.connector.cql.Schema$.tableFromCassandra(Schema.scala:342)
at com.datastax.spark.connector.rdd.CassandraTableRowReaderProvider$class.tableDef(CassandraTableRowReaderProvider.scala:50)
at …
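For reference, a common way to handle a clash between a bundled Guava and the versions already on the cluster classpath is to relocate Guava's packages inside the uber jar, rather than only including or excluding the artifact. The fragment below is a minimal sketch, not a confirmed fix for the exception above: the trace is truncated, so treating it as a Guava version mismatch is an assumption. The prefix `repackaged.com.google.common` is a hypothetical name chosen for illustration. Relocation only works if Guava is actually bundled, so this sketch drops the `<exclude>` entry.

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.0.0</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <relocations>
                    <!-- Rewrite Guava classes, and the references to them in the
                         bundled code, to a private package so they do not collide
                         with the Guava versions provided by Spark and Hadoop. -->
                    <relocation>
                        <pattern>com.google.common</pattern>
                        <!-- Hypothetical prefix; any unused package name works. -->
                        <shadedPattern>repackaged.com.google.common</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>
```

With this configuration, the classes in the uber jar that depend on the newer Guava would resolve the relocated copy, while Spark and Hadoop continue to use their own versions.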

hadoop apache-spark spark-cassandra-connector google-cloud-dataproc
