I'm trying to use the spark-cassandra-connector via spark-shell on Dataproc, but I can't connect to my cluster. There appears to be a version mismatch: the classpath contains an older Guava version pulled in from somewhere else, even though I specified the proper version at startup. I suspect this is caused by all the Hadoop dependencies that are put onto the classpath by default.
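(For reference, a quick way to see which jar Guava is actually being loaded from is to ask the REPL for the class's code source; ImmutableList below is just an example, any Guava class present in both versions would do:

scala> classOf[com.google.common.collect.ImmutableList[_]].getProtectionDomain.getCodeSource.getLocation

On my setup this points at the Spark/Hadoop assembly jar rather than the Guava 16.0.1 jar resolved by --packages.)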
Is there any way to get spark-shell to use only the proper version of Guava, without getting rid of all the Hadoop-related jars that Dataproc includes?
相关数据:
Starting spark-shell, showing that it resolves the proper version of Guava:

$ spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0-M3
:: loading settings :: url = jar:file:/usr/lib/spark/lib/spark-assembly-1.5.2-hadoop2.7.1.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.datastax.spark#spark-cassandra-connector_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
found com.datastax.spark#spark-cassandra-connector_2.10;1.5.0-M3 in central
found org.apache.cassandra#cassandra-clientutil;2.2.2 in central
found com.datastax.cassandra#cassandra-driver-core;3.0.0-alpha4 in central
found io.netty#netty-handler;4.0.27.Final in central
found io.netty#netty-buffer;4.0.27.Final in central
found io.netty#netty-common;4.0.27.Final in central
found io.netty#netty-transport;4.0.27.Final in central
found io.netty#netty-codec;4.0.27.Final in central
found com.codahale.metrics#metrics-core;3.0.2 in central
found org.slf4j#slf4j-api;1.7.5 in central
found org.apache.commons#commons-lang3;3.3.2 in central
found com.google.guava#guava;16.0.1 in central
found org.joda#joda-convert;1.2 in central
found …
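(One workaround I've considered, but haven't verified on Dataproc, is Spark's experimental user-classpath-first settings, which tell the driver and executors to prefer jars from --packages over the assembly's classes. The invocation would be:

$ spark-shell --conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true --packages com.datastax.spark:spark-cassandra-connector_2.10:1.5.0-M3

I'd rather find a cleaner fix, since these flags can break other dependencies.)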
apache-spark spark-cassandra-connector google-cloud-dataproc