I can run the Spark shell with bin/spark-shell --packages com.databricks:spark-xml_2.11:0.3.0
to analyze XML files, for example:
import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
val df = sqlContext.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "book")
  .load("books.xml")
But how can I do the same in Zeppelin? Does Zeppelin need some startup parameter in order to import com.databricks.spark.xml? Right now I get:
java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.xml
  at scala.sys.package$.error(package.scala:27)
  at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:220)
  at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:233)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:104)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:26)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
  at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:33)
  at …
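For context, my guess (based on the Zeppelin docs, not verified) is that the package has to be passed to the Spark interpreter when Zeppelin starts, e.g. via SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh, since the --packages flag I use with spark-shell is a spark-submit option:

```shell
# conf/zeppelin-env.sh -- assumption: Zeppelin forwards these options
# to spark-submit when it launches the Spark interpreter.
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-xml_2.11:0.3.0"
```

After editing this file I would restart the Zeppelin daemon so the interpreter picks it up. Is this the intended mechanism, or is there a notebook-level way (e.g. the %dep interpreter) to load the artifact?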