java.lang.NoSuchMethodError Jackson databind and Spark

use*_*071 8 json scala jackson apache-spark

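I'm trying to run spark-submit with Spark 1.1.0 and Jackson 2.4.4. I have Scala code that uses Jackson to deserialize JSON into case classes. That works fine on its own, but when I use it with Spark I get the following error: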

15/05/01 17:50:11 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 2)
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.introspect.POJOPropertyBuilder.addField(Lcom/fasterxml/jackson/databind/introspect/AnnotatedField;Lcom/fasterxml/jackson/databind/PropertyName;ZZZ)V
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector.com$fasterxml$jackson$module$scala$introspect$ScalaPropertiesCollector$$_addField(ScalaPropertiesCollector.scala:109)
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector$$anonfun$_addFields$2$$anonfun$apply$11.apply(ScalaPropertiesCollector.scala:100)
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector$$anonfun$_addFields$2$$anonfun$apply$11.apply(ScalaPropertiesCollector.scala:99)
    at scala.Option.foreach(Option.scala:236)
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector$$anonfun$_addFields$2.apply(ScalaPropertiesCollector.scala:99)
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector$$anonfun$_addFields$2.apply(ScalaPropertiesCollector.scala:93)
    at scala.collection.GenTraversableViewLike$Filtered$$anonfun$foreach$4.apply(GenTraversableViewLike.scala:109)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at scala.collection.SeqLike$$anon$2.foreach(SeqLike.scala:635)
    at scala.collection.GenTraversableViewLike$Filtered$class.foreach(GenTraversableViewLike.scala:108)
    at scala.collection.SeqViewLike$$anon$5.foreach(SeqViewLike.scala:80)
    at com.fasterxml.jackson.module.scala.introspect.ScalaPropertiesCollector._addFields(ScalaPropertiesCollector.scala:93)

Here is my build.sbt:

//scalaVersion in ThisBuild := "2.11.4"
scalaVersion in ThisBuild := "2.10.5"

retrieveManaged := true

libraryDependencies += "org.scala-lang" % "scala-reflect" % scalaVersion.value

libraryDependencies ++= Seq(
  "junit" % "junit" % "4.12" % "test",
  "org.scalatest" %% "scalatest" % "2.2.4" % "test",
  "org.mockito" % "mockito-core" % "1.9.5",
  "org.specs2" %% "specs2" % "2.1.1" % "test",
  "org.scalatest" %% "scalatest" % "2.2.4" % "test"
)

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-core" % "0.20.2",
  "org.apache.hbase" % "hbase" % "0.94.6"
)

//libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.3.0"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"


libraryDependencies += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.4.4"
//libraryDependencies += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.3.1"
//libraryDependencies += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.5.0"

libraryDependencies += "com.typesafe" % "config" % "1.2.1"

resolvers += Resolver.mavenLocal

As you can see, I have tried several different versions of Jackson.

Here is the shell script I use to run spark-submit:

#!/bin/bash
sbt package

CLASS=com.org.test.spark.test.SparkTest

SPARKDIR=/Users/user/Desktop/
#SPARKVERSION=1.3.0
SPARKVERSION=1.1.0
SPARK="$SPARKDIR/spark-$SPARKVERSION/bin/spark-submit"

jar_jackson=/Users/user/scala_projects/lib_managed/bundles/com.fasterxml.jackson.module/jackson-module-scala_2.10/jackson-module-scala_2.10-2.4.4.jar

"$SPARK" \
  --class "$CLASS" \
  --jars $jar_jackson \
  --master local[4] \
  /Users/user/scala_projects/target/scala-2.10/spark_project_2.10-0.1-SNAPSHOT.jar \
  print /Users/user/test.json

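I pass the path to the Jackson jar to the spark-submit command with --jars. I have even tried different versions of Spark. I also specified the paths to the other Jackson jars (databind, annotations, etc.), but that did not resolve the issue. Any help would be appreciated. Thank you.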

小智 6

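I had the same problem: my play-json jar was using Jackson 2.3.2, while Spark was using Jackson 2.4.4.
When I ran the Spark application, it could not find the method in jackson-2.3.2, and I got the same exception.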

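I checked the Maven dependency hierarchy for Jackson. It shows which version is picked and which jar brings it in (here Play was pulling in 2.3.2). Because my play-json dependency was declared first in the dependency list, version 2.3.2 was the one selected.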
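The dependency hierarchy shows what the build resolves. To confirm which Jackson jar actually wins at runtime, you can print where the class was loaded from. This is only a minimal sketch: `WhichJar` and `locationOf` are hypothetical names, not something from the original post, and it uses nothing beyond the JDK.

```scala
// Hypothetical diagnostic helper (not part of the original post).
object WhichJar {
  /** Returns the location (usually a jar URL) the named class was loaded from.
    * Throws ClassNotFoundException if the class is not on the classpath. */
  def locationOf(className: String): String = {
    val cls = Class.forName(className)
    // getCodeSource is null for classes loaded by the bootstrap class loader.
    Option(cls.getProtectionDomain.getCodeSource)
      .flatMap(src => Option(src.getLocation))
      .map(_.toString)
      .getOrElse("(bootstrap class loader or unknown source)")
  }

  def main(args: Array[String]): Unit = {
    // The class named in the stack trace lives in jackson-databind.
    println(locationOf("com.fasterxml.jackson.databind.introspect.POJOPropertyBuilder"))
  }
}
```

Calling `WhichJar.locationOf(...)` from the job you pass to spark-submit should show whether the class comes from the jar you expect or from a different one that sits earlier on the classpath.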

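So I moved the Play dependency to the end of the dependency list, after the Spark dependency, and it worked fine. This time version 2.4.4 was selected and 2.3.2 was omitted.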

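Source:

Note that if two dependency versions are at the same depth in the dependency tree, until Maven 2.0.8 it was not defined which one would win, but since Maven 2.0.9 it's the order in the declaration that counts: the first declaration wins.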
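
The rule quoted above is Maven's. The question uses sbt, whose Ivy-based resolution picks the latest revision by default rather than the first declaration, so reordering dependencies may not have the same effect there. A rough sbt equivalent is to pin the Jackson version explicitly. The sketch below assumes 2.4.4 is the version you want and that these three artifacts are the ones in conflict; neither assumption is confirmed by the post.

```scala
// build.sbt (sketch): force a single Jackson version for all transitive dependencies.
// The version number and the list of artifacts are assumptions; adjust them to
// whatever your own dependency tree shows.
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.4.4"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-annotations" % "2.4.4"
```

This only controls what sbt resolves for your project. Jars that are already on Spark's own classpath can still take precedence when the job runs under spark-submit.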


che*_*gpu 0

I think the main reason is that you did not specify the correct dependencies.

If you use third-party libraries and then submit to Spark directly, a better approach is to use sbt-assembly (https://github.com/sbt/sbt-assemble).
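
A minimal sketch of what that could look like, assuming sbt-assembly 0.14.x (the plugin version is an assumption). Marking Spark as `provided` keeps it out of the fat jar, and the shade rule renames the Jackson packages bundled in your jar so they cannot clash with whatever Jackson version Spark loads. The target package name `shaded.jackson` is hypothetical, and the post does not confirm that shading resolves this particular error.

```scala
// project/plugins.sbt (sketch; plugin version is an assumption)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt (sketch)
// Spark is supplied by spark-submit at runtime, so keep it out of the assembly.
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0" % "provided"

libraryDependencies += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.4.4"

// Rename the bundled Jackson classes ("shaded.jackson" is a hypothetical name)
// so they do not collide with the Jackson classes on Spark's classpath.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.jackson.@1").inAll
)
```

With this in place you would build with `sbt assembly` and pass the resulting assembly jar to spark-submit, instead of using `sbt package` plus `--jars`.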