Spark 2.2.1: Incompatible Jackson version 2.8.8

Fob*_*obi 7 java eclipse scala maven apache-spark

My setup is:

  • Scala 2.11 (Scala IDE plugin)
  • Eclipse Neon.3 Release (4.6.3)
  • Windows 7 64-bit

I want to run this simple Scala code (Esempio.scala):

package it.scala

// import the Spark packages
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf


object Wordcount {
    def main(args: Array[String]): Unit = {

        val inputs: Array[String] = new Array[String](2)
        inputs(0) = "C:\\Users\\FobiDell\\Desktop\\input"
        inputs(1) = "C:\\Users\\FobiDell\\Desktop\\output"

        // SparkConf object used to set the application parameters,
        // which are then passed to the chosen cluster manager (Yarn, Mesos or Standalone).
        val conf = new SparkConf()
        conf.setAppName("Smartphone Addiction")
        conf.setMaster("local")

        // SparkContext object for the connection to the chosen cluster manager
        val sc = new SparkContext(conf)

        //Read file and create RDD
        val rawData = sc.textFile(inputs(0))

        //convert the lines into words using flatMap operation
        val words = rawData.flatMap(line => line.split(" "))

        //count the individual words using map and reduceByKey operation
        val wordCount = words.map(word => (word, 1)).reduceByKey(_ + _)

        //Save the result
        wordCount.saveAsTextFile(inputs(1))

        //stop the spark context
        sc.stop()

    }

}

With spark-shell everything works fine. From the Eclipse IDE, however, if I select the file (Esempio.scala) and run it via Run -> Run as -> Scala application, I get the following exception:

Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.SparkContext.withScope(SparkContext.scala:701)
    at org.apache.spark.SparkContext.textFile(SparkContext.scala:830)
    at it.scala.Wordcount$.main(Esempio.scala:47)
    at it.scala.Wordcount.main(Esempio.scala)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.8.8
    at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
    at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
    at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:745)
    at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
    at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
    ... 4 more  

My pom.xml file is:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>it.hgfhgf.xhgfghf</groupId>
  <artifactId>progetto</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>progetto</name>
  <url>http://maven.apache.org</url>

  <properties>
     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>

    <!-- Neo4j JDBC DRIVER -->
    <dependency>
      <groupId>org.neo4j</groupId>
      <artifactId>neo4j-jdbc-driver</artifactId>
      <version>3.1.0</version>
    </dependency>

    <!-- Scala -->
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.11</version>
    </dependency> 

    <!-- Spark -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.2.1</version>
    </dependency>


  </dependencies>


</project>

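I noticed that the Jackson .jar files in the spark-2.2.1-bin-hadoop2.7/jars directory are:

  • jackson-core-2.6.5.jar
  • jackson-databind-2.6.5.jar
  • jackson-module-paranamer-2.6.5.jar
  • jackson-module-scala_2.11-2.6.5.jar
  • jackson-annotations-2.6.5.jar

Can anyone explain to me in simple terms what this exception means and how it can be fixed?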

sub*_*odh 20

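Spark 2.x ships with Jackson 2.6.5, while neo4j-jdbc-driver uses Jackson 2.8.8, so there is a dependency conflict between two different versions of the Jackson library. That is why you get the Incompatible Jackson version: 2.8.8 error.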
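To confirm which Jackson jars actually end up on the classpath of the Eclipse run, something like the following could be run from the same project. It is only a minimal sketch: the object name JacksonClasspathCheck and the helper locate are made up for illustration, and the three class names are just one representative class from each of jackson-core, jackson-databind and jackson-module-scala.

```scala
import scala.util.Try

// Hypothetical helper object, not part of Spark or Jackson.
object JacksonClasspathCheck {

  // Returns the location (usually a .jar path) a class was loaded from,
  // or a short explanation when it cannot be determined.
  def locate(className: String): String =
    // initialize = false: load the class without running its static initializers
    Try(Class.forName(className, false, getClass.getClassLoader)).toOption match {
      case None => "not on classpath"
      case Some(cls) =>
        // getCodeSource can be null, e.g. for classes from the bootstrap class loader
        Option(cls.getProtectionDomain.getCodeSource)
          .flatMap(cs => Option(cs.getLocation))
          .map(_.toString)
          .getOrElse("unknown location")
    }

  def main(args: Array[String]): Unit = {
    Seq(
      "com.fasterxml.jackson.core.JsonFactory",                // jackson-core
      "com.fasterxml.jackson.databind.ObjectMapper",           // jackson-databind
      "com.fasterxml.jackson.module.scala.DefaultScalaModule"  // jackson-module-scala
    ).foreach(name => println(s"$name -> ${locate(name)}"))
  }
}
```

If the printed jar names show different Jackson versions for databind and the Scala module (for example 2.8.8 and 2.6.5), that mismatch is the conflict described above.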

Try overriding the dependency version of the following modules inside your pom.xml and see if that works:

  1. jackson-core
  2. jackson-databind
  3. jackson-module-scala_2.x
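One way such an override might look is a dependencyManagement section placed directly under <project>, next to <dependencies>. This is a sketch under an assumption: the version 2.6.5 is chosen here only because it matches the jars shipped with Spark 2.2.1, and whether neo4j-jdbc-driver still works with that older Jackson has not been checked.

```xml
<!-- Sketch: pins the transitive Jackson versions for the whole project. -->
<!-- Assumption: 2.6.5 is picked to match the Jackson jars bundled with Spark 2.2.1. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-core</artifactId>
      <version>2.6.5</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.core</groupId>
      <artifactId>jackson-databind</artifactId>
      <version>2.6.5</version>
    </dependency>
    <dependency>
      <groupId>com.fasterxml.jackson.module</groupId>
      <artifactId>jackson-module-scala_2.11</artifactId>
      <version>2.6.5</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```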

Or try adding the following dependency to your pom.xml:

        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_2.11</artifactId>
            <version>2.8.8</version>
        </dependency> 


小智 14

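Not sure whether this helps anyone who hits this problem in an sbt project that uses Scala 2.12. Adding jackson-module-scala_2.11 does not really work there. There is a jackson-module-scala 2.6.7 release that has a Scala 2.12 build.

The following lines in build.sbt worked: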

dependencyOverrides ++= {
  Seq(
    "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.7.1",
    "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.7",
    "com.fasterxml.jackson.core" % "jackson-core" % "2.6.7"
  )
}

This fixed the problem with Spark 2.4.5.