Hadoop depends on two different versions of beanutils

SRo*_*mes 5 hadoop sbt sbt-assembly

Hadoop 2.4.0 depends on two different versions of beanutils, which causes the following error with sbt-assembly:

[error] (*:assembly) deduplicate: different file contents found in the following:
[error] .ivy2/cache/commons-beanutils/commons-beanutils/jars/commons-beanutils-1.7.0.jar:org/apache/commons/beanutils/BasicDynaBean.class
[error] .ivy2/cache/commons-beanutils/commons-beanutils-core/jars/commons-beanutils-core-1.8.0.jar:org/apache/commons/beanutils/BasicDynaBean.class

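Both dependencies are pulled in transitively by Hadoop 2.4.0, as I confirmed using the approach described in "How to access Ivy directly, i.e. access dependency reports or execute Ivy commands?"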
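For a quick check without opening the Ivy reports, a small helper task in build.sbt can list the beanutils jars that end up on the compile classpath. This is only a sketch: the task name `listBeanutilsJars` is hypothetical and not part of the original build, and it shows which jars are present, not which dependency pulled them in.

```scala
// Hypothetical helper task; the name listBeanutilsJars is an assumption, not from the original post.
val listBeanutilsJars = taskKey[Unit]("Print every beanutils jar on the compile classpath")

listBeanutilsJars := {
  // dependencyClasspath holds the resolved jars, including transitive ones.
  val jarNames = (dependencyClasspath in Compile).value.map(_.data.getName)
  jarNames.filter(_.contains("beanutils")).foreach(println)
}
```

Running `sbt listBeanutilsJars` should print both jars named in the error above if they are indeed resolved for the compile configuration.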

How can I build an assembly with sbt-assembly that includes Hadoop 2.4.0?

Update: as requested, here are the build.sbt dependencies:

libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"  % "provided" exclude("org.apache.hadoop", "hadoop-client")

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.7.8"

libraryDependencies += "commons-io" % "commons-io" % "2.4"

libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"

libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s" % "1.1.1.0"

The exclude for hadoop-client is required because, out of the box, Spark includes Hadoop 1, which conflicts with Hadoop 2.

小智 2

Try adding a merge strategy to build.sbt, like the following:

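// Matches any path that starts with META-INF (the dot in the pattern matches the hyphen).
val meta = """META.INF(.)*""".r

mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    // For these packages, keep the copy from the last jar on the classpath.
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
    // Covers org/apache/commons/beanutils, the path reported in the error.
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
    case PathList("plugin.properties") => MergeStrategy.last
    // Drop everything under META-INF.
    case meta(_) => MergeStrategy.discard
    // Fall back to the default strategy for all other paths.
    case x => old(x)
  }
}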
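An alternative to merging, sketched here under assumptions rather than taken from the original answer, is to keep only one of the two beanutils artifacts by excluding the other from hadoop-client. The line below would replace the existing hadoop-client dependency in build.sbt, not be added next to it. Which artifact is safe to drop is an assumption that needs testing against your own job: commons-beanutils-core is a reduced subset of commons-beanutils, so this variant keeps the full 1.7.0 jar and drops the 1.8.0 core jar.

```scala
// Assumption: the classes in commons-beanutils 1.7.0 are sufficient for your Hadoop usage.
// Swap the module name to "commons-beanutils" to keep commons-beanutils-core 1.8.0 instead.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.4.0" exclude("commons-beanutils", "commons-beanutils-core")
```

With the duplicate jar excluded, the broad `org/apache` rule in the merge strategy may no longer be needed for this particular conflict, although other overlapping jars could still require it.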