Spark Streaming Elasticsearch dependencies

Hed*_*edi 7 twitter streaming scala elasticsearch apache-spark

I am trying out the Spark and Elasticsearch integration in Scala, as described in the Elasticsearch guide.

When compiling, I run into dependency problems:

[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) sbt.ResolveException: unresolved dependency: cascading#cascading-local;2.5.6: not found
[error] unresolved dependency: clj-time#clj-time;0.4.1: not found
[error] unresolved dependency: compojure#compojure;1.1.3: not found
[error] unresolved dependency: hiccup#hiccup;0.3.6: not found
[error] unresolved dependency: ring#ring-devel;0.3.11: not found
[error] unresolved dependency: ring#ring-jetty-adapter;0.3.11: not found
[error] unresolved dependency: com.twitter#carbonite;1.4.0: not found
[error] unresolved dependency: cascading#cascading-hadoop;2.5.6: not found
[error] Total time: 86 s, completed 19 nov. 2014 08:42:58

My build.sbt file looks like this:

name := "twitter-sparkstreaming-elasticsearch"

version := "0.0.1"

scalaVersion := "2.10.4"

// additional libraries
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
  "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0"
)

Any help? Thanks.

Hed*_*edi 9

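Cascading and its dependencies are not available in Maven Central, only in their own repository (which es-hadoop cannot specify through its POM).

I solved the problem by using elasticsearch-spark_2.10 instead: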

http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/install.html
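
For reference, here is a minimal sketch of what that swap might look like in build.sbt. The version string is an assumption borrowed from the other answer's suggestion (2.1.0.Beta4), not something I verified for this setup; check the install page linked above for the release that matches yours. With scalaVersion 2.10.4, `%%` resolves the artifact name to elasticsearch-spark_2.10.

```scala
// build.sbt (sketch): use the Spark-specific artifact instead of elasticsearch-hadoop
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
  // %% appends the Scala binary version, giving elasticsearch-spark_2.10.
  // The version below is an assumption; use whichever release the install page lists.
  "org.elasticsearch" %% "elasticsearch-spark" % "2.1.0.Beta4"
)
```

In my case this avoided the unresolved Cascading dependencies without adding any extra resolvers.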


pba*_*mba 7

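sbt cannot resolve some of the dependencies because they are not hosted in the Maven Central repository. However, you can find them on clojars and conjars. You need to add the following lines so that sbt can resolve them: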

resolvers += "clojars" at "https://clojars.org/repo"
resolvers += "conjars" at "http://conjars.org/repo"

Also, the dependency elasticsearch-hadoop "2.1.0" does not exist (yet?); you should use "2.1.0.Beta4" (or whatever the latest version is when you read this).

Your sbt file should look like this:

name := "twitter-sparkstreaming-elasticsearch"

version := "0.0.1"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.1.0",
    "org.apache.spark" %% "spark-streaming" % "1.1.0",
    "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0",
    "org.elasticsearch" % "elasticsearch-hadoop" % "2.1.0.Beta4"
)

resolvers += "clojars" at "https://clojars.org/repo"
resolvers += "conjars" at "http://conjars.org/repo"

This has been tested (with spark-core 1.3.1 and without Spark Streaming, but it should work for you). Hope this helps.