Yas*_*sir 3 scala sbt apache-spark graphframes
I have the following SBT file. I am using Apache GraphFrames, compiling Scala code that reads CSV files.
name := "Simple"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.6.1",
"graphframes" % "graphframes" % "0.2.0-spark1.6-s_2.10",
"org.apache.spark" %% "spark-sql" % "1.0.0",
"com.databricks" % "spark-csv" % "1.0.3"
)
Here is my Scala code:
import org.graphframes._
import org.apache.spark.sql.DataFrame
val nodesList = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("/Users/Desktop/GraphFrame/NodesList.csv")
val edgesList= sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("/Users/Desktop/GraphFrame/EdgesList.csv")
val v=nodesList.toDF("id", "name")
val e=edgesList.toDF("src", "dst", "dist")
val g = GraphFrame(v, e)
When I try to build a JAR file with SBT, it gives the following error during compilation:
[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) sbt.ResolveException: unresolved dependency: graphframes#graphframes;0.2.0-spark1.6-s_2.10: not found
[error] Total time:
GraphFrames is not yet in the Maven Central repository.
You can add the Spark Packages repository as a resolver. Code in build.sbt:
resolvers += Resolver.url("SparkPackages", url("https://dl.bintray.com/spark-packages/maven/"))
For some reason, the Resolver.url mentioned in Gawęda's answer did not work for me. The following worked:
resolvers += "SparkPackages" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "graphframes" % "graphframes" % "0.7.0-spark2.4-s_2.11"
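Putting the answers together for the Spark 1.6 / Scala 2.10 setup in the question, a build.sbt might look like the sketch below. Several parts are assumptions rather than anything stated in the thread. spark-sql is aligned to 1.6.1 so it matches spark-core. spark-csv is declared with `%%` so that sbt appends the `_2.10` suffix. The comment on why `Resolver.url` may fail reflects my understanding that it defaults to Ivy-style patterns, while `at` declares a Maven-layout repository. The Bintray URL is the one given in the answers; Bintray was retired in 2021, and as far as I know the Spark Packages repository is now served from https://repos.spark-packages.org/, so treat that replacement URL as something to verify.

```scala
// build.sbt: a minimal sketch, not a verified build
name := "Simple"

version := "1.0"

scalaVersion := "2.10.5"

// "at" declares a Maven-layout repository. Resolver.url(...) is assumed to use
// Ivy-style patterns by default, which may be why it fails against this repo.
// Assumption: if this Bintray URL no longer resolves, try
// "https://repos.spark-packages.org/" instead.
resolvers += "SparkPackages" at "https://dl.bintray.com/spark-packages/maven"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  // Assumption: keep spark-sql on the same version as spark-core.
  "org.apache.spark" %% "spark-sql" % "1.6.1",
  // Assumption: %% so the artifact resolves as spark-csv_2.10.
  "com.databricks" %% "spark-csv" % "1.0.3",
  // The Scala version is already part of the GraphFrames version string,
  // so a single % is used here.
  "graphframes" % "graphframes" % "0.2.0-spark1.6-s_2.10"
)
```

After editing build.sbt, run `sbt update` (or `reload` followed by `update` in an open sbt shell) to check that the GraphFrames artifact resolves before running `sbt package`.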