I get the error below when I try to run a Spark streaming application from IntelliJ IDEA.

Environment: Spark core version 2.2.0, IntelliJ IDEA 2017.3.5.

Additional information: Spark is running in YARN mode.

The error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Exception in thread "main" java.lang.ExceptionInInitializerError
at kafka_stream.kafka_stream.main(kafka_stream.scala)
Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:376)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at kafka_stream.InitSpark$class.$init$(InitSpark.scala:15)
at kafka_stream.kafka_stream$.<init>(kafka_stream.scala:6)
at kafka_stream.kafka_stream$.<clinit>(kafka_stream.scala)
... 1 more
Process finished with exit code 1
I tried this:
val spark: SparkSession = SparkSession.builder()
.appName("SparkStructStream")
.master("spark://127.0.0.1:7077")
//.master("local[*]")
.getOrCreate()
but I still get the same "master URL" error.

Contents of the build.sbt file:
name := "KafkaSpark"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.0",
"org.apache.spark" % "spark-sql_2.11" % "2.2.0",
"org.apache.spark" % "spark-streaming_2.11" % "2.2.0",
"org.apache.spark" % "spark-streaming-kafka_2.11" % "1.6.3"
)
// https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.11
libraryDependencies += "org.apache.kafka" % "kafka_2.11" % "0.11.0.0"
// https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.11.0.0"
// https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams
libraryDependencies += "org.apache.kafka" % "kafka-streams" % "0.11.0.0"
// https://mvnrepository.com/artifact/org.apache.kafka/connect-api
libraryDependencies += "org.apache.kafka" % "connect-api" % "0.11.0.0"
libraryDependencies += "com.databricks" %% "spark-avro" % "4.0.0"
resolvers += Resolver.mavenLocal
resolvers += "central maven" at "https://repo1.maven.org/maven2/"
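As a side note on the dependencies above: `spark-streaming-kafka_2.11 % 1.6.3` targets the old Spark 1.x streaming API and does not match `spark-core 2.2.0`. A sketch of a build.sbt fragment with Kafka connectors that do match Spark 2.2.0 (the artifact names are the standard ones published by Apache Spark; pick the one for the API you actually use):

```scala
// Kafka connectors built for Spark 2.2.0 (replaces spark-streaming-kafka 1.6.3)
libraryDependencies ++= Seq(
  // DStream-based Spark Streaming + Kafka
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.0",
  // Structured Streaming + Kafka (spark.readStream.format("kafka"))
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.2.0"
)
```

Given the app name "SparkStructStream", the Structured Streaming connector (`spark-sql-kafka-0-10`) is probably the one needed.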
Any help with this would be appreciated.
It looks like the master setting is somehow not being passed through; for example, Spark may be initialized earlier, before your builder runs. You can try the VM option -Dspark.master=local[*], which supplies the master everywhere it is left undefined, so it should fix your problem. In IntelliJ it is under the run configuration list -> Edit Configurations... -> VM Options.
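The reason the VM option works is Spark's resolution order: a master set explicitly on the builder/SparkConf wins, otherwise Spark falls back to the `spark.master` JVM system property (which `-Dspark.master=local[*]` sets). A minimal sketch of that fallback logic, with a hypothetical helper `resolveMaster` (not part of Spark's API) standing in for what the builder does:

```scala
// Hypothetical illustration of Spark's master-resolution fallback:
// an explicit .master(...) value wins; otherwise the JVM system
// property "spark.master" (set via -Dspark.master=...) is used.
object MasterResolution {
  def resolveMaster(codeMaster: Option[String]): Option[String] =
    codeMaster.orElse(sys.props.get("spark.master"))

  def main(args: Array[String]): Unit = {
    // Simulate passing -Dspark.master=local[*] as a VM option
    sys.props("spark.master") = "local[*]"

    // No .master(...) in code: the system property fills the gap
    println(resolveMaster(None))

    // .master("spark://127.0.0.1:7077") in code: it takes precedence
    println(resolveMaster(Some("spark://127.0.0.1:7077")))
  }
}
```

If `resolveMaster` returned `None` here, that is the situation in which the real builder throws "A master URL must be set in your configuration".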
Viewed: 6,599 times