Why do I get the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?

Mil*_*avi 15 scala apache-spark apache-spark-dataset apache-spark-encoders

I have written a Spark job:

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new org.apache.spark.sql.SQLContext(sc)
    import ctx.implicits._

    case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)
    case class Person2(name: String, age: Long, city: String)

    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()
  }
}

When I run the main function in the IDE, two errors occur:

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

Error:(15, 67) not enough arguments for method as: (implicit evidence$1: org.apache.spark.sql.Encoder[Person])org.apache.spark.sql.Dataset[Person].
Unspecified value parameter evidence$1.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
                                                                  ^

But in the Spark Shell I can run this job without any error. What is the problem?

小智 33

The error message says that no Encoder could be found for the Person case class:

Error:(15, 67) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._  Support for serializing other types will be added in future releases.

Move the declarations of the case classes outside the scope of SimpleApp.
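A minimal sketch of that change, assuming the same Spark version and input file as in the question. The imports and the final sc.stop() call are additions that the original snippet did not show. The likely reason this helps is that the implicit encoder for case classes needs a TypeTag, which the compiler cannot provide for a class declared locally inside a method; treat that explanation as an assumption rather than something the answer states.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Case classes are declared at the top level, outside SimpleApp and its
// main method, so that ctx.implicits._ can supply an Encoder for them.
case class Person(age: Long, city: String, id: String, lname: String, name: String, sex: String)
case class Person2(name: String, age: Long, city: String)

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new SQLContext(sc)
    import ctx.implicits._

    // Same path as in the question; the file is assumed to exist.
    val persons = ctx.read.json("/tmp/persons.json").as[Person]
    persons.printSchema()

    // Added for the sketch: release the local Spark context when done.
    sc.stop()
  }
}
```

Only the position of the two case class declarations differs from the code in the question; the body of main is otherwise unchanged.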

  • Why would the scoping make any difference? I am getting this error while using the REPL. (14 upvotes)