Spark: How to iterate over a Spark ArrayType as an expression

Mic*_*chM 1 scala apache-spark

I am building a recursive function.

def loop(path: String, dt: DataType, acc: Seq[String]): Seq[String] = {
  dt match {
    case s: ArrayType =>
      // This line fails to compile: ArrayType has no `fields` member.
      s.fields.flatMap(f => loop(path + "." + f.name, f.dataType, acc))
    case s: StructType =>
      s.fields.flatMap(f => loop(path + "." + f.name, f.dataType, acc))
    case other =>
      acc :+ path
  }
}

I get the error "error: value fields is not a member of org.apache.spark.sql.types.ArrayType". How can I iterate over each element of the ArrayType and return a flattened sequence of strings?

Mic*_*chM 5

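The trick is to use .elementType. An ArrayType has no fields of its own; it wraps a single element type, so the recursion should continue into that type.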

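import org.apache.spark.sql.types.{ArrayType, DataType, StructType}

// Collects the dotted path of every leaf field under dt.
def loop(path: String, dt: DataType, acc: Seq[String]): Seq[String] = {
  dt match {
    case s: ArrayType =>
      // An array adds no path segment; recurse into its element type.
      loop(path, s.elementType, acc)
    case s: StructType =>
      s.fields.toSeq.flatMap(f => loop(path + "." + f.name, f.dataType, acc))
    case other =>
      acc :+ path
  }
}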
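To see what the function returns, here is a minimal sketch that runs it against a small schema. The schema, its field names, the "root" prefix, and the LoopExample object are all made up for illustration; only loop follows the answer above. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.types._

object LoopExample {
  // Same logic as the answer: arrays are transparent, structs add a path segment.
  def loop(path: String, dt: DataType, acc: Seq[String]): Seq[String] = dt match {
    case a: ArrayType  => loop(path, a.elementType, acc)
    case s: StructType => s.fields.toSeq.flatMap(f => loop(path + "." + f.name, f.dataType, acc))
    case _             => acc :+ path
  }

  // Hypothetical schema: a scalar, an array of scalars, and an array of structs.
  val schema: StructType = StructType(Seq(
    StructField("id", LongType),
    StructField("tags", ArrayType(StringType)),
    StructField("items", ArrayType(StructType(Seq(
      StructField("sku", StringType),
      StructField("qty", IntegerType)
    ))))
  ))

  // "root" is an arbitrary starting prefix chosen for this example.
  val paths: Seq[String] = loop("root", schema, Seq.empty)

  def main(args: Array[String]): Unit =
    paths.foreach(println)
  // prints:
  // root.id
  // root.tags
  // root.items.sku
  // root.items.qty
}
```

Note that an array of scalars such as tags is reported as a single leaf, and any other type not matched explicitly (for example MapType) is also treated as a leaf, so its nested fields would not be expanded.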