
What is going wrong with `unionAll` on a Spark `DataFrame`?

Using Spark 1.5.0 and given the following code, I expect `unionAll` to union the DataFrames based on their column names. In the code, I use a FunSuite helper that passes in the SparkContext `sc`:

object Entities {

  case class A (a: Int, b: Int)
  case class B (b: Int, a: Int)

  val as = Seq(
    A(1,3),
    A(2,4)
  )

  val bs = Seq(
    B(5,3),
    B(6,4)
  )
}

class UnsortedTestSuite extends SparkFunSuite {

  configuredUnitTest("The truth test.") { sc =>
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._
    val aDF = sc.parallelize(Entities.as, 4).toDF
    val bDF = sc.parallelize(Entities.bs, 4).toDF
    aDF.show()
    bDF.show()
    aDF.unionAll(bDF).show
  }
}

Output:

+---+---+
|  a|  b|
+---+---+
|  1|  3| …
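For context, `unionAll` (like SQL's `UNION ALL`) resolves columns by position, not by name, so `B`'s columns `(b, a)` are stacked directly under `A`'s `(a, b)`. A minimal sketch of a workaround on Spark 1.5.0 is to reorder the second DataFrame's columns to match the first before the union (using the `aDF`/`bDF` names from the test above):

```scala
import org.apache.spark.sql.functions.col

// unionAll matches columns by position, so select bDF's columns
// in aDF's schema order (a, b) before unioning.
val bReordered = bDF.select(aDF.columns.map(col): _*)
aDF.unionAll(bReordered).show()
```

(Later Spark versions add `unionByName`, which does this alignment by column name for you, but it is not available in 1.5.0.)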

scala dataframe apache-spark apache-spark-sql

20 votes · 1 answer · 30k views
