Spark 1.5.2: org.apache.spark.sql.AnalysisException: unresolved operator 'Union;

Nee*_*eel 16 apache-spark

I have two DataFrames, df1 and df2. Both have the following schema:

 |-- ts: long (nullable = true)
 |-- id: integer (nullable = true)
 |-- managers: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- projects: array (nullable = true)
 |    |-- element: string (containsNull = true)

df1 is created from an Avro file, while df2 comes from an equivalent Parquet file. However, when I run df1.unionAll(df2).show(), I get the following error:

    org.apache.spark.sql.AnalysisException: unresolved operator 'Union;
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:174)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103)

Eve*_*eng 21

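I ran into the same situation. It turns out that not only do the fields need to be the same, they also have to appear in exactly the same order in both DataFrames for this to work.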

  • As well as the fields' data types (4 upvotes)
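
To make the answer and comment above concrete, here is a minimal sketch of a positional schema check. It is plain Python and does not need a Spark session: it assumes each schema is a list of (column name, type string) pairs, which is the shape PySpark's `DataFrame.dtypes` returns. The helper name `schema_mismatches` and the two sample schemas are hypothetical, made up for illustration; the sample reuses the fields from the question with two columns swapped.

```python
def schema_mismatches(left, right):
    """Compare two schemas position by position.

    Each schema is a list of (column_name, type_string) pairs.
    Returns a list of human-readable differences; an empty list
    means names and types line up at every position.
    """
    problems = []
    if len(left) != len(right):
        problems.append("column count differs: %d vs %d" % (len(left), len(right)))
    # zip stops at the shorter schema, so extra columns are only
    # reported through the count check above.
    for pos, ((lname, ltype), (rname, rtype)) in enumerate(zip(left, right)):
        if lname != rname:
            problems.append("position %d: name %r vs %r" % (pos, lname, rname))
        elif ltype != rtype:
            problems.append("position %d (%s): type %s vs %s" % (pos, lname, ltype, rtype))
    return problems


# Hypothetical schemas: same fields as in the question, different order.
avro_schema = [("ts", "bigint"), ("id", "int"),
               ("managers", "array<string>"), ("projects", "array<string>")]
parquet_schema = [("id", "int"), ("ts", "bigint"),
                  ("managers", "array<string>"), ("projects", "array<string>")]

for line in schema_mismatches(avro_schema, parquet_schema):
    print(line)
```

With real DataFrames in PySpark, one could pass `df1.dtypes` and `df2.dtypes` to a check like this. If only the column order differs, reordering one side with `df2.select(*df1.columns)` before calling `unionAll` should line the columns up; if a type differs, that column would need a cast first.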