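I want to create a DataFrame with a specified schema in Scala. I have tried reading JSON (that is, reading an empty file), but I don't think that is the best approach.
The first DataFrame, Df, is: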
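One common way to do this without reading an empty file is to pair an empty `RDD[Row]` with an explicit `StructType`. The following is a minimal sketch, not a definitive answer: the object name `EmptyDataFrameExample`, the helper `emptyWithSchema`, and the two example columns are assumptions made for illustration, and it assumes a local `SparkSession`.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object EmptyDataFrameExample {
  // Hypothetical schema used only for illustration; replace with your own fields.
  val schema: StructType = StructType(Seq(
    StructField("ID", IntegerType, nullable = true),
    StructField("Name", StringType, nullable = true)
  ))

  // Builds a DataFrame with zero rows and the schema above.
  def emptyWithSchema(spark: SparkSession): DataFrame =
    spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is acceptable for this demonstration.
    val spark = SparkSession.builder()
      .appName("empty-df-sketch")
      .master("local[*]")
      .getOrCreate()

    val emptyDf = emptyWithSchema(spark)
    emptyDf.printSchema()
    println(s"rows: ${emptyDf.count()}")

    spark.stop()
  }
}
```

Because the schema is stated up front, nothing is inferred from data, so the column types are exactly the ones declared.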
ID Name ID2 Marks
1 12 1 333
The second DataFrame, Df2, is:
ID Name ID2 Marks
1 3 989
7 98 8 878
The output I need is:
ID Name ID2 Marks
1 12 1 333
1 3 989
7 98 8 878
Please help!
Here is my union code:
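When both DataFrames have the same columns in the same order, appending the rows of one to the other is what `union` does. The sketch below is built on assumptions: the object name `UnionExample` and the helper `build` are hypothetical, the columns are modelled as integers, and the blank cell in the first row of Df2 is assumed to be `Name`, since the post does not show which column is empty.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object UnionExample {
  // Returns the rows of Df followed by the rows of Df2.
  def build(spark: SparkSession): DataFrame = {
    import spark.implicits._

    val cols = Seq("ID", "Name", "ID2", "Marks")

    // Option[Int] lets the Name column hold a null.
    val df = Seq((1, Option(12), 1, 333)).toDF(cols: _*)

    // Assumption: the missing value in the first row belongs to Name.
    val df2 = Seq(
      (1, Option.empty[Int], 3, 989),
      (7, Option(98), 8, 878)
    ).toDF(cols: _*)

    // union matches columns by position and keeps duplicate rows.
    df.union(df2)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("union-sketch")
      .master("local[*]")
      .getOrCreate()

    build(spark).show()
    spark.stop()
  }
}
```

Note that `union` does not deduplicate; if duplicates should be dropped, calling `distinct()` on the result is one option.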
val dfToSave=dfMainOutput.union(insertdf.select(dfMainOutput).withColumn("FFAction", when($"FFAction" === "O" || $"FFAction" === "I", lit("I|!|")))
When I perform the union, I get the following error:
org.apache.spark.sql.AnalysisException: Union can only be performed on tables with the compatible column types. string <> boolean at the 11th column of the second table;;
'Union
Here are the schemas of the two DataFrames:
insertdf.printSchema()
root
|-- OrganizationID: long (nullable = true)
|-- SourceID: integer (nullable = true)
|-- AuditorID: integer (nullable = true)
|-- AuditorOpinionCode: string (nullable = true)
|-- AuditorOpinionOnInternalControlCode: string (nullable = true)
|-- AuditorOpinionOnGoingConcernCode: string (nullable = true)
|-- IsPlayingAuditorRole: boolean (nullable …
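
The "string <> boolean at the 11th column" message is consistent with `union` matching columns by position rather than by name, so two DataFrames with the same columns in a different order can fail in this way. The full schemas are truncated in the post, so the following is only a sketch on small made-up DataFrames: the object `AlignedUnionExample`, the helpers `alignedUnion` and `demo`, and the three columns are all hypothetical. It reorders the second DataFrame to the first one's column order before the union.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object AlignedUnionExample {
  // Selects right's columns in left's order, then unions by position.
  // Assumes every column of `left` also exists in `right`.
  def alignedUnion(left: DataFrame, right: DataFrame): DataFrame =
    left.union(right.select(left.columns.map(col): _*))

  // Builds two hypothetical DataFrames with the same columns in a different order.
  def demo(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val left = Seq((1, "A", true)).toDF("id", "code", "flag")
    // Same column names, but flag and code are swapped.
    val right = Seq((2, false, "B")).toDF("id", "flag", "code")
    // A plain left.union(right) would compare code (string) with flag (boolean).
    alignedUnion(left, right)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("aligned-union-sketch")
      .master("local[*]")
      .getOrCreate()

    demo(spark).show()
    spark.stop()
  }
}
```

On Spark 2.3 or later, `left.unionByName(right)` should give the same result by resolving columns by name. If the column orders already match, the mismatch would have to come from the types themselves, and a cast on the offending column would be needed instead.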