Aka*_*thi 12 scala apache-spark spark-dataframe
I have a DataFrame with 3 columns: Id, First Name, and Last Name.
I want to apply GroupBy on Id and collect the First Name and Last Name columns as lists.
Example: I have a DataFrame like this
+---+-------+--------+
|id |fName  |lName   |
+---+-------+--------+
|1  |Akash  |Sethi   |
|2  |Kunal  |Kapoor  |
|3  |Rishabh|Verma   |
|2  |Sonu   |Mehrotra|
+---+-------+--------+
I want my output to look like this
+---+-------------+------------------+
|id |fName        |lName             |
+---+-------------+------------------+
|1  |[Akash]      |[Sethi]           |
|2  |[Kunal, Sonu]|[Kapoor, Mehrotra]|
|3  |[Rishabh]    |[Verma]           |
+---+-------------+------------------+
Thanks in advance.
him*_*ian 12
You can aggregate multiple columns in a single agg call, like this:
import org.apache.spark.sql.functions.collect_list
df.groupBy("id").agg(collect_list("fName").as("fName"), collect_list("lName").as("lName"))
This will give you the expected result.
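For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same aggregation. The object name, the `groupNames` helper, the app name, and the `local[*]` master are assumptions made for illustration, not part of the original question or answer. Note that Spark does not guarantee the order of elements inside a `collect_list` result, so `[Sonu, Kunal]` is as valid as `[Kunal, Sonu]` for id 2.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.collect_list

// Hypothetical wrapper object, used only to make the sketch runnable.
object CollectListExample {

  // Group by id and collect both name columns into arrays.
  // The aliases keep the original column names instead of
  // the default "collect_list(fName)" style names.
  def groupNames(df: DataFrame): DataFrame =
    df.groupBy("id").agg(
      collect_list("fName").as("fName"),
      collect_list("lName").as("lName")
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation; master and app name are assumptions.
    val spark = SparkSession.builder()
      .appName("collect-list-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the question.
    val df = Seq(
      (1, "Akash", "Sethi"),
      (2, "Kunal", "Kapoor"),
      (3, "Rishabh", "Verma"),
      (2, "Sonu", "Mehrotra")
    ).toDF("id", "fName", "lName")

    // Sort by id only to make the printed table easier to read.
    groupNames(df).orderBy("id").show(false)

    spark.stop()
  }
}
```

If the row order inside each list matters, sort the arrays afterwards or collect a struct that carries an explicit ordering key, since the aggregation itself makes no promise about it.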