Aka*_*thi 12 scala apache-spark spark-dataframe
I have a DataFrame with 3 columns: Id, First Name, Last Name.
I want to apply GroupBy on Id and collect the First Name and Last Name columns as lists.
Example: I have a DF like this
+---+-------+--------+
|id |fName  |lName   |
+---+-------+--------+
|1  |Akash  |Sethi   |
|2  |Kunal  |Kapoor  |
|3  |Rishabh|Verma   |
|2  |Sonu   |Mehrotra|
+---+-------+--------+
I want my output to look like this
+---+-------------+------------------+
|id |fname        |lName             |
+---+-------------+------------------+
|1  |[Akash]      |[Sethi]           |
|2  |[Kunal, Sonu]|[Kapoor, Mehrotra]|
|3  |[Rishabh]    |[Verma]           |
+---+-------------+------------------+
Thanks in advance.
him*_*ian 12
You can aggregate multiple columns like this:
df.groupBy("id").agg(collect_list("fName"), collect_list("lName"))
This will give you the expected result.
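For anyone who wants to try this end to end, here is a minimal sketch, assuming a local SparkSession and the sample rows from the question. The object name `CollectNamesExample`, the helper `collectNames`, and the app name are made up for illustration. Without the `.as(...)` aliases, Spark names the result columns `collect_list(fName)` and `collect_list(lName)`, so the aliases are only there to match the column names asked for.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.collect_list

// Hypothetical wrapper object, used only for this illustration.
object CollectNamesExample {

  // Group by "id" and gather fName and lName into array columns.
  // The aliases replace the default names "collect_list(fName)" etc.
  def collectNames(df: DataFrame): DataFrame =
    df.groupBy("id")
      .agg(
        collect_list("fName").as("fName"),
        collect_list("lName").as("lName")
      )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for a quick test.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-list-example")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      (1, "Akash", "Sethi"),
      (2, "Kunal", "Kapoor"),
      (3, "Rishabh", "Verma"),
      (2, "Sonu", "Mehrotra")
    ).toDF("id", "fName", "lName")

    // orderBy only makes the printed rows easier to compare with the question.
    collectNames(df).orderBy("id").show(false)

    spark.stop()
  }
}
```

One caveat: `collect_list` keeps duplicates, and the order of elements inside each list is not guaranteed after a shuffle, so `[Kunal, Sonu]` may come back as `[Sonu, Kunal]`. If you want distinct values only, `collect_set` is the usual alternative.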