Zha*_*ong 4 apache-spark apache-spark-sql pyspark spark-dataframe
I have a very large DataFrame with this schema:
root
 |-- id: string (nullable = true)
 |-- ext: array (nullable = true)
 |    |-- element: integer (containsNull = true)
So far I have tried exploding the data and then applying collect_list:
select
id,
collect_list(cast(item as string))
from default.dual
lateral view explode(ext) t as item
group by
id
But this approach is too heavy.
Sil*_*vio 10
You can simply cast the ext column to an array of strings:
df = source.withColumn("ext", source.ext.cast("array<string>"))
df.printSchema()
df.show()