How to convert JSON to a PySpark DataFrame (faster implementation)

Rav*_*krn 0 json pyspark spark-dataframe pyspark-sql

I have JSON data of the form {'abc':1,'def':2,'ghi':3}. How can I convert it to a PySpark DataFrame in Python?

Nay*_*rma 8

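import json
from pyspark.sql import SparkSession

# In the pyspark shell, `spark` and `sc` already exist; getOrCreate() reuses them.
spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext
j = {'abc': 1, 'def': 2, 'ghi': 3}
# spark.read.json accepts an RDD of JSON strings, one document per element.
jsonRDD = sc.parallelize([json.dumps(j)])
df = spark.read.json(jsonRDD)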

>>> df.show()
+---+---+---+
|abc|def|ghi|
+---+---+---+
|  1|  2|  3|
+---+---+---+
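The same approach should extend to many records: serialize each dict to its own JSON string, then parallelize the whole list so that Spark infers one row per element. The following is only a minimal sketch. The helper names `to_json_lines` and `records_to_df` are hypothetical (they are not part of PySpark), and it assumes an existing SparkSession is passed in as `spark`.

```python
import json


def to_json_lines(records):
    """Serialize each dict to one JSON string (one document per element)."""
    return [json.dumps(record) for record in records]


def records_to_df(spark, records):
    """Build a DataFrame from a list of dicts.

    Hypothetical helper: `spark` is assumed to be an existing SparkSession.
    """
    # Each string in the RDD is parsed as a separate JSON document (one row).
    json_rdd = spark.sparkContext.parallelize(to_json_lines(records))
    return spark.read.json(json_rdd)
```

With a running session, `records_to_df(spark, [{'abc': 1, 'def': 2, 'ghi': 3}, {'abc': 4, 'def': 5, 'ghi': 6}]).show()` would be expected to print two rows with the columns abc, def and ghi.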