I have a multiLine JSON file and I'm using Spark's read.json to read it. The problem is that it only reads the first object from that JSON file.
val dataFrame = spark.read.option("multiLine", true).option("mode", "PERMISSIVE").json(path)
dataFrame.rdd.saveAsTextFile("DataFrame")
Sample JSON:
{
"_id" : "589895e123c572923e69f5e7",
"thing" : "54eb45beb5f1e061454c5bf4",
"timeline" : [
{
"reason" : "TRIP_START",
"timestamp" : "2017-02-06T17:20:18.007+02:00",
"type" : "TRIP_EVENT",
"location" : [
11.1174091,
69.1174091
],
"endLocation" : [],
"startLocation" : []
},
{
"reason" : "TRIP_END",
"timestamp" : "2017-02-06T17:25:26.026+02:00",
"type" : "TRIP_EVENT",
"location" : [
11.5691428,
48.1122443
],
"endLocation" : [],
"startLocation" : []
}
],
"__v" : 0
}
{ …