Fxs*_*576 7 python json google-bigquery google-cloud-platform
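My goal is to convert a JSON file into a format that can be uploaded from Cloud Storage to BigQuery using Python (as described here).
I tried using the newlineJSON package for the conversion, but I get the following error.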
JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)
Does anyone have a solution?
Here is the sample JSON:
[{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
}
]
Here is the existing Python script:
import newlinejson as nlj  # the newlineJSON package

with nlj.open(url_samplejson, json_lib="simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)
Ole*_*nko 10
The jq answer is very useful, but if you still want to do this in Python (as the question suggests), you can use the built-in json module.
import json
from io import StringIO
in_json = StringIO("""[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]""")
result = [json.dumps(record) for record in json.load(in_json)] # the only significant line to convert the JSON to the desired format
print('\n'.join(result))
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
*I use StringIO and print here only to make the sample easier to test locally.
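With real files in place of StringIO, the same conversion can be wrapped in a small helper. The following is only a minimal sketch: the function name json_array_to_ndjson and its file-like arguments are assumptions made for illustration, not part of the question or of any library.

```python
import json
from io import StringIO


def json_array_to_ndjson(src, dst):
    """Read one JSON array from src and write one JSON value per line to dst.

    src and dst are text-mode file-like objects (hypothetical helper, not a library API).
    Returns the number of records written.
    """
    records = json.load(src)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for record in records:
        # Without indent, json.dumps escapes newlines, so each record stays on one line
        dst.write(json.dumps(record) + "\n")
    return len(records)


# Local demonstration with in-memory buffers; swap in open(...) for real files.
src = StringIO('[{"key01": "value01"}, {"key01": "value02"}]')
dst = StringIO()
count = json_array_to_ndjson(src, dst)
print(dst.getvalue(), end="")
```

Passing file objects rather than paths keeps the helper easy to test locally and leaves the choice of storage to the caller.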
As an alternative, you can use the Python jq bindings to combine this approach with the other answer.
If you are open to using a tool outside Python, use jq:
$ cat a.json
[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]
$ cat a.json | jq -c '.[]'
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
{"key01":"value01","key02":"value02","keyN":"valueN"}
The '.[]' filter iterates over the elements of the array, and -c puts each JSON object on a single line.
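For comparison, the built-in json module can produce the same compact form for ASCII data like this sample by passing separators to json.dumps. This is a small sketch with the sample data inlined (an assumption, to keep it testable locally):

```python
import json

# Sample records inlined for illustration; in practice they would come from a file.
data = json.loads("""[
  {"key01": "value01", "key02": "value02", "keyN": "valueN"},
  {"key01": "value01", "key02": "value02", "keyN": "valueN"}
]""")

# separators=(",", ":") drops the spaces json.dumps inserts by default
lines = [json.dumps(record, separators=(",", ":")) for record in data]
print("\n".join(lines))
```

The spaces are not significant to a JSON parser, so either spacing yields valid newline-delimited JSON; the compact form only saves bytes.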
This takes a JSON file and converts it to an ND-JSON (newline-delimited JSON) file.
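import json

# Load the whole JSON array into memory
with open("results-20190312-113458.json", "r") as read_file:
    data = json.load(read_file)

# Serialize each record to a single-line JSON string
result = [json.dumps(record) for record in data]

# Write one record per line (newline-delimited JSON)
with open('nd-proceesed.json', 'w') as obj:
    for i in result:
        obj.write(i+'\n')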
Hope this helps someone.
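One way to sanity-check a converted file before uploading it is to read it back and parse every line on its own. This is a minimal sketch; the helper name read_ndjson and the temporary sample file are assumptions made for illustration.

```python
import json
import os
import tempfile


def read_ndjson(path):
    """Parse a newline-delimited JSON file into a list of records (hypothetical helper)."""
    records = []
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc
    return records


# Write a small sample file, then read it back line by line.
sample = [{"key01": "value01"}, {"key01": "value02"}]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False, encoding="utf-8") as tmp:
    for record in sample:
        tmp.write(json.dumps(record) + "\n")
    tmp_path = tmp.name

round_trip = read_ndjson(tmp_path)
os.remove(tmp_path)
print(round_trip == sample)
```

If the file were still a pretty-printed array, the first line would fail to parse and the helper would report its line number.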