Cas*_*ams 3 python scripting json
I have a large file formatted like this:
"string in quotes"
string
string
string
number
|-
...repeating for a while. I'm trying to convert it to JSON, so that each block looks like this:
"name": "string in quotes"
"description": "string"
"info": "string"
"author": "string"
"year": number
Here is what I have so far:
import shutil
import os
import urllib

myFile = open('unformatted.txt', 'r')
newFile = open("formatted.json", "w")
newFile.write('{' + '\n' + 'list: {' + '\n')
for line in myFile:
    newFile.write()  # this is where I'm not sure what to write
newFile.write('}' + '\n' + '}')
myFile.close()
newFile.close()
I figured I could take the line number modulo the block length, but I'm not sure that's the right approach.
You can use itertools.groupby to group the lines into sections, then json.dump each resulting dict to your JSON file:
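from itertools import groupby
import json

names = ["name", "description", "info", "author", "year"]
with open("test.csv") as f, open("out.json", "w") as out:
    # Group consecutive lines; k is True for the "|-" separator lines.
    grouped = groupby(map(str.rstrip, f), key=lambda x: x.startswith("|-"))
    for k, v in grouped:
        if not k:
            # Pair each line of the block with its field name and
            # write one JSON object per line.
            json.dump(dict(zip(names, v)), out)
            out.write("\n")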
Input:
"string in quotes"
string
string
string
number
|-
"other string in quotes"
string2
string2
string2
number2
Output:
{"author": "string", "name": "\"string in quotes\"", "description": "string", "info": "string", "year": "number"}
{"author": "string2", "name": "\"other string in quotes\"", "description": "string2", "info": "string2", "year": "number2"}
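Note that in this output the name keeps its literal double quotes and the year is written as a string, whereas the question asks for a number. If that matters, each record could be cleaned up before it is dumped. A minimal sketch, assuming the year line always holds a plain integer; `clean_record` and `parse_records` are hypothetical helpers for illustration, not part of the answer above:

```python
import json
from itertools import groupby

NAMES = ["name", "description", "info", "author", "year"]

def clean_record(lines):
    """Build one record dict from the lines of a block (hypothetical helper)."""
    record = dict(zip(NAMES, lines))
    # Drop the literal double quotes that wrap the name in the input.
    record["name"] = record["name"].strip('"')
    # Assumption: the year line is a plain integer such as 1999.
    record["year"] = int(record["year"])
    return record

def parse_records(lines):
    """Yield one cleaned dict per block; blocks are separated by '|-' lines."""
    stripped = (line.rstrip() for line in lines)
    for is_sep, block in groupby(stripped, key=lambda x: x.startswith("|-")):
        if not is_sep:
            yield clean_record(block)

# Made-up sample data in the same layout as the input shown earlier.
sample = ['"string in quotes"\n', 'a\n', 'b\n', 'c\n', '1999\n', '|-\n',
          '"other string in quotes"\n', 'd\n', 'e\n', 'f\n', '2005\n']
records = list(parse_records(sample))
print(json.dumps(records[0]))
```

With real data, `int()` raises ValueError on a non-numeric year line, which is probably what you want for catching malformed blocks early.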
To read the data back, just iterate over the file and load each line:
In [6]: with open("out.json") as out:
   ...:     for line in out:
   ...:         print(json.loads(line))
   ...:
{'name': '"string in quotes"', 'info': 'string', 'author': 'string', 'year': 'number', 'description': 'string'}
{'name': '"other string in quotes"', 'info': 'string2', 'author': 'string2', 'year': 'number2', 'description': 'string2'}
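The modulo idea from the question can also work, provided every record is exactly five lines followed by one `|-` separator line. A sketch under that assumption (`records_by_position` is a hypothetical name); unlike the groupby version, it silently misaligns fields if any block has a missing or extra line:

```python
NAMES = ["name", "description", "info", "author", "year"]
BLOCK = len(NAMES) + 1  # five field lines plus the '|-' separator line

def records_by_position(lines):
    """Yield one dict per block, using line_number % BLOCK to pick the field."""
    record = {}
    for i, line in enumerate(lines):
        pos = i % BLOCK
        if pos < len(NAMES):
            record[NAMES[pos]] = line.rstrip()
        if pos == len(NAMES) - 1:
            # The last field of the block has been read; emit the record.
            yield record
            record = {}

# Made-up sample data in the same layout as the input shown earlier.
sample = ['"string in quotes"\n', 'a\n', 'b\n', 'c\n', '1999\n', '|-\n',
          '"other string in quotes"\n', 'd\n', 'e\n', 'f\n', '2005\n']
result = list(records_by_position(sample))
print(len(result))
```

The groupby approach is more forgiving because it keys on the separator itself rather than on line positions.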