How to parse a text file in Python and convert it to JSON

Cas*_*ams 3 python scripting json

I have a large file in the following format:

"string in quotes"
string
string
string
number
|-

...repeated many times. I'm trying to convert it to JSON, so that each block looks like this:

"name": "string in quotes"
"description": "string"
"info": "string"
"author": "string"
"year": number

Here is what I have so far:

import shutil
import os
import urllib

myFile = open('unformatted.txt','r')
newFile = open("formatted.json", "w")

newFile.write('{'+'\n'+'list: {'+'\n')

for line in myFile:
    newFile.write()  # this is where I'm not sure what to write

newFile.write('}'+'\n'+'}')

myFile.close()
newFile.close()

I could use a modulo operation on the line number, but I'm not sure that's the right approach.
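For reference, the line-number modulo idea can be sketched roughly as follows. This is only an illustration, under the assumption that every record is exactly five lines followed by one `|-` separator line; the names `FIELDS`, `RECORD_LEN` and `parse_by_position` are made up for the example, and the sample list stands in for the real file.

```python
import json

FIELDS = ["name", "description", "info", "author", "year"]
RECORD_LEN = len(FIELDS) + 1  # five data lines plus the "|-" separator


def parse_by_position(lines):
    """Build one dict per record using line_number % RECORD_LEN."""
    records = []
    current = {}
    for i, line in enumerate(lines):
        pos = i % RECORD_LEN
        if pos < len(FIELDS):
            current[FIELDS[pos]] = line.rstrip("\n")
        else:
            # Position 5 is the "|-" separator: the record is complete.
            records.append(current)
            current = {}
    if current:
        # The last record may not be followed by a separator.
        records.append(current)
    return records


# Stand-in for the lines of the real input file.
sample = [
    '"string in quotes"\n', "string\n", "string\n", "string\n", "number\n", "|-\n",
    '"other string in quotes"\n', "string2\n", "string2\n", "string2\n", "number2\n",
]
print(json.dumps(parse_by_position(sample), indent=2))
```

The weakness of this approach is that a single missing or extra line shifts every later record, so splitting on the separator itself tends to be more robust.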

Pad*_*ham 5

You can use itertools.groupby to group the lines of each section, then json.dump the resulting dicts to your JSON file:

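from itertools import groupby
import json

# Keys to assign, in order, to the lines of each block.
names = ["name", "description", "info", "author", "year"]

with open("test.csv") as f, open("out.json", "w") as out:
    # Strip trailing whitespace, then group consecutive lines by whether
    # they are a "|-" separator line.
    grouped = groupby(map(str.rstrip, f), key=lambda x: x.startswith("|-"))
    for k, v in grouped:
        if not k:  # skip the separator groups
            # Write each block as one JSON object per line.
            json.dump(dict(zip(names, v)), out)
            out.write("\n")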

Input:

"string in quotes"
string
string
string
number
|-
"other string in quotes"
string2
string2
string2
number2

Output:

{"author": "string", "name": "\"string in quotes\"", "description": "string", "info": "string", "year": "number"}
{"author": "string2", "name": "\"other string in quotes\"", "description": "string2", "info": "string2", "year": "number2"}
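Note that in the output above the name keeps its literal quotes and the year stays a string. If the target is closer to the format in the question (unquoted name, numeric year, a single JSON document), a variation along these lines might work. It is a sketch only: `clean_record` and `parse_records` are hypothetical helpers, the `"list"` wrapper key is borrowed from the question's attempt, and the sample values such as `1999` are invented for illustration.

```python
import json
from itertools import groupby

NAMES = ["name", "description", "info", "author", "year"]


def clean_record(values):
    """Strip the quotes around the name and convert the year to int if possible."""
    record = dict(zip(NAMES, values))
    if "name" in record:
        record["name"] = record["name"].strip('"')
    year = record.get("year", "")
    if year.isdigit():
        record["year"] = int(year)
    return record


def parse_records(lines):
    """Split lines on "|-" separators and return a list of cleaned dicts."""
    stripped = (line.rstrip() for line in lines)
    grouped = groupby(stripped, key=lambda s: s.startswith("|-"))
    return [clean_record(group) for is_sep, group in grouped if not is_sep]


# Invented sample data standing in for the lines of the real file.
sample = ['"string in quotes"', "some text", "more text", "someone", "1999", "|-",
          '"other string in quotes"', "text2", "info2", "author2", "2004"]
records = parse_records(sample)
print(json.dumps({"list": records}, indent=2))
```

With a real file, `parse_records(f)` could be called on the open file object, and `json.dump({"list": records}, out, indent=2)` would write the result as one document instead of one object per line.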

To read the data back, just iterate over the file and load each line:

In [6]: with open("out.json") as out:
   ...:     for line in out:
   ...:         print(json.loads(line))
   ...:
{'name': '"string in quotes"', 'info': 'string', 'author': 'string', 'year': 'number', 'description': 'string'}
{'name': '"other string in quotes"', 'info': 'string2', 'author': 'string2', 'year': 'number2', 'description': 'string2'}
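If the records are needed as a list rather than printed one at a time, the same loop can be wrapped in a small helper. The sketch below assumes the one-object-per-line layout written above; `load_json_lines` is a hypothetical name, and `io.StringIO` stands in for `open("out.json")` so the example runs on its own.

```python
import io
import json


def load_json_lines(fileobj):
    """Return a list with one parsed object per non-empty line."""
    return [json.loads(line) for line in fileobj if line.strip()]


# io.StringIO stands in for open("out.json") in this sketch.
fake_file = io.StringIO(
    '{"name": "\\"string in quotes\\"", "year": "number"}\n'
    '{"name": "\\"other string in quotes\\"", "year": "number2"}\n'
)
records = load_json_lines(fake_file)
print(len(records))
print(records[1]["year"])
```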