How to convert a CSV file to multiline JSON?

Bea*_*ing 82 python csv json

Here is my code, really simple stuff...

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

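It declares some field names, the reader uses the csv module to read the file, and the field names are used to dump the file in JSON format. Here's the problem...

Each record in the CSV file is on its own row, and I want the JSON output to look the same way. The problem is that it dumps everything on one giant, long line.

I've tried using something like for line in csvfile: and then running my code below that with reader = csv.DictReader( line, fieldnames), looping through each line, but it does the entire file on one line, then loops through the entire file on another line... and continues until it runs out of lines.

Any suggestions for correcting this?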

Edit: To clarify, currently I have: (every record on line 1)

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

What I'm looking for: (2 records on 2 lines)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not each individual field indented or on a separate line, but each record on its own line.

Some sample input:

"John","Doe","001","Message1"
"George","Washington","002","Message2"

Sin*_*ion 125

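The problem with your desired output is that it is not a valid JSON document; it's a stream of JSON documents!

That's okay, if it's what you need, but it means that for each document you want in your output, you'll have to call json.dumps.

Since the newline you want separating your documents is not contained in those documents, you have to supply it yourself. So we just need to pull the loop out of the call to json.dump and write a newline after each document.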

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')

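  • @abhi1610: If you expect a header in the input, you should construct the `DictReader` without giving a `fieldnames` argument; it will then read the first line to get the field names from the file. (5 upvotes)
  • But the problem is that the outfile is not valid JSON (3 upvotes)
  • Better to add an encoding for your files: `csvfile = open('file.csv', 'r', encoding='utf-8')` and `jsonfile = open('file.json', 'w', encoding='utf-8')` (2 upvotes)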
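
Pulling the comments above together, here is a minimal Python 3 sketch. It assumes the CSV has a header row (so `fieldnames` is omitted) and that both files are UTF-8. The function names `csv_to_json_lines` and `read_json_lines` are made up for illustration. Reading the output back line by line shows that, although the file as a whole is not one JSON document, each line is.

```python
import csv
import json


def csv_to_json_lines(csv_path, json_path):
    """Write one JSON object per CSV row, one object per line."""
    # newline='' is what the csv module documentation recommends for reader input.
    with open(csv_path, 'r', newline='', encoding='utf-8') as csvfile, \
         open(json_path, 'w', encoding='utf-8') as jsonfile:
        # No fieldnames argument: DictReader takes the names from the first row.
        for row in csv.DictReader(csvfile):
            json.dump(row, jsonfile)
            jsonfile.write('\n')


def read_json_lines(json_path):
    """Parse the file back, treating every non-empty line as its own document."""
    with open(json_path, 'r', encoding='utf-8') as jsonfile:
        return [json.loads(line) for line in jsonfile if line.strip()]
```

A consumer that calls `json.load` on the whole file will still fail; it has to parse line by line, as `read_json_lines` does.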

Nau*_*fal 14

You can use a Pandas DataFrame to achieve this, as in the following example:

import pandas as pd
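# read_csv already returns a DataFrame, so no extra pd.DataFrame() wrapper is needed
csv_file = pd.read_csv("path/to/file.csv", sep = ",", header = 0, index_col = False)
# lines = True writes one JSON object per line instead of a single JSON array
csv_file.to_json("/path/to/new/file.json", orient = "records", lines = True, date_format = "epoch", double_precision = 10, force_ascii = True, date_unit = "ms", default_handler = None)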


Law*_*den 9

I took @SingleNegationElimination's response and simplified it into a three-liner that can be used in a pipeline:

import csv
import json
import sys

for row in csv.DictReader(sys.stdin):
    json.dump(row, sys.stdout)
    sys.stdout.write('\n')
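
The same filter can be written as a function that takes any pair of text streams, which makes it easy to try without a shell. This is only a sketch; the name `csv_stream_to_json_lines` and the returned row count are additions for illustration, not part of the answer above.

```python
import csv
import json


def csv_stream_to_json_lines(infile, outfile):
    """Copy CSV rows from infile to outfile as one JSON object per line.

    The first row of infile is taken as the header. Returns the row count.
    """
    count = 0
    for row in csv.DictReader(infile):
        json.dump(row, outfile)
        outfile.write('\n')
        count += 1
    return count
```

Calling `csv_stream_to_json_lines(sys.stdin, sys.stdout)` from a script (say, a hypothetical `csv2jsonl.py`) gives the pipeline form: `python csv2jsonl.py < file.csv > file.json`.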


小智 7

import csv
import json

file = 'csv_file_name.csv'
json_file = 'output_file_name.json'

#Read CSV File
def read_CSV(file, json_file):
    csv_rows = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        field = reader.fieldnames
        for row in reader:
            csv_rows.append({name: row[name] for name in field})
        convert_write_json(csv_rows, json_file)

#Convert csv data into json
def convert_write_json(data, json_file):
    with open(json_file, "w") as f:
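        # Pretty-printed output; use f.write(json.dumps(data)) instead for compact output.
        # Writing both, as before, put two documents back to back and made the file invalid JSON.
        f.write(json.dumps(data, sort_keys=False, indent=4, separators=(',', ': ')))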


read_CSV(file,json_file)

Documentation of json.dumps()


Sno*_*k S 6

You can try this:

import csvmapper

# how does the object look
mapper = csvmapper.DictMapper([ 
  [ 
     { 'name' : 'FirstName'},
     { 'name' : 'LastName' },
     { 'name' : 'IDNumber', 'type':'int' },
     { 'name' : 'Messages' }
  ]
 ])

# parser instance
parser = csvmapper.CSVParser('sample.csv', mapper)
# conversion service
converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

Edit:

A simpler approach:

import csvmapper

fields = ('FirstName', 'LastName', 'IDNumber', 'Messages')
parser = csvmapper.CSVParser('sample.csv', csvmapper.FieldMapper(fields))

converter = csvmapper.JSONConverter(parser)

print(converter.doConvert(pretty=True))

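  • I think you should at least mention explicitly that you are using a third-party module, `csvmapper`, to do this (and where to get it), as opposed to something built in. (2 upvotes)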

小智 5

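I see this is old, but I needed the code from SingleNegationElimination and ran into a problem with data that contained non-UTF-8 characters. They appeared in fields I did not really care about, so I chose to ignore them. That took some effort, though. I am new to Python, so it took some trial and error to get it working. The code is a copy of SingleNegationElimination's with extra handling for UTF-8. I tried to do this using https://docs.python.org/2.7/library/csv.html but gave up in the end. The code below works.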

import csv, json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("Scope","Comment","OOS Code","In RMF","Code","Status","Name","Sub Code","CAT","LOB","Description","Owner","Manager","Platform Owner")
reader = csv.DictReader(csvfile , fieldnames)

code = ''
for row in reader:
    try:
        print('+' + row['Code'])
        for key in row:
            # Python 2 only: byte strings have .decode(); Python 3 str objects do not
            row[key] = row[key].decode('utf-8', 'ignore').encode('utf-8')
        json.dump(row, jsonfile)
        jsonfile.write('\n')
    except:
        print('-' + row['Code'])
        raise
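
The snippet above relies on Python 2's `str.decode`, which Python 3 strings do not have. On Python 3 a similar effect can probably be had by letting `open` drop the undecodable bytes while reading. The following is a sketch under that assumption, not a tested replacement for the code above; the function name `csv_to_json_lines_lossy` is hypothetical.

```python
import csv
import json


def csv_to_json_lines_lossy(csv_path, json_path, fieldnames):
    """Convert CSV to one JSON object per line, dropping undecodable bytes."""
    # errors='ignore' makes the decoder silently skip bytes that are not valid UTF-8.
    with open(csv_path, 'r', newline='', encoding='utf-8', errors='ignore') as csvfile, \
         open(json_path, 'w', encoding='utf-8') as jsonfile:
        for row in csv.DictReader(csvfile, fieldnames):
            json.dump(row, jsonfile)
            jsonfile.write('\n')
```

Note that `errors='ignore'` discards data without warning; `errors='replace'` keeps a visible marker character instead, which may be preferable when the affected fields matter.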