How do I convert a CSV file with array data to a JSON file?

Cod*_*ein 2 python csv arrays json

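I have a CSV file with a large number of columns. Some of the columns share the same name, and I want to convert them so that in each JSON object they all sit under a single array.

For example, in the CSV: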

firstname,lastname,pet,pet,pet
Joe, Dimaggio, dog, cat
Pete, Rose, turtle, cat
Jackie, Robinson, dog

I want this JSON:

{ firstname: Joe,
  lastname: Dimaggio,
  pets: ["dog", "cat"]
},
{ firstname: Pete,
  lastname: Rose,
  pets: ["turtle", "cat"]
},
{ firstname: Jackie,
  lastname: Robinson,
  pets: ["dog"]
}

I am trying to write a simple Python script to do this, but I have run into a problem.

Here is what I have so far:

import csv
import json

csvfile = open('userdata.csv', 'r')
jsonfile = open('userdata.json', 'w')

fieldnames = ("firstname", "lastname", "pet", "pet", "pet");
reader = csv.DictReader( csvfile, fieldnames)
record = {}
for row in reader:
    record['firstname'] = row['firstname']
    record['lastname'] = row['lastname']
    record['pets'] = json.JSONEncoder().encode({"pets": [row['pet'], row['pet'], row['pet'], row['pet'], row['pet']]});
    json.dump(record, jsonfile, indent=4)
    ##json.dump(json.loads(json.JSONEncoder(record)), jsonfile, indent=4)
print "something worked"

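But the output is odd: it prints pets as an array nested inside an object that is also called pets.

I can't figure out how to get the pets array out of that pets object. It also adds backslashes inside the array items: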

{
    "firstname": "Joe",
    "lastname": "Dimaggio", 
    "pets": "{\"pets\": [\"dog\", \"cat\"]}"
}

Bro*_*bin 6

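This is because you encode the value yourself and then pass it to json.dump, which encodes it a second time. Remove the json.JSONEncoder().encode(...) call and it should work fine.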

import csv
import json

records = []
with open('userdata.csv', newline='') as csvfile:
    # csv.DictReader keeps only the last value for a repeated column name,
    # so read plain rows and take the pet columns by position instead.
    reader = csv.reader(csvfile, skipinitialspace=True)
    next(reader)  # skip the header row shown in the sample CSV
    for row in reader:
        if not row:
            continue  # ignore blank lines
        records.append({
            'firstname': row[0],
            'lastname': row[1],
            # Remove blank entries
            'pets': [pet for pet in row[2:] if pet.strip()],
        })

with open('userdata.json', 'w') as jsonfile:
    # Serialize exactly once; json.dump writes straight to the file
    json.dump(records, jsonfile, indent=4)
print("something worked")
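If the position of the repeated columns is not known in advance, the grouping can be driven by the header instead. The following is only a sketch under a few assumptions: the first line of the CSV is a header, the repeated column is literally named pet, and the helper name rows_to_records and its parameters are made up for illustration. It reads from an in-memory string so it can be run without the userdata.csv file.

```python
import csv
import io
import json


def rows_to_records(csv_text, array_field="pet", array_key="pets"):
    """Collect every column named `array_field` into one list under `array_key`."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = next(reader)
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        record = {}
        grouped = []
        # zip stops at the shorter sequence, so short rows just yield fewer values
        for name, value in zip(header, row):
            value = value.strip()
            if name == array_field:
                if value:  # drop blank entries
                    grouped.append(value)
            else:
                record[name] = value
        record[array_key] = grouped
        records.append(record)
    return records


sample = """firstname,lastname,pet,pet,pet
Joe, Dimaggio, dog, cat
Pete, Rose, turtle, cat
Jackie, Robinson, dog
"""

records = rows_to_records(sample)
print(json.dumps(records, indent=4))
```

Because the records are gathered in a list and serialized once, the output is a single valid JSON array rather than several objects written back to back.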

The backslashes you are seeing come from escaping the JSON string, which is the result of serializing it twice.
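The effect is easy to reproduce in isolation. This small sketch (the variable names are illustrative only) serializes the same list once and twice, so the two outputs can be compared side by side:

```python
import json

pets = ["dog", "cat"]

# Encoding the value first turns it into a plain string...
encoded_once = json.dumps({"pets": pets})
# ...so the second pass has to escape every quote inside that string.
double = json.dumps({"pets": encoded_once})

# Passing the list itself means it is serialized exactly once.
single = json.dumps({"pets": pets})

print(double)
print(single)
```

The first line printed contains the escaped quotes and the nested pets object from the question; the second is the plain array.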