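I am trying to convert a flat CSV into a nested JSON structure. The CSV is generated from SQL, which produces multiple rows for each primary ID. The CSV is structured as follows: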
PrimaryId,FirstName,LastName,City,CarName,DogName
100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
101,Ben,Swan,Sydney,Volkswagen,Buddy
101,Ben,Swan,Sydney,Ford,Buddy
101,Ben,Swan,Sydney,Audi,Buddy
101,Ben,Swan,Sydney,Volkswagen,Max
101,Ben,Swan,Sydney,Ford,Max
101,Ben,Swan,Sydney,Audi,Max
102,Julia,Brown,London,Mini,Lucy
The desired JSON output is:
{
    "data": [
        {
            "City": "NewYork",
            "FirstName": "John",
            "PrimaryId": 100,
            "LastName": "Smith",
            "CarName": [
                "Toyota",
                "BMW"
            ],
            "DogName": [
                "Spike",
                "Rusty"
            ]
        },
        {
            "City": "Sydney",
            "FirstName": "Ben",
            "PrimaryId": 101,
            "LastName": "Swan",
            "CarName": [
                "Volkswagen",
                "Ford",
                "Audi"
            ],
            "DogName": [
                "Buddy",
                "Max"
            ]
        },
        {
            "City": "London",
            "FirstName": "Julia",
            "PrimaryId": 102,
            "LastName": "Brown",
            "CarName": [
                "Mini"
            ],
            "DogName": [
                "Lucy"
            ]
        }
    ]
}
Here is a general approach using csv.DictReader.
First, load the data:
import csv
import itertools

# Python 3: open in text mode; the csv module recommends newline=''
with open('stuff.csv', newline='') as csvfile:
    all_ = list(csv.DictReader(csvfile))
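Now you can use itertools.groupby to group the rows and process each group. groupby only merges adjacent rows with equal keys, so the rows must already be ordered by the key, as they are in this CSV. For example: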
d = []
for k, g in itertools.groupby(
        all_,
        key=lambda r: (r['PrimaryId'], r['LastName'])):
    d.append({
        'PrimaryId': k[0],
        'LastName': k[1],
        'CarName': [e['CarName'] for e in g]
    })
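This groups the rows by primary ID and last name and builds a list of cars for each group. With the sample data each car appears once per dog, so the list contains repeats (for ID 100: Toyota, BMW, Toyota, BMW) unless you deduplicate it.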
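To get closer to the desired output, the same grouping idea can collect both CarName and DogName while dropping repeats. The following is only a sketch: the helper name nest_rows and its parameters are made up for illustration, the sample is a shortened copy of the CSV from the question, and the rows are sorted by the key inside the helper so that groupby sees equal keys next to each other.

```python
import csv
import io
import itertools

# Shortened copy of the CSV from the question, kept inline so the sketch runs as-is.
SAMPLE = """PrimaryId,FirstName,LastName,City,CarName,DogName
100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
102,Julia,Brown,London,Mini,Lucy
"""


def nest_rows(rows,
              key_fields=("PrimaryId", "FirstName", "LastName", "City"),
              list_fields=("CarName", "DogName")):
    """Hypothetical helper: collapse repeated rows into one record per key."""
    def keyfunc(row):
        return tuple(row[f] for f in key_fields)

    result = []
    # groupby only merges adjacent rows, so sort by the same key first.
    # Note: the key values are strings, so this is a lexicographic sort.
    for key, group in itertools.groupby(sorted(rows, key=keyfunc), key=keyfunc):
        group = list(group)  # the group iterator can only be consumed once
        record = dict(zip(key_fields, key))
        for field in list_fields:
            # dict.fromkeys removes duplicates while keeping first-seen order
            record[field] = list(dict.fromkeys(r[field] for r in group))
        result.append(record)
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
nested = nest_rows(rows)
```

Every value, including PrimaryId, is still a string at this point, because csv.DictReader does not convert types.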
Once you have something like this, you can serialize it with json.dumps().
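A sketch of that last step, assuming the records should be wrapped under a "data" key as in the desired output and that PrimaryId should be written as an integer. The records list below is a hand-written stand-in for the grouped result, not output from the code above.

```python
import json

# Hand-written stand-in for the grouped records (an assumption for illustration).
records = [
    {"PrimaryId": "102", "FirstName": "Julia", "LastName": "Brown",
     "City": "London", "CarName": ["Mini"], "DogName": ["Lucy"]},
]

# csv.DictReader yields strings, so convert the ID before serializing.
for record in records:
    record["PrimaryId"] = int(record["PrimaryId"])

# indent=2 only affects readability of the output, not its content.
output = json.dumps({"data": records}, indent=2)
print(output)
```

To write straight to a file instead of building a string, json.dump() takes the same object plus an open file handle.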