csv嵌套JSON?

Dav*_*vid 1 python csv json

我正在尝试将平面结构化CSV转换为嵌套的JSON结构.CSV是从SQL生成的,它为每个主ID创建多行.CSV的结构如下:

PrimaryId,FirstName,LastName,City,CarName,DogName
100,John,Smith,NewYork,Toyota,Spike
100,John,Smith,NewYork,BMW,Spike
100,John,Smith,NewYork,Toyota,Rusty
100,John,Smith,NewYork,BMW,Rusty
101,Ben,Swan,Sydney,Volkswagen,Buddy
101,Ben,Swan,Sydney,Ford,Buddy
101,Ben,Swan,Sydney,Audi,Buddy
101,Ben,Swan,Sydney,Volkswagen,Max
101,Ben,Swan,Sydney,Ford,Max
101,Ben,Swan,Sydney,Audi,Max
102,Julia,Brown,London,Mini,Lucy
Run Code Online (Sandbox Code Playgroud)

所需的JSON输出是:

{
    "data": [
        {
            "City": "NewYork", 
            "FirstName": "John", 
            "PrimaryId": 100, 
            "LastName": "Smith", 
            "CarName": [
                "Toyota", 
                "BMW"
            ], 
            "DogName": [
                "Spike", 
                "Rusty"
            ]
        }, 
        {
            "City": "Sydney", 
            "FirstName": "Ben", 
            "PrimaryId": 101, 
            "LastName": "Swan", 
            "CarName": [
                "Volkswagen", 
                "Ford", 
                "Audi"
            ], 
            "DogName": [
                "Buddy", 
                "Max"
            ]
        }, 
        {
            "City": "London", 
            "FirstName": "Julia", 
            "PrimaryId": 102, 
            "LastName": "Brown", 
            "CarName": [
                "Mini"
            ], 
            "DogName": [
                "Lucy"
            ]
        }
    ]
}
Run Code Online (Sandbox Code Playgroud)

无论这篇文章这一个帮助,但我还没有建立正确的结构.

Ami*_*ory 5

以下是这样做的一般方法csv.DictReader.

首先加载数据:

import csv
import itertools
with open('stuff.csv', 'rb') as csvfile:
    all_ = list(csv.DictReader(csvfile))
Run Code Online (Sandbox Code Playgroud)

现在,您可以使用itertools.groupby分组和处理每个组.例如

d = []
for k, g in itertools.groupby(
        all_, 
        key=lambda r: (r['PrimaryId'], r[' LastName'])):
    d.append({
        'PrimaryId': k[0],
        'LastName': k[1],
        'CarName': [e[' CarName'] for e in g]
        })
Run Code Online (Sandbox Code Playgroud)

将按主要ID和姓氏分组,并列出汽车列表.

一旦你有这样的东西,你可以使用json.dumps().