CSV to list of dictionaries - a better way?

Question

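I'm working on a function that takes the filename of a CSV, converts each line to a dictionary, and returns a list of the dictionaries it created (so that later functions can iterate over and organize them). The code below does what I want, but I feel there must be a better way. Any suggestions for improvement?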

import re

def import_incidents(filename):
    """Imports CSV and returns list of dictionaries for each incident"""
    with open(filename, 'r') as file:
        data = file.read()
        data = data.split('\n')
        list_of_data = []
        headers = True
        for line in data:
            line = line.split('","')
            if headers == True:
                #Skip header and set to false
                headers = False
            elif len(line) == 1 or line[3] == '':
                #File always has a 1-length final line, skip it.
                #Events can leave blank policies, skip those too.
                pass
            else:
                temp_dict = {}
                temp_dict['id'] = re.sub('"', '', line[0])
                temp_dict['time'] = re.sub('GMT-0600','',line[1])
                temp_dict['source'] = line[2]
                temp_dict['policy'] = line[3]
                temp_dict['destination'] = line[5]
                temp_dict['status'] = line[10]
                list_of_data.append(temp_dict)

    return list_of_data

print(import_incidents('Incidents (Yesterday Only).csv'))

Example CSV contents:

"ID","Incident Time","Source","Policies","Channel","Destination","Severity","Action","Maximum Matches","Transaction Size","Status",
"9511564","29 Dec. 2015, 08:33:59 AM GMT-0600","Doe, John","Encrypted files","HTTPS","blah.blah.com","Medium","Permitted","0","47.7 KB","Closed - Authorized",
"1848446","29 Dec. 2015, 08:23:36 AM GMT-0600","Smith, Joe","","HTTP","google.com","Low","Permitted","0","775 B","Closed"

Answer 1

Mar*_*ers 6

You have reinvented the csv.DictReader() class, I'm afraid:

import csv
import re

def import_incidents(filename):
    with open(filename, 'r', newline='') as file:
        reader = csv.DictReader(file)
        for row in reader:
            if not row or not row['Policies']:
                continue
            row['Incident Time'] = re.sub('GMT-0600', '', row['Incident Time'])
            yield row

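This relies on the header row for the dictionary keys. You can define your own dictionary keys with the fieldnames argument to DictReader() (the fieldnames entries are matched, in order, to the columns in the file), but then the first row in the file is still read like any other row. You can use the next() function to skip rows (see "Skip the headers when editing a csv file using Python").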
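
As a rough sketch of that fieldnames variant (not part of the original answer): the key names in FIELDNAMES are made up for illustration, and the function is assumed to take an open file or any iterable of lines instead of a filename so that it can be run against the sample rows from the question. The trailing comma on some rows produces one extra empty field, which DictReader stores under its default restkey of None, so the sketch drops that key.

```python
import csv
import io

# Hypothetical short key names (an assumption for this sketch); they are
# matched, in order, to the columns in the file.
FIELDNAMES = ['id', 'time', 'source', 'policy', 'channel', 'destination',
              'severity', 'action', 'max_matches', 'size', 'status']

def import_incidents(lines):
    """Yield one dict per incident row, keyed by FIELDNAMES."""
    reader = csv.DictReader(lines, fieldnames=FIELDNAMES)
    # With explicit fieldnames the header is read like any other row,
    # so skip it by hand.
    next(reader, None)
    for row in reader:
        # Events can leave the policy blank; skip those rows.
        if not row['policy']:
            continue
        # A trailing comma yields an extra empty field, stored under the
        # default restkey (None); discard it if present.
        row.pop(None, None)
        row['time'] = row['time'].replace('GMT-0600', '').strip()
        yield row

# Sample rows taken from the question.
sample = io.StringIO(
    '"ID","Incident Time","Source","Policies","Channel","Destination",'
    '"Severity","Action","Maximum Matches","Transaction Size","Status",\n'
    '"9511564","29 Dec. 2015, 08:33:59 AM GMT-0600","Doe, John",'
    '"Encrypted files","HTTPS","blah.blah.com","Medium","Permitted","0",'
    '"47.7 KB","Closed - Authorized",\n'
    '"1848446","29 Dec. 2015, 08:23:36 AM GMT-0600","Smith, Joe","","HTTP",'
    '"google.com","Low","Permitted","0","775 B","Closed"\n'
)

incidents = list(import_incidents(sample))
```

Because the function is a generator, wrapping the call in list() gives the list of dictionaries the question asked for. With a real file, pass the object returned by open(filename, 'r', newline='') in place of the StringIO sample.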
