Need a more efficient way to parse a csv file in Python

t0x*_*x13 2 python csv

Here is a sample csv file:

id, serial_no
2, 500
2, 501
2, 502
3, 600
3, 601

This is the output I'm looking for (each id paired with the list of its serial_no values):

[2, [500,501,502]]
[3, [600, 601]]

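I have implemented a solution, but it takes too much code and I'm sure there is a better one. I'm still learning Python and don't know all the tricks yet.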

import csv

file = 'test.csv'

# initialization omitted: zipped_data, ids, ser_no and tmp
# are assumed to start as empty lists
data = csv.reader(open(file))
fields = next(data)  # skip the header row

for row in data:
    each_row = []
    each_row.append(row[0])
    each_row.append(row[1])
    zipped_data.append(each_row)
for rec in zipped_data:
    if rec[0] not in ids:
        ids.append(rec[0])
for id in ids:
    for rec in zipped_data:
        if rec[0] == id:
            ser_no.append(rec[1])
    tmp.append(id)
    tmp.append(ser_no)
    print(tmp)
    tmp = []
    ser_no = []

**I left out the variable initialization to keep the code short.

Printing tmp gives me the output shown above. I know there must be a better, more Pythonic way to do this. It's too messy! Any suggestions would be great!

Phi*_*ham 12

import csv
from collections import defaultdict

records = defaultdict(list)

file = 'test.csv'

with open(file) as f:
    data = csv.reader(f)
    fields = next(data)  # skip the header row
    for row in data:
        records[row[0]].append(row[1])

# sort by id, since the dict does not keep its keys in sorted order
results = sorted(records.items(), key=lambda x: x[0])
print(results)

If the list of serial_nos must be unique, just replace defaultdict(list) with defaultdict(set) and records[row[0]].append(row[1]) with records[row[0]].add(row[1]).
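A minimal sketch of that set-based variant. A few things here are assumptions added for illustration and are not part of the answer: the sample data is inlined through io.StringIO so the snippet runs without test.csv, one row is repeated to show the de-duplication, and skipinitialspace=True is passed so the space after each comma is dropped.

```python
import csv
import io
from collections import defaultdict

# Inline sample data standing in for test.csv (assumption for illustration);
# the row "2, 500" is repeated on purpose to show de-duplication.
sample = io.StringIO("id, serial_no\n2, 500\n2, 501\n2, 500\n3, 600\n")

records = defaultdict(set)

# skipinitialspace=True drops the blank that follows each comma
data = csv.reader(sample, skipinitialspace=True)
next(data)  # skip the header row

for row in data:
    records[row[0]].add(row[1])  # a set silently ignores repeated values

# sets are unordered, so sort the serial numbers as well as the ids
results = [(key, sorted(values)) for key, values in sorted(records.items())]
print(results)
```

Note that the values are still strings at this point, so both sorts are lexicographic; convert with int() first if numeric order matters.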


Ign*_*ams 5

Instead of a list, I would make it a collections.defaultdict(list), and then just call the append() method on the value.

import collections

# data is the csv.reader from the question, with the header already skipped
result = collections.defaultdict(list)
for row in data:
    result[row[0]].append(row[1])
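For completeness, here is one way the defaultdict(list) idea might be turned into the exact output the question asks for. This is only a sketch: group_serials is a hypothetical helper name, the int() conversion and skipinitialspace=True are assumptions made so that the printed values match the desired output, and the sample file is inlined with io.StringIO instead of being read from test.csv.

```python
import collections
import csv
import io


def group_serials(csv_file):
    """Return [[id, [serial_no, ...]], ...] sorted by id (hypothetical helper)."""
    reader = csv.reader(csv_file, skipinitialspace=True)
    next(reader)  # skip the "id, serial_no" header row
    result = collections.defaultdict(list)
    for row in reader:
        # int() conversion is an assumption, made to match the desired output
        result[int(row[0])].append(int(row[1]))
    return [[key, result[key]] for key in sorted(result)]


# Inline copy of the sample file from the question, so no test.csv is needed
sample = io.StringIO("id, serial_no\n2, 500\n2, 501\n2, 502\n3, 600\n3, 601\n")

grouped = group_serials(sample)
for rec in grouped:
    print(rec)
```

With a real file, the same function could be called as group_serials(open('test.csv')). The whole file is read in a single pass, whereas the loop in the question rescans every row once per id.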