Suppose I have a csv.DictReader object and I want to write it out as a CSV file. How can I do that?
I know I can write the data rows like this:
dr = csv.DictReader(open(f), delimiter='\t')
# process my dr object
# ...
# write out object
output = csv.DictWriter(open(f2, 'w'), delimiter='\t')
for item in dr:
    output.writerow(item)
But how do I include the field names?
ber*_*nie 139
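Edit:
Python 2.7/3.2 added a new writeheader() method. Also, John Machin's answer provides a simpler way to write the header row.
A simple example using the writeheader() method now available in 2.7/3.2: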
import csv
from collections import OrderedDict
ordered_fieldnames = OrderedDict([('field1', None), ('field2', None)])
# 'wb' is the Python 2 mode; on Python 3 use open(outfile, 'w', newline='')
with open(outfile, 'wb') as fou:
    dw = csv.DictWriter(fou, delimiter='\t', fieldnames=ordered_fieldnames)
    dw.writeheader()
    # continue on to write data
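Instantiating DictWriter requires a fieldnames argument.
From the documentation:
The fieldnames parameter identifies the order in which values in the dictionary passed to the writerow() method are written to the csvfile.
Put another way: the fieldnames argument is required because Python dicts are inherently unordered (true of the Python 2 versions this answer targets).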
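To make the role of fieldnames concrete, here is a minimal sketch showing that the same dict produces different column orders depending on the fieldnames sequence. The helper write_rows and the sample data are hypothetical, written for Python 3 with an in-memory buffer standing in for a file. Note that DictWriter still requires fieldnames on modern Python, even though dicts there keep insertion order.

```python
import csv
import io

def write_rows(rows, fieldnames):
    """Write dict rows to an in-memory TSV buffer and return the text."""
    buf = io.StringIO()
    dw = csv.DictWriter(buf, fieldnames=fieldnames, delimiter='\t')
    dw.writeheader()      # header row, in fieldnames order
    dw.writerows(rows)    # data rows, in the same order
    return buf.getvalue()

rows = [{'field1': 'a', 'field2': 'b'}]

# Same data, two different column orders.
forward = write_rows(rows, ['field1', 'field2'])
reverse = write_rows(rows, ['field2', 'field1'])

print(repr(forward))
print(repr(reverse))
```

The column order follows fieldnames, not the order of keys in each row dict.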
Here is an example of how to write the header and the data to a file.
Note: the with statement was added in 2.6. If you are using 2.5: from __future__ import with_statement
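import csv
# 'rb'/'wb' are the Python 2 modes; on Python 3 open in text mode with newline=''
with open(infile, 'rb') as fin:
    dr = csv.DictReader(fin, delimiter='\t')
    # dr.fieldnames contains the values from the first row of `infile`;
    # the input file must stay open while the rows are read below.
    with open(outfile, 'wb') as fou:
        dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
        headers = {}
        for n in dw.fieldnames:
            headers[n] = n
        dw.writerow(headers)
        for row in dr:
            dw.writerow(row)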
As @FM mentions in a comment, you can condense the header-writing into a one-liner, e.g.:
# assumes `dr` is a DictReader whose input file is still open
with open(outfile,'wb') as fou:
    dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
    dw.writerow(dict((fn,fn) for fn in dr.fieldnames))
    for row in dr:
        dw.writerow(row)
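For readers on Python 3, the same copy-with-header flow might look like the sketch below. It is not part of the original answer: copy_tsv, the temporary directory, and the sample file contents are hypothetical, chosen only so the snippet runs on its own. The files are opened in text mode with newline='', as the csv module documentation recommends for Python 3.

```python
import csv
import os
import tempfile

def copy_tsv(infile, outfile):
    """Copy a tab-separated file, header included, via DictReader/DictWriter."""
    with open(infile, newline='') as fin, open(outfile, 'w', newline='') as fou:
        dr = csv.DictReader(fin, delimiter='\t')
        dw = csv.DictWriter(fou, delimiter='\t', fieldnames=dr.fieldnames)
        dw.writeheader()
        for row in dr:
            # any per-row processing would go here
            dw.writerow(row)

# Build a small sample input so the example is self-contained.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'in.tsv')
dst = os.path.join(tmpdir, 'out.tsv')
with open(src, 'w', newline='') as f:
    f.write('name\tqty\r\napple\t3\r\npear\t5\r\n')

copy_tsv(src, dst)

with open(dst, newline='') as f:
    copied = f.read()
print(repr(copied))
```

Both files are opened in a single with statement so the reader's file is still open while rows are consumed.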
Joh*_*hin 28
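A few options:
(1) Laboriously make an identity-mapping (i.e. do-nothing) dict out of your fieldnames, so that csv.DictWriter can convert it back to a list and pass it to a csv.writer instance.
(2) The documentation mentions "the underlying writer instance" ... so just use it (example at the end).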
dw.writer.writerow(dw.fieldnames)
(3) Avoid the csv.DictWriter overhead and do it yourself with csv.writer
Writing data:
w.writerow([d[k] for k in fieldnames])
or
w.writerow([d.get(k, restval) for k in fieldnames])
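Instead of the extrasaction "functionality", I'd rather code it myself; that way you can report ALL "extras" with their keys and values, not just the first extra key. What is a real nuisance with DictWriter is that if you have already verified the keys yourself as each dict was being built, you need to remember to use extrasaction='ignore'; otherwise it will SLOWLY (fieldnames is a list) repeat the check: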
wrong_fields = [k for k in rowdict if k not in self.fieldnames]
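Option (3) combined with hand-rolled checking of extra keys could be sketched roughly as follows. This is only an illustration under stated assumptions: write_dicts, its return value, and the sample rows are hypothetical, and the code targets Python 3. It uses a set for the membership test, so the per-row check does not scan a list.

```python
import csv
import io

def write_dicts(out, fieldnames, rows, restval=''):
    """Write dict rows with a plain csv.writer.

    Returns a list of (row_number, key, value) for every unexpected item.
    """
    known = set(fieldnames)      # set lookup instead of scanning a list
    w = csv.writer(out)
    w.writerow(fieldnames)       # header row
    extras = []
    for rownum, d in enumerate(rows, start=1):
        extras.extend((rownum, k, v) for k, v in d.items() if k not in known)
        w.writerow([d.get(k, restval) for k in fieldnames])
    return extras

buf = io.StringIO()
extras = write_dicts(
    buf,
    ['foo', 'bar', 'zot'],
    [{'foo': 'oof'}, {'foo': 1, 'bar': 2, 'zot': 3, 'qux': 4, 'baz': 5}],
    restval='Huh?',
)
written = buf.getvalue()
print(repr(written))
print(extras)
```

Here the extras are collected and returned rather than raised, so the caller decides whether to log them or abort.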
============
>>> f = open('csvtest.csv', 'wb')
>>> import csv
>>> fns = 'foo bar zot'.split()
>>> dw = csv.DictWriter(f, fns, restval='Huh?')
# dw.writefieldnames(fns) -- no such animal
>>> dw.writerow(fns) # no such luck, it can't imagine what to do with a list
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\python26\lib\csv.py", line 144, in writerow
return self.writer.writerow(self._dict_to_list(rowdict))
File "C:\python26\lib\csv.py", line 141, in _dict_to_list
return [rowdict.get(key, self.restval) for key in self.fieldnames]
AttributeError: 'list' object has no attribute 'get'
>>> dir(dw)
['__doc__', '__init__', '__module__', '_dict_to_list', 'extrasaction', 'fieldnames', 'restval', 'writer', 'writerow', 'writerows']
# eureka
>>> dw.writer.writerow(dw.fieldnames)
>>> dw.writerow({'foo':'oof'})
>>> f.close()
>>> open('csvtest.csv', 'rb').read()
'foo,bar,zot\r\noof,Huh?,Huh?\r\n'
>>>
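The session above is from Python 2.6. A rough Python 3 equivalent, using an in-memory buffer in place of csvtest.csv (an assumption made so the snippet is self-contained), shows the underlying-writer trick next to writeheader() for comparison:

```python
import csv
import io

fieldnames = ['foo', 'bar', 'zot']

# Header via the underlying csv.writer, as in the session above.
buf_a = io.StringIO()
dw_a = csv.DictWriter(buf_a, fieldnames, restval='Huh?')
dw_a.writer.writerow(dw_a.fieldnames)
dw_a.writerow({'foo': 'oof'})

# Header via writeheader(), available from 2.7/3.2.
buf_b = io.StringIO()
dw_b = csv.DictWriter(buf_b, fieldnames, restval='Huh?')
dw_b.writeheader()
dw_b.writerow({'foo': 'oof'})

via_writer = buf_a.getvalue()
via_writeheader = buf_b.getvalue()
print(repr(via_writer))
print(via_writer == via_writeheader)
```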
Another way to do this is to add the following line before writing rows to your output:
output.writerow(dict(zip(dr.fieldnames, dr.fieldnames)))
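zip returns a list of pairs, each holding the same value twice (in Python 3, an iterator of such pairs). That list can be used to initialize a dictionary.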
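A small sketch of what that line builds, assuming Python 3 and a hypothetical list of field names standing in for dr.fieldnames:

```python
import csv
import io

fieldnames = ['foo', 'bar', 'zot']   # stand-in for dr.fieldnames

# Each field name is paired with itself...
pairs = list(zip(fieldnames, fieldnames))
# ...and the pairs become an identity-mapping dict.
header = dict(pairs)

buf = io.StringIO()
output = csv.DictWriter(buf, fieldnames=fieldnames)
output.writerow(header)              # writes the header as an ordinary row
header_line = buf.getvalue()
print(pairs)
print(repr(header_line))
```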