ata*_*ams 132 python mapping dictionary nested python-2.7
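I have 2 csv files. The first is a data file and the other is a mapping file. The mapping file has 4 columns: Device_Name, GDN, Device_Type, and Device_OS. The same columns are present in the data file.
The data file has the Device_Name column populated and the other three columns blank. All four columns are populated in the mapping file. I want my Python code to open both files and, for each Device_Name in the data file, fill in its GDN, Device_Type, and Device_OS values from the mapping file.
I know how to use a dict when only 2 columns are present (1 needs to be mapped), but I don't know how to accomplish this when 3 columns need to be mapped.
Following is the code with which I tried to accomplish the mapping of Device_Type: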
x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
    file_map = csv.reader(in_file1, delimiter=',')
    for row in file_map:
        typemap = [row[0],row[2]]
        x.append(typemap)

with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
    writer = csv.writer(out_file, delimiter=',')
    for row in csv.reader(in_file2, delimiter=','):
        try:
            row[27] = x[row[11]]
        except KeyError:
            row[27] = ""
        writer.writerow(row)
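It returns an AttributeError.
After some research, I realized that I need to create a nested dict, but I don't have any idea how to do this. Please help me resolve this, or nudge me in the right direction.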
Inb*_*ose 271
A nested dict is a dictionary within a dictionary. A very simple thing.
>>> d = {}
>>> d['dict1'] = {}
>>> d['dict1']['innerkey'] = 'value'
>>> d
{'dict1': {'innerkey': 'value'}}
You can also use a defaultdict from the collections module to make creating nested dictionaries easier.
>>> import collections
>>> d = collections.defaultdict(dict)
>>> d['dict1']['innerkey'] = 'value'
>>> d # currently a defaultdict type
defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
>>> dict(d) # but is exactly like a normal dictionary.
{'dict1': {'innerkey': 'value'}}
You can populate that however you want.
I would recommend something like the following in your code:
d = {}  # can use defaultdict(dict) instead

for row in file_map:
    # derive row key from something
    # when using defaultdict, we can skip the next step creating a dictionary on row_key
    d[row_key] = {}
    for idx, col in enumerate(row):
        d[row_key][idx] = col
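To make the pseudocode above concrete, here is a minimal sketch that assumes the row key is the first column (Device_Name) and uses the header names as inner keys instead of column indexes. The device names and values are made up for illustration, and an in-memory `io.StringIO` stands in for the real mapping file. It is written for Python 3; under Python 2.7 you would open the file in "rb" mode as in the question.

```python
import csv
import io

# Hypothetical mapping data standing in for the mapping CSV file.
mapping_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "router-01,GDN-A,Router,IOS\n"
    "server-07,GDN-B,Server,Linux\n"
)

reader = csv.reader(mapping_csv)
header = next(reader)  # first row holds the column names

d = {}
for row in reader:
    row_key = row[0]  # assumption: Device_Name is the outer key
    # Pair the remaining header names with the remaining cells.
    d[row_key] = dict(zip(header[1:], row[1:]))

print(d["router-01"]["Device_Type"])
```

A lookup then takes two keys: the device name first, the column name second.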
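As per your comment:
Maybe the above code is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns i, j, k, l, and b.csv also has these columns. i is the key column for these csvs. The j, k, l columns are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using 'i' as the key column.
My suggestion would be something like this (without using defaultdict):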
a_file = "path/to/a.csv"
b_file = "path/to/b.csv"

# read from file a.csv
with open(a_file) as f:
    # skip headers
    next(f)
    # get first column as keys; build the set now, while the file is still open
    keys = set(line.strip().split(',')[0] for line in f)

# create empty dictionary:
d = {}

# read from file b.csv
with open(b_file) as f:
    # gather headers except first key header (strip the trailing newline first)
    headers = next(f).strip().split(',')[1:]
    # iterate lines
    for line in f:
        # gather the columns
        cols = line.strip().split(',')
        # check to make sure this key should be mapped.
        if cols[0] not in keys:
            continue
        # add key to dict
        d[cols[0]] = dict(
            # inner keys are the header names, values are columns
            (headers[idx], v) for idx, v in enumerate(cols[1:]))
Please note, though, that for parsing csv files there is a csv module.
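As a rough sketch of what the csv-module version could look like, the following uses `csv.DictReader` and `csv.DictWriter` so that columns are addressed by name rather than by position. The file contents are invented, and in-memory `io.StringIO` objects stand in for the real files; the column names are the ones from the question. It targets Python 3.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the mapping file and the data file.
mapping_file = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "router-01,GDN-A,Router,IOS\n"
    "server-07,GDN-B,Server,Linux\n"
)
data_file = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "server-07,,,\n"
    "printer-03,,,\n"
    "router-01,,,\n"
)
out_file = io.StringIO()

fields = ["GDN", "Device_Type", "Device_OS"]

# Outer key: Device_Name. Inner dict: the three columns to copy over.
mapping = {}
for row in csv.DictReader(mapping_file):
    mapping[row["Device_Name"]] = {name: row[name] for name in fields}

reader = csv.DictReader(data_file)
writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
for row in reader:
    # Devices missing from the mapping keep their empty cells.
    row.update(mapping.get(row["Device_Name"], {}))
    writer.writerow(row)

print(out_file.getvalue())
```

With real files you would replace the `StringIO` objects with `open(..., newline="")` handles. Using `mapping.get(..., {})` plays the role of the try/except KeyError in the question.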
Jun*_*hen 61
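UPDATE: For a nested dictionary of arbitrary depth, go to this answer.
Use defaultdict from the collections module.
High performance: an "if key not in dict" check is very expensive when the data set is large.
Low maintenance: it makes the code more readable and easy to extend.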
from collections import defaultdict
target_dict = defaultdict(dict)
target_dict[key1][key2] = val
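A small sketch of how this might look with data shaped like the question's. The records are invented for illustration. It also shows one thing to watch for: reading a missing key from a defaultdict inserts it, whereas `.get()` does not.

```python
from collections import defaultdict

# Hypothetical (device, column, value) records.
records = [
    ("router-01", "GDN", "GDN-A"),
    ("router-01", "Device_Type", "Router"),
    ("server-07", "Device_OS", "Linux"),
]

target_dict = defaultdict(dict)
for device, column, value in records:
    # No membership check needed: the inner dict is created on first access.
    target_dict[device][column] = value

# Caveat: merely indexing a missing key adds an empty inner dict.
before = len(target_dict)
_ = target_dict["printer-03"]
after = len(target_dict)

# .get() looks a key up without inserting it.
missing = target_dict.get("scanner-09")

print(before, after, missing)
```

If stray empty entries would be a problem when you later write the mapping out, look keys up with `.get()` or an `in` test rather than indexing.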
and*_*rew 21
For arbitrary levels of nesting:
In [1]: import collections

In [2]: def nested_dict():
   ...:     return collections.defaultdict(nested_dict)
   ...:
In [3]: a = nested_dict()
In [4]: a
Out[4]: defaultdict(<function __main__.nested_dict>, {})
In [5]: a['a']['b']['c'] = 1
In [6]: a
Out[6]:
defaultdict(<function __main__.nested_dict>,
            {'a': defaultdict(<function __main__.nested_dict>,
                              {'b': defaultdict(<function __main__.nested_dict>,
                                                {'c': 1})})})
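If the defaultdict wrappers get in the way later (for printing, say, or because you no longer want missing keys created on access), one option is to convert the structure back to plain dicts once it is filled. The helper name `to_plain_dict` below is made up for this sketch and is not part of the standard library.

```python
import collections


def nested_dict():
    return collections.defaultdict(nested_dict)


def to_plain_dict(value):
    """Hypothetical helper: recursively turn nested defaultdicts into plain dicts."""
    if isinstance(value, dict):
        return {k: to_plain_dict(v) for k, v in value.items()}
    return value


a = nested_dict()
a['a']['b']['c'] = 1

plain = to_plain_dict(a)
print(plain)
```

After the conversion, looking up a missing key raises KeyError again instead of silently adding a new level.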