Creating a nested dictionary from a flattened dictionary

Tho*_*las 50 python recursion dictionary nested netcdf

I have a flat dictionary that I want to turn into a nested dictionary:

flat = {'X_a_one': 10,
        'X_a_two': 20, 
        'X_b_one': 10,
        'X_b_two': 20, 
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}

I want to convert it to the form

nested = {'X': {'a': {'one': 10,
                      'two': 20}, 
                'b': {'one': 10,
                      'two': 20}}, 
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}

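The flat dictionary is structured so that ambiguous keys should not occur. I want this to work for dictionaries of arbitrary depth, but performance is not really a concern. I have seen many methods for flattening a nested dictionary, but almost none for nesting a flattened one. The values stored in the dictionary are either scalars or strings, never iterables.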

So far I have something that can take the input

test_dict = {'X_a_one': '10',
             'X_b_one': '10',
             'X_c_one': '10'}

to the output

test_out = {'X': {'a_one': '10', 
                  'b_one': '10', 
                  'c_one': '10'}}

using the code

def nest_once(inp_dict):
    out = {}
    if isinstance(inp_dict, dict):
        for key, val in inp_dict.items():
            if '_' in key:
                head, tail = key.split('_', 1)

                if head not in out.keys():
                    out[head] = {tail: val}
                else:
                    out[head].update({tail: val})
            else:
                out[key] = val
    return out

test_out = nest_once(test_dict)

But I can't work out how to turn this into something that recursively creates all levels of the dictionary.

Any help would be appreciated!

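(As for why I want to do this: I have a file whose structure is equivalent to a nested dict, and I want to store its contents in the attributes dictionary of a NetCDF file and retrieve them later. However, NetCDF only lets you store flat dictionaries as attributes, so I want to unflatten the dictionary I previously stored in the NetCDF file.)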
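
For the round trip described above, the flattening direction might look like the following. This is only a minimal sketch and not part of the original post: the name `flatten_dict` and its `sep` and `prefix` parameters are hypothetical, and it assumes that no individual key contains the separator and that values are never dictionaries unless they represent a nested level.

```python
def flatten_dict(nested, sep="_", prefix=""):
    """Flatten a nested dict into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in nested.items():
        # Build the full key path, e.g. "X" -> "X_a" -> "X_a_one".
        full_key = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the key path along.
            flat.update(flatten_dict(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


example_nested = {"X": {"a": {"one": 10, "two": 20}}, "Y": {"b": {"one": 10}}}
example_flat = flatten_dict(example_nested)
print(example_flat)
# {'X_a_one': 10, 'X_a_two': 20, 'Y_b_one': 10}
```

Nesting the result again with any of the approaches in the answers should give back the original structure, as long as the assumptions above hold.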

jde*_*esa 26

Here is my take:

def nest_dict(flat):
    result = {}
    for k, v in flat.items():
        _nest_dict_rec(k, v, result)
    return result

def _nest_dict_rec(k, v, out):
    k, *rest = k.split('_', 1)
    if rest:
        _nest_dict_rec(rest[0], v, out.setdefault(k, {}))
    else:
        out[k] = v

flat = {'X_a_one': 10,
        'X_a_two': 20, 
        'X_b_one': 10,
        'X_b_two': 20, 
        'Y_a_one': 10,
        'Y_a_two': 20,
        'Y_b_one': 10,
        'Y_b_two': 20}
nested = {'X': {'a': {'one': 10,
                      'two': 20}, 
                'b': {'one': 10,
                      'two': 20}}, 
          'Y': {'a': {'one': 10,
                      'two': 20},
                'b': {'one': 10,
                      'two': 20}}}
print(nest_dict(flat) == nested)
# True
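
The code above relies on the question's guarantee that keys are never ambiguous. If the input contained both `'X'` and `'X_a'`, it could either raise a `TypeError` or silently overwrite a sub-dictionary, depending on iteration order. The following is a hedged sketch of the same recursive idea with an explicit check. The names `nest_dict_checked` and `_insert` are hypothetical, and it assumes (as the question states) that leaf values are never dictionaries.

```python
def nest_dict_checked(flat, sep="_"):
    """Nest a flat dict, raising ValueError when two keys conflict."""
    result = {}
    for key, value in flat.items():
        _insert(key.split(sep), value, result, key)
    return result


def _insert(parts, value, out, full_key):
    head, *rest = parts
    if rest:
        # Descend one level, creating the sub-dictionary if needed.
        child = out.setdefault(head, {})
        if not isinstance(child, dict):
            raise ValueError(f"{full_key!r} conflicts with a value stored at {head!r}")
        _insert(rest, value, child, full_key)
    else:
        # Refuse to replace an existing sub-dictionary with a leaf value.
        if isinstance(out.get(head), dict):
            raise ValueError(f"{full_key!r} would overwrite a nested dictionary")
        out[head] = value


print(nest_dict_checked({"X_a_one": 10, "X_a_two": 20}))
# {'X': {'a': {'one': 10, 'two': 20}}}
```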


cwa*_*ole 24

# `source` is the flat input dictionary (e.g. `flat` from the question).
output = {}

for k, v in source.items():
    # always start at the root.
    current = output

    # This is the part you're struggling with.
    pieces = k.split('_')

    # iterate from the beginning until the second to last place
    for piece in pieces[:-1]:
        if piece not in current:
            # if a dict doesn't exist at an index, then create one
            current[piece] = {}

        # as you walk into the structure, update your current location
        current = current[piece]

    # The reason you're using the second to last is because the last place
    # represents the place you're actually storing the item
    current[pieces[-1]] = v

  • In my opinion, unpacking inline is slightly more readable: `*initial_keys, final_key = k.split('_')`. But great answer! (3 upvotes)
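
The unpacking suggested in this comment could be combined with the loop from the answer roughly as follows. This is a sketch rather than the answerer's own code: the function name `nest_dict_iterative` and the `sep` parameter are hypothetical, and `dict.setdefault` stands in for the explicit membership test.

```python
def nest_dict_iterative(source, sep="_"):
    """Build a nested dict from flat keys such as 'X_a_one'."""
    output = {}
    for key, value in source.items():
        # Everything but the last piece names a level to walk into;
        # the last piece is the key under which the value is stored.
        *initial_keys, final_key = key.split(sep)

        current = output  # always start at the root
        for piece in initial_keys:
            # Create the sub-dictionary on first use, then step into it.
            current = current.setdefault(piece, {})

        current[final_key] = value
    return output


sample = {"X_a_one": 10, "X_a_two": 20, "Y_b_one": 10}
print(nest_dict_iterative(sample))
# {'X': {'a': {'one': 10, 'two': 20}}, 'Y': {'b': {'one': 10}}}
```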

jpp*_*jpp 13

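Here is one way using `collections.defaultdict`, borrowing heavily from the previous answer. There are 3 steps:

  1. Create a nested defaultdict of defaultdict objects.
  2. Iterate over the items of the flat input dictionary.
  3. Build the defaultdict result according to the structure derived from splitting the keys on `_`, using getFromDict to walk the result dictionary.

Here is a complete example: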

from collections import defaultdict
from functools import reduce
from operator import getitem

def getFromDict(dataDict, mapList):
    """Iterate nested dictionary"""
    return reduce(getitem, mapList, dataDict)

# instantiate nested defaultdict of defaultdicts
tree = lambda: defaultdict(tree)
d = tree()

# iterate input dictionary
for k, v in flat.items():
    *keys, final_key = k.split('_')
    getFromDict(d, keys)[final_key] = v

# Resulting structure of d (nested defaultdicts, shown here as plain dicts):
# {'X': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}},
#  'Y': {'a': {'one': 10, 'two': 20}, 'b': {'one': 10, 'two': 20}}}

As a final step, you can convert your defaultdict to a regular dict, though usually this step is not necessary.

def default_to_regular_dict(d):
    """Convert nested defaultdict to regular dict of dicts."""
    if isinstance(d, defaultdict):
        d = {k: default_to_regular_dict(v) for k, v in d.items()}
    return d

# convert back to regular dict
res = default_to_regular_dict(d)

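  • @DavidFoerster, this unpacks the list produced by `k.split('_')` into a list and a string, where the string is the last piece of the split. It removes the need for positional indexing later. (2 upvotes)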
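
A small illustration of the unpacking described in this comment. The variable names in the second half are made up for the example.

```python
# The starred name collects every piece except the last one into a list.
*keys, final_key = "X_a_one".split("_")
print(keys)       # ['X', 'a']
print(final_key)  # one

# A key without any separator leaves the starred list empty.
*no_parents, only_key = "X".split("_")
print(no_parents)  # []
print(only_key)    # X
```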