fdh*_*hex 108 python merge dictionary array-merge
I need to merge multiple dictionaries. Here is an example:
dict1 = {1:{"a":{A}}, 2:{"b":{B}}}
dict2 = {2:{"c":{C}}, 3:{"d":{D}}
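With A, B, C and D being leaves of the tree, like {"info1":"value", "info2":"value2"}
There is an unknown level (depth) of dictionaries; it could be {2:{"c":{"z":{"y":{C}}}}}
In my case it represents a directory/file structure, with nodes being docs and leaves being files.
I want to merge them to obtain: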
dict3 = {1:{"a":{A}}, 2:{"b":{B},"c":{C}}, 3:{"d":{D}}}
I am not sure how I could do that easily with Python.
and*_*oke 124
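This is actually quite tricky, particularly if you want a useful error message when things are inconsistent, while correctly accepting duplicate but consistent entries (something no other answer here does).
Assuming you don't have a huge number of entries, a recursive function is easiest: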
def merge(a, b, path=None):
    "merges b into a"
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a

# works
print(merge({1:{"a":"A"},2:{"b":"B"}}, {2:{"c":"C"},3:{"d":"D"}}))
# has conflict
merge({1:{"a":"A"},2:{"b":"B"}}, {1:{"a":"A"},2:{"b":"C"}})
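Note that this mutates a: the contents of b are added to a (which is also returned). If you want to keep a, you could call it as merge(dict(a), b). Be aware that dict(a) is only a shallow copy, so nested dicts inside a are still modified; use copy.deepcopy(a) if a must stay fully untouched.
agf pointed out (below) that you may have more than two dicts, in which case you can use: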
reduce(merge, [dict1, dict2, dict3...])
where everything will be added to dict1.
[Note: I edited my initial answer to mutate the first argument; that makes the "reduce" easier to explain.]
PS: in Python 3, you will also need from functools import reduce
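To illustrate the two points above (mutation of the first argument, and merging more than two dicts), here is a minimal sketch. It restates the merge function from this answer and adds merge_all, a hypothetical helper that is not part of the original answer. It assumes the inputs can be deep-copied with copy.deepcopy.

```python
import copy
from functools import reduce


def merge(a, b, path=None):
    """Merge b into a in place, raising on conflicting leaf values."""
    if path is None:
        path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                merge(a[key], b[key], path + [str(key)])
            elif a[key] == b[key]:
                pass  # same leaf value
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a


def merge_all(*dicts):
    """Hypothetical helper: merge any number of dicts without touching the inputs."""
    # Start from an empty dict and merge deep copies, so that no input
    # (and no nested dict inside an input) is mutated or shared.
    return reduce(merge, (copy.deepcopy(d) for d in dicts), {})


dict1 = {1: {"a": "A"}, 2: {"b": "B"}}
dict2 = {2: {"c": "C"}, 3: {"d": "D"}}
dict3 = {3: {"e": "E"}}

result = merge_all(dict1, dict2, dict3)
print(result)
print(dict1)  # still the original two entries
```

Starting the reduce from an empty dict keeps dict1 out of the "accumulator" role, which is the main difference from calling reduce(merge, [dict1, dict2, dict3]) directly.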
jte*_*ace 27
Here's an easy way to do it using generators:
def mergedicts(dict1, dict2):
    for k in set(dict1.keys()).union(dict2.keys()):
        if k in dict1 and k in dict2:
            if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                yield (k, dict(mergedicts(dict1[k], dict2[k])))
            else:
                # If one of the values is not a dict, you can't continue merging it.
                # Value from second dict overrides one in first and we move on.
                yield (k, dict2[k])
                # Alternatively, replace this with exception raiser to alert you of value conflicts
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])

dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
print(dict(mergedicts(dict1, dict2)))  # print is a function in Python 3
This prints:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
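The comment in the generator above mentions raising an exception instead of letting the second value win. One possible sketch of that variant follows; the name mergedicts_strict, the path parameter and the choice of ValueError are assumptions made for illustration, not part of the original answer.

```python
def mergedicts_strict(dict1, dict2, path=()):
    """Yield (key, value) pairs of the merge; raise ValueError on conflicting leaves."""
    for k in set(dict1) | set(dict2):
        if k in dict1 and k in dict2:
            v1, v2 = dict1[k], dict2[k]
            if isinstance(v1, dict) and isinstance(v2, dict):
                # Recurse, remembering where we are for the error message.
                yield (k, dict(mergedicts_strict(v1, v2, path + (k,))))
            elif v1 == v2:
                yield (k, v1)  # duplicate but consistent entry
            else:
                where = ".".join(str(p) for p in path + (k,))
                raise ValueError("Conflict at %s: %r != %r" % (where, v1, v2))
        elif k in dict1:
            yield (k, dict1[k])
        else:
            yield (k, dict2[k])


merged = dict(mergedicts_strict({1: {"a": "A"}, 2: {"b": "B"}},
                                {2: {"c": "C"}, 3: {"d": "D"}}))
print(merged)
```

Because the generator is lazy, the exception only surfaces once the result is consumed, for example by the dict(...) call.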
Tra*_*rke 26
You could try mergedeep.
Installation
$ pip3 install mergedeep
Usage
from mergedeep import merge
a = {"keyA": 1}
b = {"keyB": {"sub1": 10}}
c = {"keyB": {"sub2": 20}}
merge(a, b, c)
print(a)
# {"keyA": 1, "keyB": {"sub1": 10, "sub2": 20}}
For a full list of options, check out the docs!
小智 20
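One issue with this question is that the values of the dict can be arbitrarily complex pieces of data. Based on these and other answers, I came up with this code: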
class YamlReaderError(Exception):
    pass

def data_merge(a, b):
    """merges b into a and return merged result
    NOTE: tuples and arbitrary objects are not handled as it is totally ambiguous what should happen"""
    key = None
    # ## debug output
    # sys.stderr.write("DEBUG: %s to %s\n" %(b,a))
    try:
        # Python 3: str and int cover what unicode and long did in Python 2
        if a is None or isinstance(a, (str, int, float)):
            # border case for first run or if a is a primitive
            a = b
        elif isinstance(a, list):
            # lists can be only appended
            if isinstance(b, list):
                # merge lists
                a.extend(b)
            else:
                # append to list
                a.append(b)
        elif isinstance(a, dict):
            # dicts must be merged
            if isinstance(b, dict):
                for key in b:
                    if key in a:
                        a[key] = data_merge(a[key], b[key])
                    else:
                        a[key] = b[key]
            else:
                raise YamlReaderError('Cannot merge non-dict "%s" into dict "%s"' % (b, a))
        else:
            raise YamlReaderError('NOT IMPLEMENTED "%s" into "%s"' % (b, a))
    except TypeError as e:  # "except TypeError, e" is Python 2 only syntax
        raise YamlReaderError('TypeError "%s" in key "%s" when merging "%s" into "%s"' % (e, key, b, a))
    return a
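My use case is merging YAML files, where I only have to deal with a subset of the possible data types. Hence I can ignore tuples and other objects. For me, a sensible merge logic means:
- scalars are replaced
- lists are appended to
- dicts are merged by adding missing keys and updating existing ones
Everything else, and anything unforeseen, results in an error.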
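The merge rules described in this answer can be seen in a small run. The sketch below is a condensed Python 3 restatement of data_merge (without the TypeError wrapping), and the configuration values are made-up sample data, not taken from the answer.

```python
class YamlReaderError(Exception):
    pass


def data_merge(a, b):
    """Merge b into a and return the result (condensed Python 3 sketch)."""
    if a is None or isinstance(a, (str, int, float)):
        return b                      # scalars (and None) are replaced
    if isinstance(a, list):
        if isinstance(b, list):
            a.extend(b)               # list + list: concatenate
        else:
            a.append(b)               # list + anything else: append
        return a
    if isinstance(a, dict):
        if not isinstance(b, dict):
            raise YamlReaderError('Cannot merge non-dict %r into dict %r' % (b, a))
        for key in b:
            a[key] = data_merge(a[key], b[key]) if key in a else b[key]
        return a
    # tuples, sets and arbitrary objects are deliberately unsupported
    raise YamlReaderError('NOT IMPLEMENTED %r into %r' % (b, a))


# Made-up sample data standing in for two parsed YAML documents.
base = {"name": "app", "ports": [80], "db": {"host": "localhost", "port": 5432}}
override = {"name": "app-prod", "ports": [443], "db": {"host": "db.internal"}}

merged = data_merge(base, override)
print(merged)
```

Note that base is modified in place and returned, so merged and base refer to the same dict.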
Aar*_*all 12
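Dictionaries of dictionaries merge
As this is the canonical question (in spite of certain non-generalities), I'm providing the canonical Pythonic approach to solving this issue.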
d1 = {'a': {1: {'foo': {}}, 2: {}}}
d2 = {'a': {1: {}, 2: {'bar': {}}}}
d3 = {'b': {3: {'baz': {}}}}
d4 = {'a': {1: {'quux': {}}}}
This is the simplest case for recursion, and I would recommend two naive approaches:
def rec_merge1(d1, d2):
    '''return new merged dict of dicts'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge1(v, d2[k])  # note: this also updates d2 in place
    d3 = d1.copy()
    d3.update(d2)
    return d3

def rec_merge2(d1, d2):
    '''update first dict with second recursively'''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            d2[k] = rec_merge2(v, d2[k])
    d1.update(d2)
    return d1
I believe I would prefer the second to the first, but keep in mind that the original state of the first would have to be rebuilt from its origin. Here's the usage:
>>> from functools import reduce # only required for Python 3.
>>> reduce(rec_merge1, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
>>> reduce(rec_merge2, (d1, d2, d3, d4))
{'a': {1: {'quux': {}, 'foo': {}}, 2: {'bar': {}}}, 'b': {3: {'baz': {}}}}
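So if they end in dicts, it's a simple case of merging the empty dicts at the end. If not, it's not so trivial. If they are strings, how do you merge them? Sets can be updated similarly, so we could give them that treatment, but we lose the order in which they were merged. So does order matter?
So, in lieu of more information, the simplest approach is to give them the standard update treatment when the two values are not both dicts: the second dict's value overwrites the first, even if the second dict's value is None and the first's value is a dict with a lot of info.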
d1 = {'a': {1: 'foo', 2: None}}
d2 = {'a': {1: None, 2: 'bar'}}
d3 = {'b': {3: 'baz'}}
d4 = {'a': {1: 'quux'}}
from collections.abc import MutableMapping  # the collections.MutableMapping alias was removed in Python 3.10

def rec_merge(d1, d2):
    '''
    Update two dicts of dicts recursively,
    if either mapping has leaves that are non-dicts,
    the second's leaf overwrites the first's.
    '''
    for k, v in d1.items(): # in Python 2, use .iteritems()!
        if k in d2:
            # this next check is the only difference!
            if all(isinstance(e, MutableMapping) for e in (v, d2[k])):
                d2[k] = rec_merge(v, d2[k])
            # we could further check types and merge as appropriate here.
    d3 = d1.copy()
    d3.update(d2)
    return d3
And now
from functools import reduce
reduce(rec_merge, (d1, d2, d3, d4))
returns
{'a': {1: 'quux', 2: 'bar'}, 'b': {3: 'baz'}}
I had to remove the curly braces around the letters and put them in single quotes for this to be legitimate Python (otherwise they would be set literals in Python 2.7+), as well as append a missing brace:
dict1 = {1:{"a":'A'}, 2:{"b":'B'}}
dict2 = {2:{"c":'C'}, 3:{"d":'D'}}
and rec_merge(dict1, dict2) now returns:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
This matches the desired outcome of the original question (after changing, e.g., {A} to 'A').
小智 9
Based on @andrew cooke's answer. This version handles nested lists of dicts and also offers an option to update values.
def merge(a, b, path=None, update=True):
    """merges b into a
    http://stackoverflow.com/questions/7204805/python-dictionaries-of-dictionaries-merge
    """
    if path is None: path = []
    for key in b:
        if key in a:
            if isinstance(a[key], dict) and isinstance(b[key], dict):
                # pass update along so nested dicts follow the same policy
                merge(a[key], b[key], path + [str(key)], update=update)
            elif a[key] == b[key]:
                pass # same leaf value
            elif isinstance(a[key], list) and isinstance(b[key], list):
                # assumes both lists hold dicts and b's list is no longer than a's
                for idx, val in enumerate(b[key]):
                    a[key][idx] = merge(a[key][idx], val, path + [str(key), str(idx)], update=update)
            elif update:
                a[key] = b[key]
            else:
                raise Exception('Conflict at %s' % '.'.join(path + [str(key)]))
        else:
            a[key] = b[key]
    return a
Short and sweet:
from collections.abc import MutableMapping as Map

def nested_update(d, v):
    """
    Nested update of dict-like 'd' with dict-like 'v'.
    """
    for key in v:
        if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
            nested_update(d[key], v[key])
        else:
            d[key] = v[key]
This works like (and is modeled on) Python's dict.update method. It returns None (you can always add return d if you prefer), because it updates the dict d in place. Keys in v overwrite any existing keys in d (it does not try to interpret the dict's contents).
It also works for other ("dict-like") mappings.
Example:
people = {'pete': {'gender': 'male'}, 'mary': {'age': 34}}
nested_update(people, {'pete': {'age': 41}})
# Pete's age was merged in
print(people)
{'pete': {'gender': 'male', 'age': 41}, 'mary': {'age': 34}}
Python's regular dict.update method yields:
people = {'pete': {'gender': 'male'}, 'mary': {'age': 34}}
people.update({'pete': {'age': 41}})
# We lost Pete's gender here!
print(people)
{'pete': {'age': 41}, 'mary': {'age': 34}}
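As a hedged illustration of the claim that nested_update also works for other dict-like mappings, the sketch below applies it to a collections.UserDict subclass. The Registry class is hypothetical, and this variant adds the optional return d mentioned above.

```python
from collections import UserDict
from collections.abc import MutableMapping


def nested_update(d, v):
    """Nested update of dict-like 'd' with dict-like 'v'; returns d for convenience."""
    for key in v:
        if key in d and isinstance(d[key], MutableMapping) and isinstance(v[key], MutableMapping):
            nested_update(d[key], v[key])
        else:
            d[key] = v[key]
    return d


class Registry(UserDict):
    """Hypothetical dict-like class, used only to show a non-dict mapping."""


people = Registry({'pete': {'gender': 'male'}, 'mary': {'age': 34}})
updated = nested_update(people, {'pete': {'age': 41}, 'mary': {'age': 35}})

print(type(updated).__name__)
print(dict(updated))
```

The isinstance checks use the MutableMapping abstract base class rather than dict, which is what lets UserDict (and similar classes) take part in the recursion.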
If you have an unknown level of dictionaries, then I would suggest a recursive function:
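def combineDicts(dictionary1, dictionary2):
    # Note: shared keys whose values are both dicts are popped from dictionary2
    output = {}
    for item, value in dictionary1.items():  # .iteritems() in Python 2
        if item in dictionary2:  # dict.has_key() was removed in Python 3
            if isinstance(value, dict) and isinstance(dictionary2[item], dict):
                output[item] = combineDicts(value, dictionary2.pop(item))
        else:
            output[item] = value
    # whatever is left in dictionary2 overrides or extends the output
    for item, value in dictionary2.items():
        output[item] = value
    return output
This simple recursive procedure will merge one dictionary into another while overriding conflicting keys: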
#!/usr/bin/env python2.7

def merge_dicts(dict1, dict2):
    """ Recursively merges dict2 into dict1 """
    if not isinstance(dict1, dict) or not isinstance(dict2, dict):
        return dict2
    for k in dict2:
        if k in dict1:
            dict1[k] = merge_dicts(dict1[k], dict2[k])
        else:
            dict1[k] = dict2[k]
    return dict1

print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {2:{"c":"C"}, 3:{"d":"D"}}))
print (merge_dicts({1:{"a":"A"}, 2:{"b":"B"}}, {1:{"a":"A"}, 2:{"b":"C"}}))
Output:
{1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
{1: {'a': 'A'}, 2: {'b': 'C'}}
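Overview
The following approach subdivides the problem of a deep merge of dicts into:
- A parameterized shallow merge function merge(f)(a,b) that uses a function f to merge two dicts a and b
- A recursive merger function f to be used together with merge
Implementation
A function for merging two (non-nested) dicts can be written in a lot of ways. I personally like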
def merge(f):
    def merge(a,b):
        keys = a.keys() | b.keys()
        return {key:f(a.get(key), b.get(key)) for key in keys}
    return merge
A nice way of defining an appropriate recursive merger function f is to use multipledispatch, which lets you define functions that evaluate along different paths depending on the types of their arguments.
from multipledispatch import dispatch

#for anything that is not a dict return
@dispatch(object, object)
def f(a, b):
    return b if b is not None else a

#for dicts recurse
@dispatch(dict, dict)
def f(a,b):
    return merge(f)(a,b)
Example
To merge two nested dicts, simply use merge(f), e.g.:
dict1 = {1:{"a":"A"},2:{"b":"B"}}
dict2 = {2:{"c":"C"},3:{"d":"D"}}
merge(f)(dict1, dict2)
#returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}}
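Notes:
The advantages of this approach are:
- The function is built from smaller functions that each do a single thing, which makes the code simpler to reason about and test
- The behaviour is not hard-coded but can be changed and extended as needed, which improves code reuse (see the example below).
Customization
Some answers also considered dicts that contain lists, e.g. of other (potentially nested) dicts. In this case one might want to map over the lists and merge them based on position. This can be done by adding another definition to the merger function f: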
import itertools

@dispatch(list, list)
def f(a,b):
    # call f (not merge(f)) on each pair, so that the None padding from
    # zip_longest and non-dict elements are handled by the other definitions
    return [f(*arg) for arg in itertools.zip_longest(a, b)]
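For readers who do not have multipledispatch installed, the same decomposition can be sketched with the standard library only. This is a swapped-in technique, not the author's code: the type dispatch is replaced by explicit isinstance checks, and the inner function name merge_with is an assumption made to avoid shadowing.

```python
from itertools import zip_longest


def merge(f):
    """Return a shallow merge function that combines values with f."""
    def merge_with(a, b):
        keys = a.keys() | b.keys()
        return {key: f(a.get(key), b.get(key)) for key in keys}
    return merge_with


def f(a, b):
    """Recursive merger: dicts recurse, lists merge by position, otherwise b wins."""
    if isinstance(a, dict) and isinstance(b, dict):
        return merge(f)(a, b)
    if isinstance(a, list) and isinstance(b, list):
        # zip_longest pads the shorter list with None, which the
        # fallback below treats as "no value on this side".
        return [f(x, y) for x, y in zip_longest(a, b)]
    return b if b is not None else a


dict1 = {1: {"a": "A"}, 2: {"b": "B"}}
dict2 = {2: {"c": "C"}, 3: {"d": "D"}}
print(merge(f)(dict1, dict2))

lists = merge(f)({"items": [{"x": 1}, {"y": 2}]}, {"items": [{"z": 3}]})
print(lists)
```

The trade-off is that adding a new type means editing f itself, whereas the dispatch-based version lets you register another definition without touching the existing ones.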
小智 5
Based on @andrew cooke's answer. It takes care of nested lists in a better way.
def deep_merge_lists(original, incoming):
    """
    Deep merge two lists. Modifies original.
    Recursively call deep merge on each correlated element of list.
    If item type in both elements are
     a. dict: Call deep_merge_dicts on both values.
     b. list: Recursively call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.
    If the incoming list is longer than the original, the extra values are appended.
    """
    common_length = min(len(original), len(incoming))
    for idx in range(common_length):
        if isinstance(original[idx], dict) and isinstance(incoming[idx], dict):
            deep_merge_dicts(original[idx], incoming[idx])
        elif isinstance(original[idx], list) and isinstance(incoming[idx], list):
            deep_merge_lists(original[idx], incoming[idx])
        else:
            original[idx] = incoming[idx]
    for idx in range(common_length, len(incoming)):
        original.append(incoming[idx])

def deep_merge_dicts(original, incoming):
    """
    Deep merge two dictionaries. Modifies original.
    For key conflicts if both values are:
     a. dict: Recursively call deep_merge_dicts on both values.
     b. list: Call deep_merge_lists on both values.
     c. any other type: Value is overridden.
     d. conflicting types: Value is overridden.
    """
    for key in incoming:
        if key in original:
            if isinstance(original[key], dict) and isinstance(incoming[key], dict):
                deep_merge_dicts(original[key], incoming[key])
            elif isinstance(original[key], list) and isinstance(incoming[key], list):
                deep_merge_lists(original[key], incoming[key])
            else:
                original[key] = incoming[key]
        else:
            original[key] = incoming[key]
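In case someone wants yet another approach to this problem, here's my solution.
Virtues: short, declarative, and functional in style (recursive, does no mutation).
Potential drawback: this might not be the merge you're looking for. Consult the docstring for the semantics.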
def deep_merge(a, b):
    """
    Merge two values, with `b` taking precedence over `a`.
    Semantics:
    - If either `a` or `b` is not a dictionary, `a` will be returned only if
      `b` is `None`. Otherwise `b` will be returned.
    - If both values are dictionaries, they are merged as follows:
        * Each key that is found only in `a` or only in `b` will be included in
          the output collection with its value intact.
        * For any key in common between `a` and `b`, the corresponding values
          will be merged with the same semantics.
    """
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if b is None else b
    else:
        # If we're here, both a and b must be dictionaries or subtypes thereof.
        # Compute set of all keys in both dictionaries.
        keys = set(a.keys()) | set(b.keys())
        # Build output dictionary, merging recursively values with common keys,
        # where `None` is used to mean the absence of a value.
        return {
            key: deep_merge(a.get(key), b.get(key))
            for key in keys
        }
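The semantics in the docstring above, in particular the role of None, are easier to see in a short run. The sketch below restates deep_merge in compact form and folds it over three dicts with functools.reduce; the settings values are made-up sample data.

```python
from functools import reduce


def deep_merge(a, b):
    """Compact restatement: b wins unless it is None; dicts merge key by key."""
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if b is None else b
    keys = set(a.keys()) | set(b.keys())
    return {key: deep_merge(a.get(key), b.get(key)) for key in keys}


# Made-up sample data: defaults, then two layers of overrides.
defaults = {"log": {"level": "info", "file": None}, "debug": False}
user = {"log": {"level": "debug"}, "debug": None}
extra = {"log": {"file": "app.log"}}

merged = reduce(deep_merge, [defaults, user, extra])
print(merged)
print(defaults)  # the inputs are left as they were
```

Two consequences of these semantics are worth keeping in mind: a None in b cannot override an existing value (here "debug" stays False), and a subtree that exists on one side only is returned as the same object rather than a copy, so the result may share nested dicts with its inputs.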