Join two lists of dictionaries on a single key

Bac*_*con 26 python

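Given n lists, each containing m dictionaries as elements, I would like to produce a new list containing the joined dictionaries. Each dictionary is guaranteed to have a key called "index", but beyond that it can have an arbitrary set of keys. The non-index keys never overlap across lists. For example, imagine the following two lists: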

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "green":"eggs"}]
l2 = [{"index":1, "c":4}, {"index":2, "c":5}]

("b" will never appear in l2, since it appears in l1; similarly, "c" will never appear in l1, since it appears in l2)

I would like to produce a joined list:

l3 = [{"index":1, "b":2, "c":4}, 
      {"index":2, "b":3, "c":5}, 
      {"index":3, "green":"eggs"}]

What is the most efficient way to do this in Python?

eum*_*iro 36

from collections import defaultdict

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "green":"eggs"}]
l2 = [{"index":1, "c":4}, {"index":2, "c":5}]

d = defaultdict(dict)
for l in (l1, l2):
    for elem in l:
        d[elem['index']].update(elem)
l3 = list(d.values())  # list() so that l3 is a real list on Python 3, not a view

# l3 is now:

[{'b': 2, 'c': 4, 'index': 1},
 {'b': 3, 'c': 5, 'index': 2},
 {'green': 'eggs', 'index': 3}]

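Edit: Since l3 is not guaranteed to be sorted by index (before Python 3.7, .values() returns items in no specific order; from 3.7 on it follows insertion order, which need not be index order), you can do as @user560833 suggests: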

from operator import itemgetter

...

l3 = sorted(d.values(), key=itemgetter("index"))

  • You need to sort l3 afterwards - the list is not guaranteed to be in index order. For example: `from operator import itemgetter; l3.sort(key=itemgetter("index"))` (2 upvotes)
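
The question asks about n lists, so the defaultdict approach above can be wrapped in a function that accepts any number of lists. This is only a minimal sketch: the name `join_on_key` and its `key` parameter are made up for illustration, and it assumes every dictionary contains the join key.

```python
from collections import defaultdict
from operator import itemgetter


def join_on_key(*lists, key="index"):
    """Merge dictionaries that share the same value for `key` (hypothetical helper)."""
    merged = defaultdict(dict)
    for dicts in lists:
        for elem in dicts:
            # Assumption: every dictionary contains `key`; otherwise this raises KeyError.
            merged[elem[key]].update(elem)
    # Sort so the result is ordered by key regardless of input order.
    return sorted(merged.values(), key=itemgetter(key))


l1 = [{"index": 1, "b": 2}, {"index": 2, "b": 3}, {"index": 3, "green": "eggs"}]
l2 = [{"index": 1, "c": 4}, {"index": 2, "c": 5}]
l3 = join_on_key(l1, l2)
print(l3)
```

Building the dictionary takes one pass over all elements, so the sort at the end is the only step that is not linear in the total number of dictionaries.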

pau*_*ult 14

In Python 3.5 or later, you can merge dictionaries in a single statement.
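
As a small illustration of that single-statement merge, here is dictionary unpacking on two sample dictionaries (the variable names are arbitrary). It also shows that when a key occurs in both operands, the right-hand value wins.

```python
u = {"index": 1, "b": 2}
v = {"index": 1, "c": 4}

# Builds a new dict; u and v themselves are left unchanged.
merged = {**u, **v}
print(merged)  # {'index': 1, 'b': 2, 'c': 4}

# When a key appears in both operands, the value from the right-hand one wins.
overlap = {**{"index": 3, "green": "eggs"}, **{"index": 0, "green": "ham"}}
print(overlap)  # {'index': 0, 'green': 'ham'}
```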

So for Python 3.5 or later, a quick solution would be:

from itertools import zip_longest

l3 = [{**u, **v} for u, v in zip_longest(l1, l2, fillvalue={})]

print(l3)
#[
#    {'index': 1, 'b': 2, 'c': 4}, 
#    {'index': 2, 'b': 3, 'c': 5}, 
#    {'index': 3, 'green': 'eggs'}
#]

However, if the two lists are the same size, you can simply use zip:

l3 = [{**u, **v} for u, v in zip(l1, l2)]

Note: This assumes that the lists are sorted the same way by index, which the OP states is not the case in general.
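
One way to test that assumption before relying on zip or zip_longest is a quick positional check. The helper name `aligned_on_key` below is hypothetical, and the sketch assumes every dictionary contains the key.

```python
def aligned_on_key(l1, l2, key="index"):
    """Return True if every positional pair of dicts shares the same key value."""
    # zip stops at the shorter list, so only the common prefix is compared.
    return all(u[key] == v[key] for u, v in zip(l1, l2))


same_order = [{"index": 1, "b": 2}, {"index": 2, "b": 3}, {"index": 3, "green": "eggs"}]
same_order_other = [{"index": 1, "c": 4}, {"index": 2, "c": 5}]
shuffled = [{"index": 2, "c": 5}, {"index": 1, "c": 4}]

print(aligned_on_key(same_order, same_order_other))  # True
print(aligned_on_key(same_order, shuffled))          # False
```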

To generalize for the unsorted case, one approach is to write a custom zip-longest style function that pairs values from the two lists only when they match on a key.

For instance:

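def sortedZipLongest(l1, l2, key, fillvalue=None):
    # Avoid a shared mutable default argument: create the empty dict per call
    if fillvalue is None:
        fillvalue = {}
    # Sort both inputs by the join key so they can be walked in lockstep
    l1 = iter(sorted(l1, key=lambda x: x[key]))
    l2 = iter(sorted(l2, key=lambda x: x[key]))
    u = next(l1, None)
    v = next(l2, None)

    while (u is not None) or (v is not None):
        if u is None:
            # l1 is exhausted: emit the remaining items of l2
            yield fillvalue, v
            v = next(l2, None)
        elif v is None:
            # l2 is exhausted: emit the remaining items of l1
            yield u, fillvalue
            u = next(l1, None)
        elif u[key] == v[key]:
            # Keys match: emit the pair and advance both iterators
            yield u, v
            u = next(l1, None)
            v = next(l2, None)
        elif u[key] < v[key]:
            # u has no partner in l2: emit it alone
            yield u, fillvalue
            u = next(l1, None)
        else:
            # v has no partner in l1: emit it alone
            yield fillvalue, v
            v = next(l2, None)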

Now, if you have the following out-of-order lists:

l1 = [{"index":1, "b":2}, {"index":2, "b":3}, {"index":3, "green":"eggs"}, 
      {"index":4, "b": 4}]
l2 = [{"index":1, "c":4}, {"index":2, "c":5}, {"index":0, "green": "ham"}, 
      {"index":4, "green": "ham"}]

Use the sortedZipLongest function instead of itertools.zip_longest:

l3 = [{**u, **v} for u, v in sortedZipLongest(l1, l2, key="index", fillvalue={})]
print(l3)
#[{'index': 0, 'green': 'ham'},
# {'index': 1, 'b': 2, 'c': 4},
# {'index': 2, 'b': 3, 'c': 5},
# {'index': 3, 'green': 'eggs'},
# {'index': 4, 'b': 4, 'green': 'ham'}]

Whereas the original method would produce an incorrect answer:

l3 = [{**u, **v} for u, v in zip_longest(l1, l2, fillvalue={})]
print(l3)
#[{'index': 1, 'b': 2, 'c': 4},
# {'index': 2, 'b': 3, 'c': 5},
# {'index': 0, 'green': 'ham'},
# {'index': 4, 'b': 4, 'green': 'ham'}]
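
The same sort-then-merge idea can be sketched for any number of lists using only the standard library: `heapq.merge` interleaves the sorted lists by key, and `itertools.groupby` collects the dictionaries that share a key value. The function name `sorted_merge_join` is made up for this illustration, it assumes every dictionary contains the key, and it is an alternative sketch rather than the approach given in this answer.

```python
from heapq import merge
from itertools import groupby
from operator import itemgetter


def sorted_merge_join(*lists, key="index"):
    """Join any number of lists of dicts on `key` (hypothetical helper)."""
    get_key = itemgetter(key)
    # Sort each list by the key, then lazily merge them into one sorted stream.
    sorted_lists = [sorted(lst, key=get_key) for lst in lists]
    result = []
    for _, group in groupby(merge(*sorted_lists, key=get_key), key=get_key):
        combined = {}
        for d in group:
            # Dicts from later lists overwrite earlier ones on duplicate keys.
            combined.update(d)
        result.append(combined)
    return result


l1 = [{"index": 1, "b": 2}, {"index": 2, "b": 3}, {"index": 3, "green": "eggs"},
      {"index": 4, "b": 4}]
l2 = [{"index": 1, "c": 4}, {"index": 2, "c": 5}, {"index": 0, "green": "ham"},
      {"index": 4, "green": "ham"}]
l3 = sorted_merge_join(l1, l2)
print(l3)
```

The `key` argument of `heapq.merge` requires Python 3.5 or later, the same minimum version as the dictionary unpacking used above.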