166 python dictionary
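Suppose I have two Python dictionaries, dictA and dictB. I need to find out whether there are any keys that are present in dictB but not in dictA. What is the fastest way to do that?
Should I convert the dictionary keys into sets and go from there?
Interested in knowing your thoughts...
Thanks for your responses.
Apologies for not stating my question properly. My scenario is this: I have a dictA that can be the same as dictB, or may have some keys missing compared to dictB, or the values of some keys may differ and have to be set to the value of the corresponding dictA key.
The problem is that the dictionaries have no standard layout, and values can themselves be dicts of dicts.
Say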
dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2:':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......
So the 'key2' value has to be reset to the new value, and 'key13' has to be added inside the nested dict. The key values have no fixed format: a value can be a simple value, a dict, or a dict of dicts.
hug*_*own 233
You can use set operations on the keys:
diff = set(dictb.keys()) - set(dicta.keys())
Here is a class that finds all the possibilities: what was added, what was removed, which key-value pairs are the same, and which key-value pairs changed.
class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect
    def removed(self):
        return self.set_past - self.intersect
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
Here is some sample output:
>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print("Added:", d.added())
Added: {'d'}
>>> print("Removed:", d.removed())
Removed: {'c'}
>>> print("Changed:", d.changed())
Changed: {'b'}
>>> print("Unchanged:", d.unchanged())
Unchanged: {'a'}
Available as a GitHub repo: https://github.com/hughdbrown/dictdiffer
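On Python 3, the same four categories can probably be computed straight from the dictionary key views, without building sets first. This is only a minimal sketch and not part of the dictdiffer repo: the name `shallow_diff` is made up for illustration, and like `DictDiffer` it compares top-level values only.

```python
def shallow_diff(current, past):
    """Classify top-level keys of two dicts (hypothetical helper, Python 3)."""
    # dict.keys() returns a set-like view, so set operators work directly.
    common = current.keys() & past.keys()
    return {
        "added": current.keys() - past.keys(),
        "removed": past.keys() - current.keys(),
        "changed": {k for k in common if current[k] != past[k]},
        "unchanged": {k for k in common if current[k] == past[k]},
    }


a = {'a': 1, 'b': 1, 'c': 0}
b = {'a': 1, 'b': 2, 'd': 0}
result = shallow_diff(b, a)
print(result["added"], result["removed"], result["changed"], result["unchanged"])
# {'d'} {'c'} {'b'} {'a'}
```

Returning plain sets keeps the result easy to combine with further set operations.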
Sep*_*man 56
If you want the difference recursively, I have written a package for Python: https://github.com/seperman/deepdiff
Install from PyPI:
pip install deepdiff
Importing
>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2
Same object returns empty
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}
Type of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}
Value of an item has changed
>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
Item added and/or removed
>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
String difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}
String difference 2
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}
>>>
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
---
+++
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End
Type change
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}
List difference
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}
List difference 2:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}
List difference ignoring order or duplicates (with the same dictionaries as above):
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}
List that contains a dictionary:
>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}
Sets:
>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
{'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}
Named tuples:
>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}
Custom objects:
>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
...
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>>
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
Object attribute added:
>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
gho*_*g74 18
Not sure whether it's "fast" or not, but normally you can do this:
dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
# Print every key of dicta that is missing from dictb.
for key in dicta.keys():
    if key not in dictb:
        print(key)
Joc*_*zel 15
As Alex Martelli wrote, if you simply want to check whether any key in B is not in A, any(True for k in dictB if k not in dictA) is the way to go.
To find the keys that are missing:
diff = set(dictB)-set(dictA) #sets
C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop
diff = [ k for k in dictB if k not in dictA ] #lc
C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA = dict(zip(range(1000),range(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop
So the two solutions run at pretty much the same speed (107 vs. 95.9 usec per loop in this test).
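To repeat this comparison on your own machine without the shell quoting, something like the following sketch with the standard timeit module should work. The test data matches the command-line runs above; the loop and repeat counts are arbitrary choices, and the numbers will differ by machine and Python version.

```python
import timeit

# Same test data as the command-line runs above.
SETUP = (
    "dictA = dict(zip(range(1000), range(1000)))\n"
    "dictB = dict(zip(range(0, 2000, 2), range(1000)))"
)

STATEMENTS = {
    "set difference": "diff = set(dictB) - set(dictA)",
    "list comprehension": "diff = [k for k in dictB if k not in dictA]",
}

LOOPS = 1000  # arbitrary; raise it for steadier numbers

timings = {}
for label, stmt in STATEMENTS.items():
    # repeat() returns one total per run; the minimum is the least noisy.
    best = min(timeit.repeat(stmt, setup=SETUP, number=LOOPS, repeat=3))
    timings[label] = best / LOOPS * 1e6  # microseconds per loop
    print(f"{label}: {timings[label]:.1f} usec per loop")
```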
Ale*_*lli 13
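If you really mean exactly what you say (that you only need to find out IF there are any keys in B that are not in A, not WHICH ones those might be), the fastest way should be: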
if any(True for k in dictB if k not in dictA): ...
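If you actually need to find out WHICH keys, if any, are in B and not in A, and not just IF there are such keys, then the existing answers are quite appropriate (but I do suggest more precision in future questions if that's indeed what you mean ;-).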
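The difference between the two questions can be sketched like this. The sample dictionaries are invented for illustration, and the key-view comparison assumes Python 3.

```python
dictA = {"key1": 1, "key2": 2}
dictB = {"key1": 1, "key2": 20, "key13": 3}

# Yes/no question: any() stops at the first key of dictB missing from dictA.
has_missing = any(k not in dictA for k in dictB)

# Same answer via set-like key views: is B's key set NOT a subset of A's?
has_missing_views = not (dictB.keys() <= dictA.keys())

# "Which keys?" question: this has to look at every key of dictB.
missing = [k for k in dictB if k not in dictA]

print(has_missing, has_missing_views, missing)
# True True ['key13']
```

The any() form can exit early, which matters most when a missing key tends to show up near the start of the iteration.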
The top answer by hughdbrown suggests using set difference, which is definitely the best approach:
diff = set(dictb.keys()) - set(dicta.keys())
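The problem with this code is that it builds two lists just to create two sets (in Python 2, where keys() returns a list), so it wastes 4N time and 2N space. It's also a bit more complicated than it needs to be.
Usually this is not a big deal, but if it is: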
diff = dictb.keys() - dicta
Every collections.abc.Mapping has a KeysView that behaves like a Set. In Python 2, keys() returns a list of the keys, not a KeysView, so you have to ask for viewkeys() directly.
diff = dictb.viewkeys() - dicta
For dual-version 2.7/3.x code, you're hopefully using six or something similar, so you can use six.viewkeys(dictb):
diff = six.viewkeys(dictb) - dicta
In 2.4-2.6, there is no KeysView. But you can at least cut the cost from 4N to N by building your left set directly from an iterator instead of building a list first:
diff = set(dictb).difference(dicta)  # the - operator on a plain set requires a set on the right
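"I have a dictA which can be the same as dictB or may have some keys missing as compared to dictB or else the value of some keys might be different"
So you really don't need to compare the keys, but the items. An ItemsView is only a Set if the values are hashable, like strings. If they are, it's easy: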
diff = dictb.items() - dicta.items()
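A small sketch of both cases, assuming Python 3. The sample values are invented, and the fallback at the end is just one possible workaround for unhashable values, not something from the original answer.

```python
dicta = {"key1": "a", "key2": "b"}
dictb = {"key1": "a", "key2": "newb", "key13": "ee"}

# Hashable values: items views act like sets of (key, value) pairs.
diff = dictb.items() - dicta.items()
print(sorted(diff))  # [('key13', 'ee'), ('key2', 'newb')]

# Unhashable values (for example nested dicts) make the set operation fail.
nested_a = {"key3": {"key11": "cc"}}
nested_b = {"key3": {"key11": "cc", "key13": "ee"}}
try:
    nested_b.items() - nested_a.items()
except TypeError as exc:
    print("TypeError:", exc)

# Fallback that only needs ==, not hashing.
_MISSING = object()  # sentinel so that a stored None still compares correctly
changed = {k: v for k, v in nested_b.items() if nested_a.get(k, _MISSING) != v}
print(changed)  # {'key3': {'key11': 'cc', 'key13': 'ee'}}
```

The fallback reports the whole nested value as changed; it does not say which inner key differs.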
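Although the question isn't directly asking for a recursive diff, some of the example values are dicts, and it appears the expected output does recursively diff them. There are already multiple answers here showing how to do that.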
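For a dependency-free starting point, a recursive diff over nested dicts could look roughly like the sketch below. The function name `recursive_diff` and the tuple layout are made up for illustration, the sample values stand in for the placeholders in the question, and only dict values are recursed into (lists, sets and objects are compared with plain ==).

```python
def recursive_diff(old, new, path=()):
    """Return a list of (kind, key_path, old_value, new_value) tuples.

    Hypothetical helper for illustration; it only recurses into dicts.
    """
    changes = []
    for key in old:
        here = path + (key,)
        if key not in new:
            changes.append(("removed", here, old[key], None))
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            # Both sides are dicts: descend and keep track of the key path.
            changes.extend(recursive_diff(old[key], new[key], here))
        elif old[key] != new[key]:
            changes.append(("changed", here, old[key], new[key]))
    for key in new:
        if key not in old:
            changes.append(("added", path + (key,), None, new[key]))
    return changes


dictA = {"key1": "a", "key2": "b", "key3": {"key11": "cc", "key12": "dd"}}
dictB = {"key1": "a", "key2": "newb",
         "key3": {"key11": "cc", "key12": "newdd", "key13": "ee"}}

for change in recursive_diff(dictA, dictB):
    print(change)
# ('changed', ('key2',), 'b', 'newb')
# ('changed', ('key3', 'key12'), 'dd', 'newdd')
# ('added', ('key3', 'key13'), None, 'ee')
```

Returning key paths as tuples makes it straightforward to walk back into either dictionary and apply the change.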