Python - Finding the intersection of values in a dictionary

Jon*_*ier 2 python dictionary intersection

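I'm writing a function to handle multiple queries in a boolean AND search. I have a dictionary, query_dict, that maps each query to the documents in which it occurs.

I want the intersection of all the values in query_dict.values():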

query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

intersect(query_dict)

>> doc_two.txt

I've been reading about intersection, but I'm finding it hard to apply to a dictionary.

Thanks for your help!

ins*_*get 10

In [36]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

In [37]: reduce(set.intersection, (set(val) for val in query_dict.values()))
Out[37]: set(['doc_two.txt'])

In [41]: query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

set.intersection(*(set(val) for val in query_dict.values())) is also a valid solution, though it is slightly slower in this test:

In [42]: %timeit reduce(set.intersection, (set(val) for val in query_dict.values()))
100000 loops, best of 3: 2.78 us per loop

In [43]: %timeit set.intersection(*(set(val) for val in query_dict.values()))
100000 loops, best of 3: 3.28 us per loop
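The session above is from Python 2, where `reduce` is a builtin and sets print as `set([...])`. As a rough sketch for Python 3, the two approaches could be wrapped in the `intersect` function the question asks for. The name `intersect_reduce` and the choice to return an empty set for an empty dictionary are assumptions made here for illustration, not part of the original answer.

```python
from functools import reduce  # in Python 3, reduce lives in functools


def intersect(query_dict):
    """Return the set of documents common to every query in query_dict."""
    doc_sets = [set(docs) for docs in query_dict.values()]
    if not doc_sets:
        # Assumption: with no queries at all, report no matching documents.
        # Without this guard, set.intersection() with no arguments raises TypeError.
        return set()
    return set.intersection(*doc_sets)


def intersect_reduce(query_dict):
    """Same result, folding the sets pairwise with reduce (hypothetical name)."""
    doc_sets = [set(docs) for docs in query_dict.values()]
    if not doc_sets:
        # reduce() on an empty sequence with no initial value also raises TypeError.
        return set()
    return reduce(set.intersection, doc_sets)


query_dict = {'foo': ['doc_one.txt', 'doc_two.txt', 'doc_three.txt'],
              'bar': ['doc_one.txt', 'doc_two.txt'],
              'foobar': ['doc_two.txt']}

print(intersect(query_dict))         # {'doc_two.txt'}
print(intersect_reduce(query_dict))  # {'doc_two.txt'}
```

Both functions return a set; if a single file name or a list is wanted instead, something like `sorted(intersect(query_dict))` would convert the result.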

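  • Instead of `reduce`, `set.intersection(*(set(val) ... etc))` should also work. (2 upvotes)
  • Whether the difference matters, compared with the time it takes to type the extra characters, depends on how many times you perform the operation. (2 upvotes)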