How to efficiently look up a dictionary value based on another value in a list of dictionaries

Question

I have a very large (~100k) list of dictionaries:

[{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

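Given a token ID (e.g. 1989), how can I efficiently find the corresponding score? I have to do this many times for each list (I have several such large lists, and for each list I have several token IDs).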

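I am currently iterating over every dictionary in the list and checking whether its token matches my input ID; if it does, I take its score. But this is rather slow.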
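For reference, the linear scan described above might look roughly like the sketch below. The function name find_score_linear is hypothetical and does not come from the original post; every call walks the list from the start, so each lookup costs O(n).

```python
def find_score_linear(records, token_id):
    """Return the score of the first record whose 'token' equals token_id, or None."""
    for record in records:
        if record['token'] == token_id:
            return record['score']
    return None  # no record carries this token


records = [
    {'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'},
    {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'},
    {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'},
]

# One full pass over the list per token ID.
score = find_score_linear(records, 1989)
```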

Answer 1

Ped*_*aia 5

Since you have to search many times, you could build a dictionary keyed by token:

a = [{'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'}, {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'}, {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'}]

my_dict = {i['token']: i for i in a}

Building the dict takes some time (one pass over the list), but every lookup afterwards is O(1) on average.
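As a minimal sketch of how the index might then be used: the name score_by_token and the token ID 42 are illustrative assumptions, not part of the original answer. Note that if the same token appeared twice in a list, the later record would overwrite the earlier one in the dict.

```python
a = [
    {'sequence': 'read the rest of this note', 'score': 0.22612378001213074, 'token': 3805, 'token_str': 'note'},
    {'sequence': 'read the rest of this page', 'score': 0.11293990164995193, 'token': 3674, 'token_str': 'page'},
    {'sequence': 'read the rest of this week', 'score': 0.06504543870687485, 'token': 1989, 'token_str': 'week'},
]

# Build the index once per list: a single O(n) pass.
my_dict = {i['token']: i for i in a}

# Each lookup is then one hash-table access instead of a scan.
score = my_dict[1989]['score']

# dict.get returns None instead of raising KeyError for an unknown token.
missing = my_dict.get(42)
missing_score = missing['score'] if missing is not None else None

# If only the score is ever needed, the index can map token -> score directly.
score_by_token = {i['token']: i['score'] for i in a}
```

With several large lists, one such index would be built per list and reused for all token IDs queried against that list.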

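Building a second dict may look inefficient, but Python handles memory efficiently: the new dict stores references to the dicts already constructed in the list instead of creating copies of them. You can confirm this with: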

>>> a[0] is my_dict[3805]
True

So you can think of it as creating an alias for each element of the list.
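One consequence of this aliasing, sketched below with shortened, made-up records (the scores here are placeholders, not the values from the question): a change made through the index is visible through the list, but replacing a list element with a new dict leaves the index pointing at the old object.

```python
# Shortened illustrative records; the values are placeholders.
a = [
    {'token': 3805, 'score': 0.2},
    {'token': 1989, 'score': 0.06},
]
my_dict = {i['token']: i for i in a}

# Mutating a record through the index changes the very same object in the list.
my_dict[1989]['score'] = 1.0
changed = a[1]['score']

# Rebinding a list slot to a new dict does NOT update the index:
# my_dict[3805] still refers to the original object.
a[0] = {'token': 3805, 'score': 0.5}
stale = my_dict[3805]['score']
```

If the list is modified after the index is built, the index would need to be rebuilt or updated alongside it.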

Archived:	4 years, 1 month ago
Views:	887
Last recorded:	4 years, 1 month ago