Get the most relevant word from 'enchant suggest()' in Python (spell checking)

Mag*_*gie 6 python spell-checking

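I want to get the most relevant word from enchant suggest(). Is there a better way to do this? I feel my function is inefficient when checking a large number of words, in the range of 100k or more.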

The problem with enchant suggest():

>>> import enchant
>>> d = enchant.Dict("en_US")  # assumption: the original post does not show how d was created
>>> d.suggest("prfomnc")
['prominence', 'performance', 'preform', 'Provence', 'preferment', 'proforma']

My function to pick the appropriate word from the set of suggested words:

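import enchant, difflib

d = enchant.Dict("en_US")  # assumption: the dictionary choice is not shown in the original post
word = "prfomnc"
ratios, best_ratio = {}, 0  # renamed from dict/max to avoid shadowing the built-ins
for suggestion in set(d.suggest(word)):
    ratio = difflib.SequenceMatcher(None, word, suggestion).ratio()
    ratios[ratio] = suggestion  # a later suggestion with the same ratio overwrites this entry
    if ratio > best_ratio:
        best_ratio = ratio

print(ratios[best_ratio])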

Result: performance

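Update:

If several suggestions share the same difflib ratio() value (that is, the same key), I use a dictionary that holds multiple values per key, as described here: http://code.activestate.com/recipes/440502-a-dictionary-with-multiple-values-for-each-key/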
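
For illustration, here is a minimal sketch of that multi-value idea using the standard library's collections.defaultdict rather than the linked recipe. The helper name group_by_ratio is hypothetical, and the suggestion list is hard-coded from the d.suggest("prfomnc") output above so the snippet runs without enchant installed.

```python
from collections import defaultdict
import difflib


def group_by_ratio(word, candidates):
    """Map each similarity ratio to the list of candidates that scored it."""
    groups = defaultdict(list)
    for candidate in candidates:
        ratio = difflib.SequenceMatcher(None, word, candidate).ratio()
        groups[ratio].append(candidate)  # ties accumulate instead of overwriting
    return groups


# Suggestions copied from the d.suggest("prfomnc") output shown earlier.
suggestions = ['prominence', 'performance', 'preform',
               'Provence', 'preferment', 'proforma']

groups = group_by_ratio("prfomnc", suggestions)
best = groups[max(groups)]  # every candidate that shares the highest ratio
print(best)
```

If two suggestions tie for the top ratio, best contains both of them, in the order they were seen.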

Joh*_*ooy 2

You don't actually need to keep a dict if you are only interested in the best match(es).

>>> word="prfomnc"
>>> best_words = []
>>> best_ratio = 0
>>> a = set(d.suggest(word))
>>> for b in a:
...   tmp = difflib.SequenceMatcher(None, word, b).ratio()
...   if tmp > best_ratio:
...     best_words = [b]
...     best_ratio = tmp
...   elif tmp == best_ratio:
...     best_words.append(b)
... 
>>> best_words
['performance']
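
As a possible shortcut, difflib also ships get_close_matches(), which ranks candidates by the same SequenceMatcher ratio and returns the top n. The sketch below is not part of the answer above: the wrapper name best_match is hypothetical, and the candidate list is hard-coded from the question so the snippet does not depend on enchant.

```python
import difflib

# Suggestions copied from the d.suggest("prfomnc") output in the question.
suggestions = ['prominence', 'performance', 'preform',
               'Provence', 'preferment', 'proforma']


def best_match(word, candidates):
    """Return the single closest candidate, or None if there are none."""
    # cutoff=0.0 disables the default 0.6 similarity threshold, so the
    # closest candidate is returned even when it is a poor match.
    matches = difflib.get_close_matches(word, candidates, n=1, cutoff=0.0)
    return matches[0] if matches else None


print(best_match("prfomnc", suggestions))
```

Note that with n=1 this returns only one word even when several candidates tie for the best ratio, so the explicit loop above is still the way to go if you need all tied matches.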