Mag*_*gie 6 python spell-checking
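I want to get the most relevant word from enchant's suggest(). Is there a better way to do this? My function feels inefficient when checking a large number of words, on the order of 100k or more.
The problem with enchant's suggest():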
>>> import enchant
>>> d = enchant.Dict("en_US")  # assumed; the original omits how d was created
>>> d.suggest("prfomnc")
['prominence', 'performance', 'preform', 'Provence', 'preferment', 'proforma']
My function to pick the appropriate word from the set of suggested words:
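import difflib
import enchant

d = enchant.Dict("en_US")  # assumed; the original omits how d was created
word = "prfomnc"
ratios, best = {}, 0  # renamed from dict/max to avoid shadowing the builtins
for b in set(d.suggest(word)):
    # similarity between the misspelled word and the suggestion, between 0 and 1
    tmp = difflib.SequenceMatcher(None, word, b).ratio()
    ratios[tmp] = b  # a later suggestion with the same ratio overwrites this entry
    if tmp > best:
        best = tmp
print(ratios[best])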
Result: performance
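Update:
If several suggestions end up under the same key, meaning they have the same difflib ratio() value, I use a dictionary that holds multiple values per key, as described here: http://code.activestate.com/recipes/440502-a-dictionary-with-multiple-values-for-each-key/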
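The multi-value dictionary mentioned in the update could be sketched with `collections.defaultdict` from the standard library rather than the linked recipe. This is only an illustration: the helper name `group_by_ratio` is hypothetical, and the suggestion list is copied from the `d.suggest("prfomnc")` output shown earlier so that the snippet runs without enchant installed.

```python
import difflib
from collections import defaultdict


def group_by_ratio(word, candidates):
    """Map each similarity ratio to the list of candidates that produced it."""
    groups = defaultdict(list)
    for cand in candidates:
        ratio = difflib.SequenceMatcher(None, word, cand).ratio()
        groups[ratio].append(cand)  # candidates with equal ratios share one key
    return groups


# Hard-coded copy of the suggestions from the question (stand-in for d.suggest(word)).
suggestions = ['prominence', 'performance', 'preform',
               'Provence', 'preferment', 'proforma']

groups = group_by_ratio("prfomnc", suggestions)
best = groups[max(groups)]  # every word that shares the highest ratio
print(best)
```

With this layout, ties are kept side by side in one list instead of overwriting each other.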
If you are only interested in the best match(es), you don't actually need to keep a dict:
>>> word = "prfomnc"
>>> best_words = []
>>> best_ratio = 0
>>> a = set(d.suggest(word))
>>> for b in a:
...     tmp = difflib.SequenceMatcher(None, word, b).ratio()
...     if tmp > best_ratio:
...         best_words = [b]
...         best_ratio = tmp
...     elif tmp == best_ratio:
...         best_words.append(b)
...
>>> best_words
['performance']
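For the 100k-word case raised in the question, one possible way to package the loop above is as a function plus a memoising wrapper, so that a word appearing many times in the input is only looked up once. This is a sketch under assumptions, not a measured optimisation: `best_matches`, `make_corrector` and `fake_suggest` are hypothetical names, caching only helps if the input actually contains repeated words, and `fake_suggest` merely stands in for `d.suggest` so the snippet runs without enchant.

```python
import difflib
from functools import lru_cache


def best_matches(word, candidates):
    """Return every candidate that ties for the highest ratio against word."""
    best_words, best_ratio = [], 0.0
    for cand in candidates:
        ratio = difflib.SequenceMatcher(None, word, cand).ratio()
        if ratio > best_ratio:
            best_words, best_ratio = [cand], ratio
        elif ratio == best_ratio:
            best_words.append(cand)
    return best_words


def make_corrector(suggest):
    """Wrap a suggest(word) callable; results are cached per distinct word."""
    @lru_cache(maxsize=None)
    def correct(word):
        # A tuple is returned so the cached value cannot be mutated by callers.
        return tuple(best_matches(word, set(suggest(word))))
    return correct


calls = []  # records how often the underlying suggest function is invoked


def fake_suggest(word):
    """Hypothetical stand-in for d.suggest, replaying the list from the question."""
    calls.append(word)
    return ['prominence', 'performance', 'preform',
            'Provence', 'preferment', 'proforma']


correct = make_corrector(fake_suggest)
for w in ["prfomnc", "prfomnc", "prfomnc"]:
    result = correct(w)
print(result, len(calls))
```

With a real dictionary, the wrapper would be built as `make_corrector(d.suggest)`; whether the cache pays off depends on how many duplicates the word list contains.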