Is it possible to use nltk to change words such as running, helps, cooks, finds and happy into run, help, cook, find and happy?
>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> ls = ['running', 'helping', 'cooks', 'finds']
>>> [wnl.lemmatize(i) for i in ls]
['running', 'helping', u'cook', u'find']
>>> ls = [('running', 'v'), ('helping', 'v'), ('cooks', 'v'), ('finds','v')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
[u'run', u'help', u'cook', u'find']
>>> ls = [('running', 'n'), ('helping', 'n'), ('cooks', 'n'), ('finds','n')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
['running', 'helping', u'cook', u'find']
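The session above shows that the result depends on the `pos` argument: without it the lemmatizer treats every word as a noun, so 'running' and 'helping' come back unchanged. If you do not want to supply the part of speech by hand, one option is to get it from a tagger. The following is only a rough sketch under a few assumptions: `penn_to_wordnet` and `lemmatize_tokens` are names made up for illustration, the mapping from Penn Treebank tags to WordNet letters is a common simplification, and the tagger and WordNet data are assumed to have been fetched with `nltk.download()` already.

```python
def penn_to_wordnet(tag):
    """Map a Penn Treebank tag (e.g. 'VBG') to a WordNet POS letter.

    Hypothetical helper: a simplified mapping, not part of nltk itself.
    """
    if tag.startswith('JJ'):
        return 'a'  # adjective
    if tag.startswith('VB'):
        return 'v'  # verb
    if tag.startswith('RB'):
        return 'r'  # adverb
    return 'n'      # everything else falls back to noun, the lemmatizer's default


def lemmatize_tokens(tokens):
    """Tag the tokens, then lemmatize each one with its mapped POS.

    Assumes nltk is installed and its tagger and WordNet data are available.
    """
    from nltk import pos_tag
    from nltk.stem import WordNetLemmatizer

    wnl = WordNetLemmatizer()
    return [wnl.lemmatize(word, penn_to_wordnet(tag))
            for word, tag in pos_tag(tokens)]
```

Taggers work from context, so calling `lemmatize_tokens` on the tokens of a full sentence is likely to give better tags than calling it on a list of isolated words.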
nltk includes several stemming algorithms. The Lancaster stemmer looks like it would work for you.
>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('happily')
'happy'
>>> st.stem('cooks')
'cook'
>>> st.stem('helping')
'help'
>>> st.stem('running')
'run'
>>> st.stem('finds')
'find'
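Since nltk ships more than one stemmer, it may be worth running your own word list through a few of them before settling on Lancaster, which tends to cut words more aggressively than the others. Here is a small sketch for doing that side by side. `compare_stemmers` and `default_stemmers` are illustrative names, not nltk functions; the only thing assumed about a stemmer is that it has a `stem(word)` method, which the nltk stemmer classes provide.

```python
def compare_stemmers(words, stemmers):
    """Return {word: {stemmer_name: stem}} for every word and stemmer.

    `stemmers` maps a label to any object with a .stem(word) method.
    """
    return {word: {name: s.stem(word) for name, s in stemmers.items()}
            for word in words}


def default_stemmers():
    """Build three nltk stemmers to compare; assumes nltk is installed."""
    from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

    return {
        'porter': PorterStemmer(),
        'snowball': SnowballStemmer('english'),
        'lancaster': LancasterStemmer(),
    }
```

Calling `compare_stemmers(['running', 'helping', 'cooks', 'finds'], default_stemmers())` gives one row per word, which makes it easy to spot where the algorithms disagree. Keep in mind that a stemmer only strips suffixes by rule, so unlike the lemmatizer it needs no part of speech but can return strings that are not real words.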