Jua*_*rez 3 python nlp porter-stemmer nltk
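I want to get the gerund form of a string. I have not found a direct way to call a library to get the gerund.
I applied the spelling rules for adding "ing" to a word, but exceptions to those rules gave me some wrong results. So I check each generated gerund against the CMU word list to make sure it is a real word. The code is as follows: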
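import cmudict

ing = 'ing'
vowels = "aeiou"
consonants = "bcdfghjklmnpqrstvwxyz"
words = ['lead', 'take', 'hit', 'begin', 'stop', 'refer', 'visit']
cmu_dict = cmudict.dict()   # word -> list of pronunciations (phoneme lists)
cmu_words = set(cmu_dict)   # set for fast membership tests

def count_syllables(word):
    # Helper not shown in the original post; assumed here to count the vowel
    # phonemes (those ending in a stress digit) of the first CMU pronunciation.
    prons = cmu_dict.get(word)
    return sum(p[-1].isdigit() for p in prons[0]) if prons else 0

g_w = []
for word in words:
    if word[-1] == 'e':
        # drop the final "e": take -> taking
        if word[:-1] + ing in cmu_words:
            g_w.append(word[:-1] + ing)
    elif count_syllables(word) == 1 and word[-2] in vowels and word[-1] in consonants:
        if len(word) > 2 and word[-3] in vowels:
            # two vowels before the consonant, no doubling: lead -> leading
            if word + ing in cmu_words:
                g_w.append(word + ing)
        else:
            # double the final consonant: hit -> hitting
            if word + word[-1] + ing in cmu_words:
                g_w.append(word + word[-1] + ing)
    elif count_syllables(word) > 1 and word[-2] in vowels and word[-1] in consonants:
        # try the doubled form first (begin -> beginning); fall back to the
        # plain form when the doubled one is not a known word (visit -> visiting)
        if word + word[-1] + ing in cmu_words:
            g_w.append(word + word[-1] + ing)
        elif word + ing in cmu_words:
            g_w.append(word + ing)
    else:
        if word + ing in cmu_words:
            g_w.append(word + ing)
print(g_w)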
The rules are as follows:
When a verb ends in "e", drop the "e" and add "-ing". For example: "take + ing = taking".
When a one-syllable verb ends in vowel + consonant, double the final consonant and add "-ing". For example: "hit + ing = hitting".
When a verb ends in vowel + consonant with stress on the final syllable, double the consonant and add "-ing". For example: "begin + ing = beginning".
Do not double the consonant of words with more than one syllable if the stress is not on the final syllable. For example: "visit + ing = visiting".
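The four rules above can be sketched as a single function. This is only an illustration under stated assumptions: the name `gerund` and its parameters `syllables` and `final_stress` are hypothetical, and the caller has to supply the syllable count and stress (for example from a pronunciation dictionary). Two details are not in the listed rules: the "two vowels before the consonant" check comes from the question's code ("lead" becomes "leading"), and the exclusion of a final w, x or y is an extra assumption. Exceptions such as "see" or "be" are not handled.

```python
VOWELS = "aeiou"

def gerund(verb, syllables, final_stress):
    """Apply the spelling rules literally to one lowercase verb.

    syllables    -- number of syllables, supplied by the caller
    final_stress -- True if the stress falls on the last syllable
    """
    # Rule 1: a final "e" is dropped before "-ing".
    if verb.endswith("e"):
        return verb[:-1] + "ing"

    # Does the verb end in vowel + consonant?
    ends_vowel_consonant = (
        len(verb) >= 2
        and verb[-2] in VOWELS
        and verb[-1].isalpha()
        and verb[-1] not in VOWELS
        and verb[-1] not in "wxy"   # assumption: these are never doubled
    )
    # Two vowels before the final consonant ("lead"): no doubling.
    double_vowel = len(verb) >= 3 and verb[-3] in VOWELS

    # Rules 2 and 3: double the consonant for one-syllable verbs and for
    # longer verbs stressed on the final syllable. Rule 4 is the fall-through.
    if ends_vowel_consonant and not double_vowel and (syllables == 1 or final_stress):
        return verb + verb[-1] + "ing"
    return verb + "ing"

# (verb, syllable count, stress on final syllable)
examples = [
    ("lead", 1, True), ("take", 1, True), ("hit", 1, True),
    ("begin", 2, True), ("stop", 1, True), ("refer", 2, True),
    ("visit", 2, False),
]
for verb, n, stressed in examples:
    print(verb, "->", gerund(verb, n, stressed))
```

Keeping the stress information as a parameter separates the spelling rules from the pronunciation lookup, so the rules can be checked without any dictionary installed.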
Is there a more efficient way to get the gerund of a string, if one exists?
Thanks
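Maybe this is what you are looking for. The library is called pyinflect.
A python module for word inflections that works as a spaCy extension. To use it standalone, import the methods getAllInflections and/or getInflection and call them directly. The getInflection method takes a lemma and a Penn Treebank tag and returns a tuple of the specific inflection(s) associated with it.
There are several tags available for getting inflections, including the "VBG" tag (verb, gerund) that you are looking for.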
pos_type = 'A'
* JJ Adjective
* JJR Adjective, comparative
* JJS Adjective, superlative
* RB Adverb
* RBR Adverb, comparative
* RBS Adverb, superlative
pos_type = 'N'
* NN Noun, singular or mass
* NNS Noun, plural
pos_type = 'V'
* VB Verb, base form
* VBD Verb, past tense
* VBG Verb, gerund or present participle
* VBN Verb, past participle
* VBP Verb, non-3rd person singular present
* VBZ Verb, 3rd person singular present
* MD Modal
Here is a sample implementation.
#!pip install pyinflect
from pyinflect import getInflection
words = ['lead','take','hit','begin','stop','refer','visit']
[getInflection(i, 'VBG') for i in words]
Output:
[('leading',),
('taking',),
('hitting',),
('beginning',),
('stopping', 'stoping'),
('referring',),
('visiting',)]
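Each result is a tuple and can hold more than one spelling, as with ('stopping', 'stoping') above. If a single gerund per verb is needed, one option is to keep the first entry of each tuple. The sketch below works on the results copied from the output above rather than calling the library, and it assumes the first entry is the preferred spelling, which matches this output but is not something the answer states as guaranteed.

```python
# Results copied from the output shown above (not recomputed here).
words = ['lead', 'take', 'hit', 'begin', 'stop', 'refer', 'visit']
inflections = [
    ('leading',), ('taking',), ('hitting',), ('beginning',),
    ('stopping', 'stoping'), ('referring',), ('visiting',),
]

# Keep the first spelling of each tuple; use None when a lookup gave nothing.
gerunds = {
    word: (forms[0] if forms else None)
    for word, forms in zip(words, inflections)
}
print(gerunds)
```

The `if forms else None` guard covers an empty or missing result, so a verb the lookup cannot inflect does not raise an IndexError.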
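Note: the author has also set up a more sophisticated, benchmarked library called LemmInflect, which does both lemmatization and inflection. If you want something more reliable than the library above, check it out. The syntax is almost the same as above.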