When should I use each?
Also, does NLTK's lemmatization depend on part of speech? Wouldn't it be more accurate if it did?
I want to extract words like this:
a dog ==> dog
some dogs ==> dog
dogmatic ==> None
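One way to read the mapping above is "match the word as a whole token, with an optional plural s". A minimal sketch using only the standard `re` module follows; the helper name `extract_word` is hypothetical, and treating a bare trailing "s" as the only plural form is an assumption.

```python
import re

def extract_word(text, word):
    """Return `word` if it occurs in `text` as a whole token (optionally plural), else None.

    Hypothetical helper for illustration; only a trailing "s" is treated as a plural.
    """
    # \b anchors the match at word boundaries, so "dogmatic" and "hotdogs" do not match "dog".
    pattern = re.compile(r'\b' + re.escape(word) + r's?\b', re.IGNORECASE)
    return word if pattern.search(text) else None

for phrase in ['a dog', 'some dogs', 'dogmatic']:
    print(phrase, '==>', extract_word(phrase, 'dog'))
```

The word boundary on both sides is what rejects "dogmatic"; dropping the trailing `\b` would let it through.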
There is a similar question: Extract substring from text in a pandas DataFrame as new column
But it does not meet my requirements.
From this dataframe:
df = pd.DataFrame({'comment': ['A likes cat', 'B likes Cats',
'C likes cats.', 'D likes cat!',
'E is educated',
'F is catholic',
'G likes cat, he has three of them.',
'H likes cat; he has four of them.',
'I adore !!cats!!',
'x is dogmatic',
'x is eating hotdogs.',
'x likes dogs, he has three of them.',
'x likes dogs; he has four of them.', …
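The expected output is not shown in this excerpt, so the following is only a sketch of one possible approach with `Series.str.extract`. It uses a handful of the comments listed above; the column name `animal` and the pattern (whole word "cat" or "dog" with an optional plural "s") are assumptions.

```python
import re
import pandas as pd

# A few of the comments from the question; the full frame is truncated in the source.
df = pd.DataFrame({'comment': ['A likes cat', 'B likes Cats',
                               'E is educated',
                               'x is dogmatic',
                               'x is eating hotdogs.',
                               'x likes dogs, he has three of them.']})

# Assumed pattern: whole-word "cat" or "dog", optional plural "s", case-insensitive.
pattern = r'\b(cat|dog)s?\b'

# expand=False returns a Series for a single capture group; rows with no match become NaN.
# 'animal' is a hypothetical column name.
df['animal'] = (df['comment']
                .str.extract(pattern, flags=re.IGNORECASE, expand=False)
                .str.lower())
print(df)
```

Rows such as "E is educated" and "x is dogmatic" stay empty because the word boundaries prevent a match inside a longer word.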