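I want to replace nouns in a sentence with pronouns. I will use this to create a dataset for an NLP task. For example, if my sentence is:
"Jack and Ryan are friends. Jack is also friends with Michelle."
Then I want to replace the second Jack (italic and bold) with "He". I have already done POS tagging to find the nouns in my sentences, but I don't know how to proceed from here. If I have a list of all possible pronouns that can be used, is there a corpus or system that can tell me the most suitable pronoun for a given word?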
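One possible starting point, sketched with the standard library only: replace a name with a pronoun when it opens a sentence and has already been mentioned in an earlier sentence. The `PRONOUNS` table and the function name `replace_repeated_subjects` are hypothetical and exist only for this illustration; a real system would need a proper gender/number resource and a coreference model, and the result can be ambiguous (here "He" could also be read as Ryan).

```python
import re

# Hypothetical lookup table (an assumption for this sketch, not a real resource).
PRONOUNS = {"Jack": "He", "Ryan": "He", "Michelle": "She"}


def replace_repeated_subjects(text, pronouns=PRONOUNS):
    """Replace a sentence-initial name with its pronoun if the name
    was already mentioned in an earlier sentence."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    seen = set()
    result = []
    for sentence in sentences:
        # Collect the known names mentioned in the original sentence.
        mentioned = {
            name for name in pronouns
            if re.search(r"\b" + re.escape(name) + r"\b", sentence)
        }
        words = sentence.split()
        first = words[0] if words else ""
        # Only a bare name in first position is replaced (not "Jack," or "Jack's").
        if first in pronouns and first in seen:
            sentence = pronouns[first] + sentence[len(first):]
        seen |= mentioned
        result.append(sentence)
    return " ".join(result)


print(replace_repeated_subjects(
    "Jack and Ryan are friends. Jack is also friends with Michelle."
))
```

The first mention of each name is left alone, so a sentence never starts with a pronoun that has no antecedent.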
I have been trying to use the library neuralcoref: state-of-the-art coreference resolution based on neural nets and spaCy. I am using Ubuntu 16.04, Python 3.7.3 in conda 1.9.7, and spaCy 2.2.4.
My code (from https://spacy.io/universe/project/neuralcoref):
import spacy
import neuralcoref
nlp = spacy.load('en_core_web_sm')
neuralcoref.add_to_pipe(nlp)
doc1 = nlp('My sister has a dog. She loves him.')
print(doc1._.coref_clusters)
doc2 = nlp('Angela lives in Boston. She is quite happy in that city.')
for ent in doc2.ents:
    print(ent._.coref_cluster)
I get this error:
/home/daniel/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: spacy.morphology.Morphology size changed, may indicate binary incompatibility. Expected 104 from C header, got 112 from PyObject
return f(*args, **kwds)
/home/daniel/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: spacy.vocab.Vocab size …
I want to build a rough ("noisy") resolution such that, given a personal pronoun, the pronoun is replaced by the previous (nearest) person.
For example:
Alex is looking at buying a U.K. startup for $1 billion. He is very confident that this is going to happen. Sussan is also in the same situation. However, she has lost hope.
The output is:
Alex is looking at buying a U.K. startup for $1 billion. Alex is very confident that this is going to happen. Sussan is also in the same situation. However, Sussan has lost hope.
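The behaviour in this example can be approximated without any NLP library, assuming the set of person names is already known (for instance from a named-entity recognizer). The sketch below walks through the tokens, remembers the most recent person, and substitutes it for "he" or "she". `PERSON_NAMES`, `SUBJECT_PRONOUNS`, and `resolve_to_nearest_person` are hypothetical names for this illustration. Object and possessive pronouns ("him", "her", "his") are deliberately left untouched, because they would need a different surface form.

```python
import re

# Assumed inputs for this sketch: names would normally come from an NER step.
PERSON_NAMES = {"Alex", "Sussan"}
SUBJECT_PRONOUNS = {"he", "she"}


def resolve_to_nearest_person(text, names=PERSON_NAMES):
    """Replace each subject pronoun with the nearest preceding person name."""
    # Alternate runs of word and non-word characters so that joining
    # the tokens reproduces the original spacing and punctuation.
    tokens = re.findall(r"\w+|\W+", text)
    last_person = None
    out = []
    for tok in tokens:
        if tok in names:
            last_person = tok
            out.append(tok)
        elif tok.lower() in SUBJECT_PRONOUNS and last_person is not None:
            out.append(last_person)
        else:
            out.append(tok)
    return "".join(out)


text = (
    "Alex is looking at buying a U.K. startup for $1 billion. "
    "He is very confident that this is going to happen. "
    "Sussan is also in the same situation. However, she has lost hope."
)
print(resolve_to_nearest_person(text))
```

Because the heuristic ignores gender and syntax, it will pick the wrong antecedent whenever another person is mentioned between the pronoun and its true referent.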
Another example:
Peter is a friend of Gates. But Gates does not …