Posts by Pre*_*ige

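ValueError: nlp.add_pipe now takes the string name of the registered component factory, not a callable component

The following link shows how to add custom entity rules where an entity spans more than one token. The code to do so is below: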

import spacy
from spacy.pipeline import EntityRuler
nlp = spacy.load('en_core_web_sm', parse=True, tag=True, entity=True)

animal = ["cat", "dog", "artic fox"]
ruler = EntityRuler(nlp)
for a in animal:
    ruler.add_patterns([{"label": "animal", "pattern": a}])
nlp.add_pipe(ruler)


doc = nlp("There is no cat in the house and no artic fox in the basement")

with doc.retokenize() as retokenizer:
    for ent in doc.ents:
        retokenizer.merge(doc[ent.start:ent.end])


from spacy.matcher import Matcher
matcher = Matcher(nlp.vocab)
pattern =[{'lower': 'no'},{'ENT_TYPE': {'REGEX': 'animal', 'OP': '+'}}]
matcher.add('negated animal', None, pattern)
matches = matcher(doc)


for …
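For reference, a minimal sketch of how the ruler from the snippet above could be registered under spaCy v3, where `add_pipe` takes the factory name `"entity_ruler"` and returns the created component. Using `spacy.blank("en")` instead of `en_core_web_sm` is an assumption made only so the sketch runs without a model download; the `artic fox` spelling is kept from the question.

```python
import spacy

# Assumption: a blank English pipeline stands in for en_core_web_sm so that
# the sketch runs without a model download. It contains no statistical NER.
nlp = spacy.blank("en")

# spaCy v3: pass the registered factory name as a string. add_pipe creates
# the EntityRuler and returns it, so no EntityRuler(nlp) call is needed.
ruler = nlp.add_pipe("entity_ruler")

animals = ["cat", "dog", "artic fox"]
ruler.add_patterns([{"label": "animal", "pattern": a} for a in animals])

doc = nlp("There is no cat in the house and no artic fox in the basement")

# Collect (text, label) pairs for every entity the ruler found.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

In spaCy v3 the later `matcher.add` call would likewise need its patterns passed as a list, for example `matcher.add("negated animal", [pattern])`.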


python matcher spacy

Pre*_*ige


8 votes · 2 answers · 20k views

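How to make spaCy matching case-insensitive

How can I make spaCy case-insensitive?

Is there a code snippet or something else I should add? I cannot get entities that are not in upper case.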

import spacy
import pandas as pd

from spacy.pipeline import EntityRuler
nlp = spacy.load('en_core_web_sm', disable = ['ner'])
ruler = nlp.add_pipe("entity_ruler")


flowers = ["rose", "tulip", "african daisy"]
for f in flowers:
    ruler.add_patterns([{"label": "flower", "pattern": f}])
animals = ["cat", "dog", "artic fox"]
for a in animals:
    ruler.add_patterns([{"label": "animal", "pattern": a}])



result = {}
doc = nlp("CAT and Artic fox, plant african daisy")
for ent in doc.ents:
    result[ent.label_] = ent.text
df = pd.DataFrame([result])
print(df)
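One possible approach, sketched under assumptions: the `entity_ruler` factory accepts a `phrase_matcher_attr` setting, and setting it to `"LOWER"` makes string patterns compare on the lowercase form of each token. `spacy.blank("en")` is used here in place of `en_core_web_sm` only so the sketch runs without a model download, and the entities are collected into a list because the `result` dict in the question keeps only the last entity per label.

```python
import spacy

# Assumption: a blank pipeline replaces en_core_web_sm for a download-free run.
nlp = spacy.blank("en")

# phrase_matcher_attr="LOWER" makes string patterns match on the lowercase
# form of each token, so "CAT" and "Artic fox" match the lowercase patterns.
ruler = nlp.add_pipe("entity_ruler", config={"phrase_matcher_attr": "LOWER"})

flowers = ["rose", "tulip", "african daisy"]
animals = ["cat", "dog", "artic fox"]
ruler.add_patterns([{"label": "flower", "pattern": f} for f in flowers])
ruler.add_patterns([{"label": "animal", "pattern": a} for a in animals])

doc = nlp("CAT and Artic fox, plant african daisy")

# A list keeps every match, including several entities with the same label.
entities = [(ent.text, ent.label_) for ent in doc.ents]
print(entities)
```

The entity text keeps the casing of the input sentence; only the comparison is done on lowercase forms.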


python nlp pandas spacy

Pre*_*ige

2021-06-17

4 votes · 1 answer · 2353 views

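Select the first n and last n columns with pandas

I am trying to use pandas to select the first 2 columns and the last 2 columns of a data frame by index, and to save the result back to the same data frame.

Is there a way to do this in one step?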
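A minimal sketch of one way to do this in a single `iloc` call. The five-column frame and its column names are made up for illustration; the positional list `[0, 1, -2, -1]` is what picks the first two and last two columns.

```python
import pandas as pd

# Hypothetical example frame; the question does not show its data.
df = pd.DataFrame(
    {"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8], "e": [9, 10]}
)

n = 2
# Positions 0..n-1 select the first n columns, -n..-1 the last n columns.
positions = list(range(n)) + list(range(-n, 0))

# One step: select by position and assign back to the same name.
df = df.iloc[:, positions]
print(df)
```

If the frame has fewer than `2 * n` columns, the two ranges overlap and some columns appear twice, so that case would need separate handling.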

python dataframe pandas

Pre*_*ige


3 votes · 2 answers · 5594 views

Sum duplicate rows with a condition using pandas

I have a data frame that looks like this:

  Name  rent  sale
0    A   180     2
1    B     1     4
2    M    12     1
3    O    10     1
4    A   180     5
5    M     2    19


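I want to apply a condition when a name appears in more than one row:

If the duplicated rows hold the same value in a column, keep that value once without summing. Example: the duplicated A rows both have 180 in the rent column, so only one 180 is kept.
Otherwise, sum the values. Example: the duplicated A rows have different values 2 and 5 in the sale column, and the duplicated M rows have different values in both the rent and sale columns.

Expected output: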

  Name  rent  sale
0    A   180     7
1    B     1     4
2    M    14    20
3 …
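A sketch of one reading of the rule above: within each `Name` group, drop repeated values in a column and sum what remains. The frame is rebuilt from the table in the question; the name `out` is introduced here for illustration. The fourth row of the expected output is truncated in the source, so the `O` row below is simply what this rule produces, not a confirmed expected value.

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame(
    {
        "Name": ["A", "B", "M", "O", "A", "M"],
        "rent": [180, 1, 12, 10, 180, 2],
        "sale": [2, 4, 1, 1, 5, 19],
    }
)

# Per group and per column: keep each distinct value once, then sum.
# Identical values (rent 180 for A) collapse to one; differing values add up.
out = (
    df.groupby("Name", sort=False)[["rent", "sale"]]
    .agg(lambda s: s.drop_duplicates().sum())
    .reset_index()
)
print(out)
```

The question does not say what should happen when a group mixes repeated and distinct values, such as 5, 5 and 3. This sketch would return 8 in that case, which is an assumption rather than a stated requirement.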

python pandas

Pre*_*ige

2021-07-21

2 votes · 1 answer · 65 views

Tag statistics

python ×4

pandas ×3

spacy ×2

dataframe ×1

matcher ×1

nlp ×1
