Posts by Shi*_*rma

How to fix the ValueError raised by nlp.add_pipe(LanguageDetector(), name='language_detector', last=True) with spaCy 3

Every time I run the following code, which I found on Kaggle, I get a ValueError. This is caused by the new spaCy v3:

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector

nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe(LanguageDetector(), name='language_detector', last=True)


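ValueError: [E966] nlp.add_pipe now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_langdetect.spacy_langdetect.LanguageDetector object at 0x00000216BB4C8D30> (name: 'language_detector').

- If you created your component with nlp.create_pipe('name'): remove nlp.create_pipe and call nlp.add_pipe('name') instead.
- If you passed in a component like TextCategorizer(): call nlp.add_pipe with the string name instead, e.g. nlp.add_pipe('textcat').
- If you're using a custom component: add the decorator @Language.component (for function components) or @Language.factory (for class components / factories) to your custom component and assign it a name, e.g. @Language.component('your_name'). You can then run nlp.add_pipe('your_name') to add it to the pipeline.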
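The last hint in the error message is the one that applies to a class-based component such as LanguageDetector. Below is a minimal sketch of that registration pattern, not a definitive fix. To stay runnable without extra models it makes some assumptions: DummyDetector is a hypothetical stand-in for spacy_langdetect.LanguageDetector, the factory name "language_detector_demo" is made up for illustration, and spacy.blank("en") replaces en_core_sci_lg.

```python
import spacy
from spacy.language import Language


class DummyDetector:
    """Hypothetical stand-in for spacy_langdetect.LanguageDetector."""

    def __call__(self, doc):
        # A real detector would store its result here; this sketch
        # only records a placeholder value.
        doc.user_data["language"] = "unknown"
        return doc


# Register a factory under a string name. spaCy 3 requires the factory
# function to accept the arguments `nlp` and `name`.
@Language.factory("language_detector_demo")
def create_language_detector(nlp, name):
    return DummyDetector()


# Assumption: a blank English pipeline instead of en_core_sci_lg.
nlp = spacy.blank("en")

# Pass the registered string name, not the component instance.
nlp.add_pipe("language_detector_demo", last=True)

doc = nlp("This is some English text.")
print(nlp.pipe_names)
```

With the real package, the factory body would presumably return LanguageDetector() instead of the stand-in.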

I have these versions installed:

python_version : 3.8.5
spacy.version  : '3.0.3'
scispacy.version  :  '0.4.0'
en_core_sci_lg.version  :  '0.4.0'


python language-detection spacy

Shi*_*rma

2022 10-03

9
votes

2
answers

10k
views

Warning: [W108] The rule-based lemmatizer did not find POS annotation for the token 'This'

What is this message about, and how do I get rid of the warning? Thanks.

import scispacy
import spacy
import en_core_sci_lg
from spacy_langdetect import LanguageDetector
from spacy.language import Language
from spacy.tokens import Doc

def create_lang_detector(nlp, name):
    return LanguageDetector()

Language.factory("language_detector", func=create_lang_detector)
nlp = en_core_sci_lg.load(disable=["tagger", "ner"])
nlp.max_length = 2000000
nlp.add_pipe('language_detector', last=True)

doc = nlp('This is some English text. Das ist ein Haus. This is a house.')


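Warnings:

[W108] The rule-based lemmatizer did not find POS annotation for the token 'This'. Check that your pipeline includes components that assign token.pos, typically 'tagger'+'attribute_ruler' or 'morphologizer'.

[W108] The rule-based lemmatizer did not find POS annotation for the token 'is'. Check that your pipeline includes components that assign token.pos, typically 'tagger'+'attribute_ruler' or 'morphologizer'.

[W108] The rule-based lemmatizer did not find POS annotation for the token 'some'. Check that your pipeline includes components that assign token.pos, typically 'tagger'+'attribute_ruler' or 'morphologizer'.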
. . …
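The warning text points at the likely cause: the pipeline still runs a rule-based lemmatizer, but the tagger that would assign token.pos has been disabled. One option, assuming the model's lemmatizer component is named "lemmatizer", would be to disable it too or to leave the tagger enabled. If the lemmas are not needed and only the noise is the problem, the messages can also be filtered with Python's standard warnings module. The sketch below shows just that filtering. The helper name silence_w108 is hypothetical, and the warning is simulated with warnings.warn so the snippet runs without spaCy.

```python
import warnings


def silence_w108():
    """Hypothetical helper: ignore only warnings whose text starts with [W108]."""
    # `message` is a regular expression matched against the start of the
    # warning text, so the square brackets have to be escaped.
    warnings.filterwarnings("ignore", message=r"\[W108\]")


# Demonstration with simulated warnings (no spaCy required).
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    silence_w108()
    warnings.warn("[W108] The rule-based lemmatizer did not find POS annotation")
    warnings.warn("[W999] some unrelated warning")

# Only the unrelated warning is recorded; the W108 message is filtered out.
print([str(w.message) for w in caught])
```

In a real script, silence_w108() would be called once before running nlp(...). Filtering hides the message but does not change the lemmas, which are still produced without POS information.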

python spacy

Shi*_*rma

lucky-day

4
votes

1
answer

1613
views
