How do I use the Python interface to Stanford NER (Named Entity Recognizer)?

Vin*_*wad 14 nlp named-entity-recognition stanford-nlp python-2.7

I want to use Stanford NER from Python through the pyner library. Here is a basic code snippet:

import ner 
tagger = ner.HttpNER(host='localhost', port=80)
tagger.get_entities("University of California is located in California, United States")

When I run it in my local Python console (IDLE), it should give me output like this:

  {'LOCATION': ['California', 'United States'],
 'ORGANIZATION': ['University of California']}

But when I execute it, it shows empty brackets. I am actually new to all of this.

Rya*_*ill 27

I was able to run the stanford-ner server in socket mode with:

java -mx1000m -cp stanford-ner.jar edu.stanford.nlp.ie.NERServer \
    -loadClassifier classifiers/english.muc.7class.distsim.crf.ser.gz \
    -port 8080 -outputFormat inlineXML

and received the following output on the command line:

Loading classifier from 
/Users/roneill/stanford-ner-2012-11-11/classifiers/english.muc.7class.distsim.crf.ser.gz 
... done [1.7 sec].
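Before connecting from Python, it can help to confirm that something is actually listening on the port the server was started with (8080 in the command above). This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper name and not part of pyner:

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Nothing is listening, or the host could not be reached in time.
        return False
    conn.close()
    return True
```

If `port_is_open('localhost', 8080)` returns False, the tagger has nothing to talk to, which may be one reason for an empty result.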

Then, in the Python REPL:

Python 2.7.2 (default, Jun 20 2012, 16:23:33) 
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ner
>>> tagger = ner.SocketNER(host='localhost', port=8080)
>>> tagger.get_entities("University of California is located in California, United States")
{'ORGANIZATION': ['University of California'], 'LOCATION': ['California', 'United States']}

  • For me, the problem was not including `-outputFormat inlineXML` (I didn't see anything about it in the pyner README examples). Thank you very much. (4 upvotes)
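
To see why `-outputFormat inlineXML` matters: in that format the server is expected to wrap each entity in a tag named after its class, and a client can turn those tags into a dictionary. The sketch below is not pyner's actual code. `parse_inline_xml` and the sample string are assumptions made up for illustration:

```python
import re
from collections import defaultdict

# Matches <TAG>text</TAG>, where the closing tag repeats the opening one.
INLINE_TAG = re.compile(r"<([A-Z]+)>(.+?)</\1>")


def parse_inline_xml(tagged_text):
    """Group the tagged spans of an inline-XML string by their tag name."""
    entities = defaultdict(list)
    for label, text in INLINE_TAG.findall(tagged_text):
        entities[label].append(text)
    return dict(entities)


# Hypothetical server response for the sentence used in the question.
sample = (
    "<ORGANIZATION>University of California</ORGANIZATION> is located in "
    "<LOCATION>California</LOCATION>, <LOCATION>United States</LOCATION>"
)
print(parse_inline_xml(sample))
```

If the server replies in a different output format, a parser like this finds no tags and returns an empty dictionary, which would match the empty brackets described in the question.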