"ImportError: cannot import name StanfordNERTagger" in NLTK

eag*_*arn 9 python nlp nltk

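I cannot import the Stanford NER tagger in NLTK. This is what I have done:

I downloaded the Java code from here and added an environment variable STANFORD_MODELS containing the path to the folder where the Java code is stored.

According to the information on the NLTK website, that should be enough. It says:

"Tagger models need to be downloaded from http://nlp.stanford.edu/software and the STANFORD_MODELS environment variable set (a colon-separated list of paths)."

Can anyone help me, please?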

EDIT: The downloaded folder is located at /Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20 and contains the following files:

LICENSE.txt         lib             ner.sh              stanford-ner-3.5.2-javadoc.jar
NERDemo.java            ner-gui.bat         sample-conll-file.txt       stanford-ner-3.5.2-sources.jar
README.txt          ner-gui.command         sample-w-time.txt       stanford-ner-3.5.2.jar
build.xml           ner-gui.sh          sample.ner.txt          stanford-ner.jar
classifiers         ner.bat             sample.txt

I then added an environment variable STANFORD_MODELS:

os.environ["STANFORD_MODELS"] = "/Users/-----------/Documents/JavaJuno/stanford-ner-2015-04-20"

Importing StanfordNERTagger from nltk.tag then produces an error:

ImportError                               Traceback (most recent call last)
<ipython-input-356-f4287e573edc> in <module>()
----> 1 from nltk.tag import StanfordNERTagger

ImportError: cannot import name StanfordNERTagger

In case it is relevant, this is what my nltk.tag folder contains:

__init__.py api.pyc     crf.py      hmm.pyc     senna.py    sequential.pyc  stanford.py tnt.pyc
__init__.pyc    brill.py    crf.pyc     hunpos.py   senna.pyc   simplify.py stanford.pyc    util.py
api.py      brill.pyc   hmm.py      hunpos.pyc  sequential.py   simplify.pyc    tnt.py      util.pyc

EDIT2: I have managed to import the NER tagger, using:

from nltk.tag.stanford import NERTagger
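
The class imports as `NERTagger` here, while other sources refer to it as `StanfordNERTagger`, so the name apparently differs between NLTK installations. A minimal sketch of a lookup that tolerates either name follows. The helper `find_ner_tagger_class` is hypothetical and not part of NLTK; it only checks which of the two names the installed `nltk.tag.stanford` module exposes.

```python
import importlib


def find_ner_tagger_class():
    """Return the Stanford NER tagger class under whichever name is available.

    Returns None if NLTK (or its stanford module) cannot be imported,
    or if neither class name is present.
    """
    try:
        module = importlib.import_module("nltk.tag.stanford")
    except ImportError:
        return None
    # Try the longer name first, then the name that worked in this question.
    for name in ("StanfordNERTagger", "NERTagger"):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    return None


tagger_class = find_ner_tagger_class()
print("Tagger class found:", tagger_class)
```

If this prints `None`, the import problem lies with the NLTK installation itself rather than with the class name.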

But now, when I run an example call from the NLTK website, I get:

In [360]: st = NERTagger('english.all.3class.distsim.crf.ser.gz')
---------------------------------------------------------------------------
LookupError                               Traceback (most recent call last)
<ipython-input-360-0c0ab770b0ff> in <module>()
----> 1 st = NERTagger('english.all.3class.distsim.crf.ser.gz')

/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, *args, **kwargs)
    158 
    159     def __init__(self, *args, **kwargs):
--> 160         super(NERTagger, self).__init__(*args, **kwargs)
    161 
    162     @property

/Library/Python/2.7/site-packages/nltk/tag/stanford.pyc in __init__(self, path_to_model, path_to_jar, encoding, verbose, java_options)
     40                 self._JAR, path_to_jar,
     41                 searchpath=(), url=_stanford_url,
---> 42                 verbose=verbose)
     43 
     44         self._stanford_model = find_file(path_to_model,

/Library/Python/2.7/site-packages/nltk/__init__.pyc in find_jar(name, path_to_jar, env_vars, searchpath, url, verbose)
    595                     (name, url))
    596     div = '='*75
--> 597     raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
    598 
    599 ##########################################################################

LookupError: 

===========================================================================
  NLTK was unable to find stanford-ner.jar! Set the CLASSPATH
  environment variable.

  For more information, on stanford-ner.jar, see:
    <http://nlp.stanford.edu/software>
===========================================================================

So I must have set the environment variable incorrectly. Can anyone help me?
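
One way to see what such a variable actually resolves to is to split it the way a path list is normally split and check each entry for the file. The sketch below is only a diagnostic under that assumption and is not NLTK's own lookup logic. The helper name `locate_on_path_list` is hypothetical.

```python
import os


def locate_on_path_list(filename, env_var, environ=None):
    """Return every existing path to `filename` among the entries of `env_var`.

    The variable is treated as a list of paths separated by os.pathsep
    (a colon on macOS and Linux). An entry may be a directory or the file itself.
    """
    environ = os.environ if environ is None else environ
    hits = []
    for entry in environ.get(env_var, "").split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as those produced by "a::b"
        if os.path.basename(entry) == filename:
            candidate = entry
        else:
            candidate = os.path.join(entry, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


# Report where (if anywhere) the jar named in the error message is visible.
for var in ("CLASSPATH", "STANFORD_MODELS"):
    print(var, "->", locate_on_path_list("stanford-ner.jar", var))
```

An empty list for both variables would be consistent with the LookupError above.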

Sky*_*326 5

I worked it out.

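  1. Set STANFORD_MODELS as you did. # I learned that from you, thanks!
  2. import nltk.tag.stanford as st
  3. tagger = st.StanfordNERTagger(PATH_TO_GZ, PATH_TO_JAR) # here PATH_TO_GZ and PATH_TO_JAR are the full paths to where I store the file "all.3class.distsim.crf.ser.gz" and the file "stanford-ner.jar"
  4. Now the tagger is usable. # test: tagger.tag('Rami Eid is studying at Stony Brook University in NY'.split())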

It has nothing to do with CLASSPATH.
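
Put together, the steps could look roughly like the sketch below. It assumes the model file sits in the `classifiers` subfolder of the unpacked download and the jar at its top level; adjust the paths if your layout differs. The helpers `stanford_ner_paths` and `build_tagger` are hypothetical names, not part of NLTK, and the class may be called `NERTagger` rather than `StanfordNERTagger` in some installations. Creating the tagger also needs a working Java installation.

```python
import os


def stanford_ner_paths(ner_home, model_name="english.all.3class.distsim.crf.ser.gz"):
    """Build full paths to the model and the jar inside an unpacked Stanford NER folder.

    Assumption: the model is under `classifiers/` and the jar is at the top level.
    """
    model_path = os.path.join(ner_home, "classifiers", model_name)
    jar_path = os.path.join(ner_home, "stanford-ner.jar")
    return model_path, jar_path


def build_tagger(ner_home):
    """Check that both files exist, then pass their full paths to the tagger."""
    model_path, jar_path = stanford_ner_paths(ner_home)
    missing = [p for p in (model_path, jar_path) if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("Missing Stanford NER files: " + ", ".join(missing))
    # Imported here so the path check above works even without NLTK installed.
    import nltk.tag.stanford as st
    return st.StanfordNERTagger(model_path, jar_path)


# Usage (commented out because it needs the real files and Java):
# tagger = build_tagger("/path/to/stanford-ner-2015-04-20")  # hypothetical location
# print(tagger.tag("Rami Eid is studying at Stony Brook University in NY".split()))
```

Checking the paths first turns a missing file into an error that names the file, instead of the more general LookupError.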

Hope it helps!