Error when creating a StanfordCoreNLP object

Loh*_*que 5 java nlp jar stanford-nlp maven

I have downloaded and installed the required jar files from http://nlp.stanford.edu/software/corenlp.shtml#Download.

I have included five jar files:

stanford-postagger.jar

stanford-postagger-3.3.1.jar

stanford-postagger-3.3.1-javadoc.jar

stanford-postagger-3.3.1-src.jar

stanford-corenlp-3.3.1.jar

And the code is:

import java.util.Properties;

import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class lemmafirst {

    protected StanfordCoreNLP pipeline;

    public lemmafirst() {
        // Create StanfordCoreNLP object properties, with POS tagging
        // (required for lemmatization), and lemmatization
        Properties props;
        props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma");

        /*
         * This is a pipeline that takes in a string and returns various analyzed linguistic forms. 
         * The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator), 
         * and then other sequence model style annotation can be used to add things like lemmas, 
         * POS tags, and named entities. These are returned as a list of CoreLabels. 
         * Other analysis components build and store parse trees, dependency graphs, etc. 
         * 
         * This class is designed to apply multiple Annotators to an Annotation. 
         * The idea is that you first build up the pipeline by adding Annotators, 
         * and then you take the objects you wish to annotate and pass them in and 
         * get in return a fully annotated object.
         * 
         *  StanfordCoreNLP loads a lot of models, so you probably
         *  only want to do this once per execution
         */
        // The exception is thrown on the next line.
        this.pipeline = new StanfordCoreNLP(props);
    }
}

My problem is creating the pipeline.

The error I get is:

Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:563)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:262)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
    at lemmafirst.<init>(lemmafirst.java:39)
    at lemmafirst.main(lemmafirst.java:83)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:561)
    ... 6 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
    ... 11 more

Can anyone help me fix this error? Thanks.

Chr*_*der 14

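The exception is thrown because the POS tagger model is missing. This happens because the downloads come in versions with and without the model files.

You can add stanford-postagger-full-3.3.1.jar, which is available on the following page (stanford-postagger-full-2014-01-04.zip): http://nlp.stanford.edu/software/tagger.shtml.

Or you can do the same with the whole CoreNLP package (stanford-corenlp-full....jar): http://nlp.stanford.edu/software/corenlp.shtml (you can then also remove all the postagger dependencies, since they are included in CoreNLP).

If you only want to add the model files, look on Maven Central and download "stanford-corenlp-3.3.1-models.jar".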
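
To confirm that a models jar has actually made it onto the classpath before building the pipeline, a quick check along these lines may help. This is only a sketch: the class name `ModelCheck` and the helper `isOnClasspath` are made up for illustration, and the resource path is simply the one quoted in the IOException above. It uses nothing from CoreNLP itself, only the standard `ClassLoader.getResource` lookup.

```java
import java.net.URL;

public class ModelCheck {

    // Resource path copied from the IOException in the stack trace above.
    static final String TAGGER_MODEL =
            "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    /** Returns true if the named resource can be found on the classpath. */
    static boolean isOnClasspath(String resourcePath) {
        URL url = ModelCheck.class.getClassLoader().getResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        if (isOnClasspath(TAGGER_MODEL)) {
            System.out.println("Tagger model found on the classpath.");
        } else {
            System.out.println("Tagger model missing; add a models jar to the classpath.");
        }
    }
}
```

If this reports the model as missing, run it with the same classpath as the failing program; the jar that contains the model files is most likely not part of it.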

  • Short answer: download the full CoreNLP package from here, which includes the model files: http://nlp.stanford.edu/software/corenlp.shtml (3 upvotes)

Sru*_*tur 7

A simpler way to add these model files is to add the following dependencies to your pom.xml and let Maven manage them for you:

<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
  <classifier>models</classifier> <!--  will get the dependent model jars -->
</dependency>

  • Thanks @Sruthi Poddutur for your answer. It helped solve my problem. (2 upvotes)