Specifying the path to the models for StanfordCoreNLP

Ast*_*arp 3 .net ikvm stanford-nlp

I am using StanfordCoreNLP through IKVM.NET. Is there a way to specify the path to the parser models?

   var pipeLine = new StanfordCoreNLP(props);

It throws an exception:

java.lang.RuntimeException: java.io.IOException: Unable to resolve
"edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger"
as either class path, filename or URL

小智 7

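If you have not included models.jar in your classpath, this is the complete set of properties, with every model path spelled out explicitly.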

Properties props = new Properties();
String modPath = "<YOUR PATH TO MODELS>/models3.4/edu/stanford/nlp/models/";
props.put("pos.model", modPath + "pos-tagger/english-left3words/english-left3words-distsim.tagger");
props.put("ner.model", modPath + "ner/english.all.3class.distsim.crf.ser.gz");
props.put("parse.model", modPath + "lexparser/englishPCFG.ser.gz");
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
props.put("sutime.binders","0");
props.put("sutime.rules", modPath + "sutime/defs.sutime.txt, " + modPath + "sutime/english.sutime.txt");
props.put("dcoref.demonym", modPath + "dcoref/demonyms.txt");
props.put("dcoref.states", modPath + "dcoref/state-abbreviations.txt");
props.put("dcoref.animate", modPath + "dcoref/animate.unigrams.txt");
props.put("dcoref.inanimate", modPath + "dcoref/inanimate.unigrams.txt");
props.put("dcoref.big.gender.number", modPath + "dcoref/gender.data.gz");
props.put("dcoref.countries", modPath + "dcoref/countries");
props.put("dcoref.states.provinces", modPath + "dcoref/statesandprovinces");
props.put("dcoref.singleton.model", modPath + "dcoref/singleton.predictor.ser");
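Since a single wrong path in a list like this produces the same "Unable to resolve" error, it can help to check the files before building the pipeline. The following is a minimal sketch using only the JDK. The class name `ModelPathCheck`, the method `missingModels`, and the default directory in `main` are hypothetical and not part of CoreNLP. It assumes every model path is a plain file path that starts with `modPath`.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;

class ModelPathCheck {
    /**
     * Returns one "key -> path" entry for every configured path that starts
     * with modPath but does not exist on disk.
     */
    static List<String> missingModels(Properties props, String modPath) {
        List<String> missing = new ArrayList<>();
        // Sort the keys so the report order is stable.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            // Some values (e.g. sutime.rules) hold several comma-separated paths.
            for (String part : props.getProperty(key).split(",")) {
                String path = part.trim();
                if (path.startsWith(modPath) && !Files.exists(Paths.get(path))) {
                    missing.add(key + " -> " + path);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed location; pass your own models directory as the first argument.
        String modPath = args.length > 0 ? args[0] : "models/edu/stanford/nlp/models/";
        Properties props = new Properties();
        props.put("pos.model", modPath + "pos-tagger/english-left3words/english-left3words-distsim.tagger");
        props.put("parse.model", modPath + "lexparser/englishPCFG.ser.gz");
        props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
        for (String entry : missingModels(props, modPath)) {
            System.out.println("Missing: " + entry);
        }
    }
}
```

If the list comes back empty, the file paths at least exist, and any remaining error is more likely caused by the property names or the file contents.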


Jon*_*est 6

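It would help to see how you define the properties. If you are using the default properties, you are probably just missing models.jar (for example, the one for version 3.2) on your classpath. Download it and make sure it gets loaded.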

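If you configure the properties some other way, there may be a syntax error in the path string that causes the IOException. Here is what my custom properties look like for loading a different pos.model: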

Properties props = new Properties();
// using wsj-bidirectional model
props.put("pos.model", "edu/stanford/nlp/models/pos-tagger/wsj-bidirectional/wsj-0-18-bidirectional-distsim.tagger");
// using standard pipeline
props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
// create pipeline
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

It is important to note that the path has no leading slash (/).
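One way to see why the leading slash matters: `ClassLoader` lookups take resource names without a leading `/`, and a name that starts with one is generally not found. Whether CoreNLP resolves `pos.model` through exactly this call is an assumption here. The sketch uses `java/lang/Object.class` only as a stand-in for a model file, and `toClassLoaderName` is a hypothetical helper, not part of CoreNLP.

```java
import java.net.URL;

class ClasspathNameDemo {
    /** Hypothetical helper: strips leading slashes so the name suits ClassLoader lookups. */
    static String toClassLoaderName(String name) {
        int start = 0;
        while (start < name.length() && name.charAt(start) == '/') {
            start++;
        }
        return name.substring(start);
    }

    public static void main(String[] args) {
        // Stand-in resource; a real setup would use a path inside models.jar.
        String withSlash = "/java/lang/Object.class";
        URL a = ClassLoader.getSystemResource(withSlash);
        URL b = ClassLoader.getSystemResource(toClassLoaderName(withSlash));
        System.out.println("with leading slash:    " + a);
        System.out.println("without leading slash: " + b);
    }
}
```

Running the same comparison against your own model path is a quick way to tell a classpath problem from a typo in the name.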

If that does not help, have a look at Galal Aly's tutorial, in which the tagger is extracted from the models file and loaded separately.