Kas*_*mvd 10 python nlp nltk stanford-nlp python-textprocessing
I am trying to use the nltk.tag.stanford module to tag a sentence (starting with the example from the wiki), but I keep getting the following error:
Traceback (most recent call last):
  File "test.py", line 28, in <module>
    print st.tag(word_tokenize('What is the airspeed of an unladen swallow ?'))
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 59, in tag
    return self.tag_sents([tokens])[0]
  File "/usr/local/lib/python2.7/dist-packages/nltk/tag/stanford.py", line 81, in tag_sents
    stdout=PIPE, stderr=PIPE)
  File "/usr/local/lib/python2.7/dist-packages/nltk/internals.py", line 160, in java
    raise OSError('Java command failed!')
OSError: Java command failed!
Or the following LookupError:
LookupError:
===========================================================================
NLTK was unable to find the java file!
Use software specific configuration paramaters or set the JAVAHOME environment variable.
===========================================================================
Here is the example code:
>>> from nltk.tag.stanford import POSTagger
>>> st = POSTagger('/usr/share/stanford-postagger/models/english-bidirectional-distsim.tagger',
... '/usr/share/stanford-postagger/stanford-postagger.jar')
>>> st.tag('What is the airspeed of an unladen swallow ?'.split())
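I also tried word_tokenize instead of split, but it made no difference.
Java (the JDK) is installed, and none of my searching has turned up a fix, including things like nltk.internals.config_java().
Note: I am using Linux (Xubuntu).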
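If you read the embedded documentation in nltk/internals.py (lines 58-175), you should be able to find the answer fairly easily. NLTK needs the full path to the Java binary.
If no path is specified, NLTK searches the system for the Java binary; if it cannot find one, it raises a LookupError.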
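To make that lookup order concrete, here is a rough sketch of the kind of search described above. It is not NLTK's actual implementation: locate_java is a hypothetical helper, the JAVAHOME / JAVA_HOME variable names are taken from the error message and the profile snippet in this thread, and shutil.which assumes Python 3.3 or later.

```python
import os
import shutil


def locate_java(explicit_path=None, environ=None):
    """Return the path to a java executable, or raise LookupError.

    Hypothetical helper that mimics the search order described above;
    it is not the code NLTK itself runs.
    """
    environ = os.environ if environ is None else environ

    # 1. A path given explicitly by the caller wins.
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path

    # 2. Environment variables. The value may point at the binary itself
    #    or at a JDK/JRE directory that contains bin/java.
    for var in ("JAVAHOME", "JAVA_HOME"):
        home = environ.get(var)
        if not home:
            continue
        for candidate in (home,
                          os.path.join(home, "bin", "java"),
                          os.path.join(home, "java")):
            if os.path.isfile(candidate):
                return candidate

    # 3. Fall back to searching the directories listed in PATH.
    found = shutil.which("java", path=environ.get("PATH", os.defpath))
    if found:
        return found

    raise LookupError("could not find a java executable")
```

If this helper raises LookupError on your machine, the tagger will most likely fail for the same reason, which narrows the problem down to the environment rather than to the tagger call.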
Based on some research, I believe you have a few options:
1) Add the following code to your project (not a great solution)
import os
java_path = "path/to/java" # replace this
os.environ['JAVAHOME'] = java_path
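
A slightly more defensive variant of the same idea is sketched below: it checks that the path really points at an executable file before setting JAVAHOME, so a typo fails right away with a clear message instead of surfacing later as "Java command failed!". The set_javahome name is made up for this example, and the path in the usage comment is only a placeholder.

```python
import os


def set_javahome(java_path, environ=None):
    """Point JAVAHOME at java_path after checking that it is usable.

    Hypothetical helper; call it before creating the tagger.
    """
    environ = os.environ if environ is None else environ

    # Refuse paths that do not exist or are not executable.
    if not (os.path.isfile(java_path) and os.access(java_path, os.X_OK)):
        raise ValueError("not an executable file: %r" % (java_path,))

    environ["JAVAHOME"] = java_path
    return java_path


# Usage (placeholder path, replace with the real location of java):
# set_javahome("/path/to/java")
```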
2) Uninstall and reinstall NLTK, preferably in a virtualenv (better, but still not great)
pip uninstall nltk
sudo -E pip install nltk
3) Set the Java environment variables (the most practical solution, IMO)
Edit the system-wide profile file, /etc/profile:
sudo gedit /etc/profile
Add the following lines at the end:
JAVA_HOME=/usr/lib/jvm/jdk1.7.0
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export JRE_HOME
export PATH
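
Changes to /etc/profile only apply to new login shells, so it is worth checking from Python that the variables actually reached the process that runs NLTK. The sketch below is one way to do that; java_home_on_path is a made-up name, and it only checks that $JAVA_HOME/bin appears in PATH, not that the JDK itself works.

```python
import os


def java_home_on_path(environ=None):
    """Return True if $JAVA_HOME/bin is one of the PATH entries.

    Hypothetical sanity check for the profile snippet above.
    """
    environ = os.environ if environ is None else environ

    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return False

    expected = os.path.normpath(os.path.join(java_home, "bin"))
    entries = environ.get("PATH", "").split(os.pathsep)
    return any(os.path.normpath(entry) == expected
               for entry in entries if entry)


if __name__ == "__main__":
    print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
    print("JAVA_HOME/bin on PATH:", java_home_on_path())
```

If this prints False right after editing the file, log out and back in (or run `source /etc/profile` in the current shell) and try again.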