I am trying to run the following commands:
import nltk
nltk.download('all')
But I get this error:
Traceback (most recent call last):
  File "./update.py", line 3, in <module>
    nltk.download('all')
  File "/usr/lib/python3.6/site-packages/nltk/downloader.py", line 664, in download
    for msg in self.incr_download(info_or_id, download_dir, force):
  File "/usr/lib/python3.6/site-packages/nltk/downloader.py", line 534, in incr_download
    try: info = self._info_or_id(info_or_id)
  File "/usr/lib/python3.6/site-packages/nltk/downloader.py", line 508, in _info_or_id
    return self.info(info_or_id)
  File "/usr/lib/python3.6/site-packages/nltk/downloader.py", line 875, in info
    self._update_index()
  File "/usr/lib/python3.6/site-packages/nltk/downloader.py", line 825, in _update_index
    ElementTree.parse(compat.urlopen(self._url)).getroot())
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 1196, in parse
    tree.parse(source, parser)
  File "/usr/lib/python3.6/xml/etree/ElementTree.py", line 597, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 23, column 143
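I am new to Python, so I am not sure what to do. I looked at the source module reported in the traceback and found that it tries to download an XML file. So I ran the command below, and it did not give me any error.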
compat.urlopen('https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml')
So I think the problem is not in the download but in the parser. Can anyone suggest how I should proceed from here?
Answer (1 vote)
The problem is in the XML that NLTK retrieves, not in your code.
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 23, column 143
At line 23, column 143 we can see the problem: an "=" is missing between an attribute name and its value:
... unzip="1" unzipped_size"1917" url="https...
NLTK will surely fix this soon; until then, I am not sure what the best response is.
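If you want to confirm this kind of diagnosis yourself, here is a minimal sketch of how the offending spot can be located using only the standard library. The helper name `locate_parse_error` and the `sample` XML are made up for illustration (the sample is not the real NLTK index, it just reproduces the same missing "="). The only real API it relies on is `ParseError.position`, which holds the line and column from the error message.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(xml_text, context=30):
    """Return (line, column, snippet) for the first XML parse error, or None.

    Hypothetical helper for illustration; it is not part of NLTK.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # position is a (line, column) tuple; the line number is 1-based.
        line, column = err.position
        bad_line = xml_text.splitlines()[line - 1]
        start = max(column - context, 0)
        return line, column, bad_line[start:column + context]
    return None  # the document parsed without errors


# Made-up sample reproducing the same mistake: no "=" after unzipped_size.
sample = (
    '<packages>\n'
    '  <package id="demo" unzip="1" unzipped_size"1917" '
    'url="https://example.invalid/demo.zip"/>\n'
    '</packages>'
)

if __name__ == "__main__":
    print(locate_parse_error(sample))
```

To check the real index, you could pass in the text you get back from the URL in the question instead of `sample`. This only shows where the parser gives up; it does not repair the file.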