I am trying to use the pos_tag function in NLTK 3 (on Windows), but this error popped up:
>>> import nltk
>>> tokens = nltk.word_tokenize("This is a sentence!")
>>> tokens
['This', 'is', 'a', 'sentence', '!']
>>> tags = nltk.pos_tag(tokens)
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in <module>
    tags = nltk.pos_tag(tokens)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Users\Gebruiker\AppData\Local\Programs\Python\Python35-32\lib\site-packages\nltk\data.py", line 924, in …

I have a dataset in which one column is titled "What is your location and time zone?"
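This means we have entries like this
and even
Is there a way to extract the city, country, and time zone from these?
My idea is to build arrays (from an open-source dataset) of all country names (including short forms) as well as city names and time zones; then, if any word in an entry matches a city, country, time zone, or short form, fill it into a new column in the same dataset and keep a count.
Is this practical?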
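The matching idea described above could be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the lookup tables `COUNTRIES`, `CITIES`, and `TIMEZONES`, the helper names, and the sample rows are all made up for the example and are not taken from the actual dataset; a real run would load the tables from an open dataset.

```python
import re
from collections import Counter

# Hypothetical, tiny lookup tables for illustration only. Keys are lowercase
# names or short forms; values are the canonical name to store.
COUNTRIES = {
    "usa": "United States",
    "united states": "United States",
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "india": "India",
}
CITIES = {"new york": "New York", "london": "London", "mumbai": "Mumbai"}
TIMEZONES = {"est": "EST", "gmt": "GMT", "ist": "IST", "utc": "UTC"}


def candidate_phrases(text, max_words=2):
    """Yield lowercase word n-grams from free text, longest n-grams first."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    for n in range(max_words, 0, -1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def extract_location(text):
    """Return the first city, country, and time zone matched in text.

    A field stays None when nothing in the lookup tables matches.
    """
    found = {"city": None, "country": None, "timezone": None}
    tables = (("city", CITIES), ("country", COUNTRIES), ("timezone", TIMEZONES))
    for phrase in candidate_phrases(text):
        for field, table in tables:
            if found[field] is None and phrase in table:
                found[field] = table[phrase]
    return found


# Made-up sample rows standing in for the free-text column.
sample_rows = ["New York, USA (EST)", "London UK, GMT", "Mumbai / IST"]
extracted = [extract_location(row) for row in sample_rows]

# Count how often each country was recognised (None = no match).
country_counts = Counter(row["country"] for row in extracted)
```

Each dictionary in `extracted` can be written back as new columns next to the original answer. Exact matching like this is only a starting point: misspellings are missed, and short forms can be ambiguous (for example, "IST" is used for more than one time zone), so unmatched or doubtful rows would still need manual review.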
=========== REPLY based on the NLTK answer ============
Running the same code as Alecxe gives:
Traceback (most recent call last):
  File "E:\SBTF\ntlk_test.py", line 19, in <module>
    tagged_sentences = [nltk.pos_tag(sentence) for sentence in tokenized_sentences]
  File "C:\Python27\ArcGIS10.4\lib\site-packages\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python27\ArcGIS10.4\lib\site-packages\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Python27\ArcGIS10.4\lib\site-packages\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Python27\ArcGIS10.4\lib\site-packages\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python27\ArcGIS10.4\lib\site-packages\nltk\data.py", line 924, in …