Why does importing word_tokenize from NLTK work in the interpreter but not in my script?

Question


Gad*_*t.s 2 python import nlp namespaces nltk

I am trying to tokenize a sentence using nltk. When I do it in the Python shell, I get the correct result.

>>> import nltk
>>> sentence = "Mohanlal made his acting debut in Thiranottam (1978), but the film got released only after 25 years due to censorship issues."
>>> tokens = nltk.word_tokenize(sentence)
>>> tokens
['Mohanlal', 'made', 'his', 'acting', 'debut', 'in', 'Thiranottam', '(', '1978', ')', ',', 'but', 'the', 'film', 'got', 'released', 'only', 'after', '25', 'years', 'due', 'to', 'censorship', 'issues', '.']


But when I put the same code in a file and try to run it, I get the following error.

Traceback (most recent call last):
  File "tokenize.py", line 1, in <module>
    import nltk
  File "/usr/local/lib/python2.7/dist-packages/nltk/__init__.py", line 114, in <module>
    from nltk.collocations import *
  File "/usr/local/lib/python2.7/dist-packages/nltk/collocations.py", line 38, in <module>
    from nltk.util import ngrams
  File "/usr/local/lib/python2.7/dist-packages/nltk/util.py", line 13, in <module>
    import pydoc
  File "/usr/lib/python2.7/pydoc.py", line 55, in <module>
    import sys, imp, os, re, types, inspect, __builtin__, pkgutil, warnings
  File "/usr/lib/python2.7/inspect.py", line 39, in <module>
    import tokenize
  File "/home/gadheyan/Project/Codes/tokenize.py", line 2, in <module>
    from nltk import word_tokenize
ImportError: cannot import name word_tokenize


Here is the code I ran.

import nltk
from nltk import word_tokenize

sentence = "Mohanlal made his acting debut in Thiranottam (1978), but the film got released only after 25 years due to censorship issues."
tokens = nltk.word_tokenize(sentence)
print tokens


Answer 1

alv*_*vas 5

TL;DR

This is a naming problem; see "Python failed to `import nltk` in my script but works in the interpreter".

Rename your file to my_tokenize.py instead of tokenize.py, i.e.

$ mv /home/gadheyan/Project/Codes/tokenize.py /home/gadheyan/Project/Codes/my_tokenize.py
$ python my_tokenize.py


In long:

From your traceback, you can see:

  File "/usr/lib/python2.7/inspect.py", line 39, in <module>
    import tokenize
  File "/home/gadheyan/Project/Codes/tokenize.py", line 2, in <module>
    from nltk import word_tokenize


In NLTK, there is a package called nltk.tokenize, which is where nltk.word_tokenize lives: http://www.nltk.org/_modules/nltk/tokenize.html

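So when your script is named tokenize.py, importing nltk ends up importing your script instead of the module it actually needs. The traceback shows the chain: nltk imports pydoc, pydoc imports inspect, and inspect.py runs import tokenize, expecting the standard library's tokenize module. Because the directory containing the script comes first on the import path, that import picks up your script (/home/gadheyan/Project/Codes/tokenize.py) instead. The script is then executed a second time while nltk is still only partly initialized, so from nltk import word_tokenize fails with the ImportError. The interpreter session was presumably started somewhere without a local tokenize.py, which is why the same code works there.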
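The shadowing can be reproduced without breaking a real session. The sketch below is written for Python 3 and asks `importlib.machinery.PathFinder` which file the name `tokenize` resolves to, first with the normal search path and then with a temporary directory placed in front of it, which is roughly what happens when a script is run from its own directory. The `resolve` helper and the temporary directory are illustrative assumptions, not part of NLTK or of the original answer.

```python
import os
import sys
import tempfile
from importlib.machinery import PathFinder


def resolve(name, search_path):
    """Return the file a path-based import of `name` would load for this search path."""
    spec = PathFinder.find_spec(name, search_path)
    return spec.origin if spec else None


# Normal case: only the regular import path is searched.
stdlib_origin = resolve("tokenize", sys.path)

# Simulated script directory that contains its own tokenize.py.
with tempfile.TemporaryDirectory() as script_dir:
    local_file = os.path.join(script_dir, "tokenize.py")
    with open(local_file, "w") as fh:
        fh.write("# stand-in for the user's script\n")
    # The script's directory is searched first, so the local file wins.
    shadowed_origin = resolve("tokenize", [script_dir] + sys.path)
    is_shadowed = os.path.realpath(shadowed_origin) == os.path.realpath(local_file)

print("without local file:", stdlib_origin)
print("with local file:   ", shadowed_origin)
print("local file shadows the standard library:", is_shadowed)
```

In a real script, printing `tokenize.__file__` right after `import tokenize` gives the same information for the module that was actually loaded.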

By the way

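Redundant imports of the same name still work in Python, but it's better to keep the namespace and global variables clean, i.e. use either this: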

alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk import word_tokenize
>>> sent = 'this is a foo bar sentence'
>>> word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()


or this:

alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> sent = 'this is a foo bar sentence'
>>> nltk.word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()


But try to avoid this (although it still works anyway):

alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk import word_tokenize
>>> sent = 'this is a foo bar sentence'
>>> word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> nltk.word_tokenize(sent)
['this', 'is', 'a', 'foo', 'bar', 'sentence']
>>> exit()


Archived:	10 years, 3 months ago
Views:	15631
Last activity:	10 years, 3 months ago