Cannot import Tokenizer from Keras

rma*_*esh 3 python machine-learning deep-learning keras

I'm currently working through a deep learning example that uses the Tokenizer class, and I'm getting the following error:

AttributeError: 'Tokenizer' object has no attribute 'word_index'

Here is my code:

from keras.preprocessing.text import Tokenizer

samples = ['The cat sat on the mat.', 'The dog ate my homework.']

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_sequences(samples)

sequences = tokenizer.texts_to_sequences(samples)

one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

Can anyone help me spot my mistake?

Edw*_*ins 5

The import is fine, but the Tokenizer object has no word_index attribute yet.

According to the documentation, that attribute is only set once you call the fit_on_texts method on the Tokenizer object. Your code calls fit_on_sequences instead, which takes lists of integer indices rather than raw strings and never builds a vocabulary.

The following code runs successfully:

from keras.preprocessing.text import Tokenizer

samples = ['The cat sat on the mat.', 'The dog ate my homework.']

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(samples)  # builds the vocabulary; this is what sets word_index

one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
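
With the two sample sentences above, the final line prints: Found 9 unique tokens.

As a side note, fit_on_sequences is not a misspelling of fit_on_texts but a separate method: it consumes lists of integer word indices (pre-tokenized data) and only prepares the tokenizer for sequences_to_matrix, so it never populates word_index. Here is a minimal sketch of that intended usage, with made-up index data for illustration:

from keras.preprocessing.text import Tokenizer

# fit_on_sequences expects pre-encoded integer sequences, not raw text,
# and it does not populate word_index (hence the AttributeError above).
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_sequences([[1, 2, 3, 4, 5], [1, 6, 7, 8]])  # hypothetical index data

# Its purpose is building matrices from already-encoded sequences:
matrix = tokenizer.sequences_to_matrix([[1, 2, 3], [1, 6]], mode='binary')
print(matrix.shape)  # (2, 1000)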