Cannot import Tokenizer from Keras

rma*_*esh 3 python machine-learning deep-learning keras

I'm currently working through a deep learning example that uses the Tokenizer class, and I'm getting the following error:

AttributeError: 'Tokenizer' object has no attribute 'word_index'

Here is my code:

from keras.preprocessing.text import Tokenizer

samples = ['The cat sat on the mat.', 'The dog ate my homework.']

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_sequences(samples)

sequences = tokenizer.texts_to_sequences(samples)

one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

Can anyone help me spot my mistake?

Edw*_*ins 5

The import is fine, but the Tokenizer object has no word_index attribute yet.

According to the documentation, that attribute is only set once you call the fit_on_texts method on the Tokenizer object. Your code calls fit_on_sequences instead, which takes lists of integer indices rather than raw strings and never builds a vocabulary.

The following code runs successfully:

from keras.preprocessing.text import Tokenizer

samples = ['The cat sat on the mat.', 'The dog ate my homework.']

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(samples)  # builds the vocabulary; this is what sets word_index

one_hot_results = tokenizer.texts_to_matrix(samples, mode='binary')

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))
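
With the two sample sentences above, the final line prints: Found 9 unique tokens.

As a side note, fit_on_sequences is not a misspelling of fit_on_texts but a separate method: it consumes lists of integer word indices (pre-tokenized data) and only prepares the tokenizer for sequences_to_matrix, so it never populates word_index. Here is a minimal sketch of that intended usage, with made-up index data for illustration:

from keras.preprocessing.text import Tokenizer

# fit_on_sequences expects pre-encoded integer sequences, not raw text,
# and it does not populate word_index (hence the AttributeError above).
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_sequences([[1, 2, 3, 4, 5], [1, 6, 7, 8]])  # hypothetical index data

# Its purpose is building matrices from already-encoded sequences:
matrix = tokenizer.sequences_to_matrix([[1, 2, 3], [1, 6]], mode='binary')
print(matrix.shape)  # (2, 1000)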