Fri*_*ten 2 python nlp text-mining
I have the following code:
train_set = ("The sky is blue.", "The sun is bright.")
test_set = ("The sun in the sky is bright.",
"We can see the shining sun, the bright sun.")
Now I am trying to compute the term frequencies like this:
from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer()
Next I want to print the vocabulary, so I do this:
vectorizer.fit_transform(train_set)
print vectorizer.vocabulary
The output I get is None, although I was expecting something like this:
{'blue': 0, 'sun': 1, 'bright': 2, 'sky': 3}
Any ideas what is going wrong here?
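One likely explanation, offered as a sketch rather than a definitive diagnosis: in scikit-learn, `vectorizer.vocabulary` is the constructor parameter (which stays `None` unless a vocabulary is passed in), while the mapping learned during fitting is stored in the trailing-underscore attribute `vocabulary_`. The snippet below assumes a reasonably recent scikit-learn and adds `stop_words="english"` (not in the original code) so that "the" and "is" are dropped, as in the expected output. The variable names `constructor_vocabulary` and `learned_vocabulary` are only illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer

train_set = ("The sky is blue.", "The sun is bright.")

# Assumption: stop_words="english" is added here to filter out "the" and "is";
# the original code used the default CountVectorizer(), which keeps them.
vectorizer = CountVectorizer(stop_words="english")
vectorizer.fit_transform(train_set)

# The constructor parameter: None, because no vocabulary was passed in.
constructor_vocabulary = vectorizer.vocabulary

# The fitted attribute: a dict mapping each term to its column index.
learned_vocabulary = vectorizer.vocabulary_

print(constructor_vocabulary)
print(sorted(learned_vocabulary))
```

In the versions I am familiar with, the column indices are assigned in alphabetical order of the terms, so the exact numbers may differ from the expected output above even though the set of words is the same.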