How does the vectorizer's fit_transform work in sklearn?

Question

Leo*_*eo 11 python machine-learning scikit-learn

I am trying to understand the following code:

from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
corpus = [
    'This is the first document.',
    'This is the second second document.',
    'And the third one.',
    'Is this the first document?',
]
X = vectorizer.fit_transform(corpus)

When I print X to see what is returned, I get the following result:

(0, 1)  1
(0, 2)  1
(0, 6)  1
(0, 3)  1
(0, 8)  1
(1, 5)  2
(1, 1)  1
(1, 6)  1
(1, 3)  1
(1, 8)  1
(2, 4)  1
(2, 7)  1
(2, 0)  1
(2, 6)  1
(3, 1)  1
(3, 2)  1
(3, 6)  1
(3, 3)  1
(3, 8)  1

But I don't understand what this result means.

Answer 1

小智 -2

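It converts text into numbers. Combined with other functions, this lets you count how many times each word occurs in a given dataset. I am new to programming, so there may well be other areas where it can be used.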
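
To tie this to the printed output in the question, here is a minimal pure-Python sketch of what `fit_transform` does conceptually. It is not scikit-learn's actual implementation. The helper names (`tokenize`, `fit`, `transform`, `TOKEN_PATTERN`) are hypothetical, and the sketch assumes `CountVectorizer`'s default settings: text is lowercased, tokens are runs of two or more word characters, and the vocabulary is indexed in alphabetical order.

```python
import re
from collections import Counter

# Assumption: this mirrors CountVectorizer's default token pattern,
# which keeps only tokens made of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def fit(corpus):
    """'fit' step: learn the vocabulary, one column index per distinct word."""
    words = sorted({tok for doc in corpus for tok in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def transform(corpus, vocabulary):
    """'transform' step: count words per document.

    Returns {(row, column): count}, storing only non-zero counts,
    which is the same information a sparse matrix prints.
    """
    entries = {}
    for row, doc in enumerate(corpus):
        for word, count in Counter(tokenize(doc)).items():
            if word in vocabulary:  # words unseen during fit are ignored
                entries[(row, vocabulary[word])] = count
    return entries


corpus = [
    'This is the first document.',
    'This is the second second document.',
    'And the third one.',
    'Is this the first document?',
]

vocabulary = fit(corpus)                  # fit ...
entries = transform(corpus, vocabulary)   # ... then transform

print(vocabulary)
for (row, col), count in sorted(entries.items()):
    print((row, col), count)
```

Each printed line `(row, column)  count` therefore reads as "document `row` contains the word with index `column` exactly `count` times". For example, `(1, 5)  2` means the second document contains the word at index 5 (`second`) twice. With the real vectorizer, the learned mapping is available as `vectorizer.vocabulary_`, and `X.toarray()` shows the same counts as a dense matrix with zeros filled in.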

Archived:	8 years, 2 months ago
Views:	22236
Last recorded:	7 years ago