Posts by Sau*_*abh

Converting a scipy.sparse.csr.csr_matrix to a list of lists

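I am learning multi-label classification and trying to implement the tf-idf tutorial from scikit-learn. I am working with a text corpus and want to compute its tf-idf scores, using the module sklearn.feature_extraction.text for this. With CountVectorizer and TfidfTransformer I now have the corpus vectorised and a tf-idf score for each vocabulary term. The problem is that the result is a sparse matrix, for example: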

(0, 47)    0.104275891915
(0, 383)    0.084129133023
.
.
.
.
(4, 308)    0.0285015996586
(4, 199)    0.0285015996586
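
For context, a matrix of this kind could be produced roughly as follows. This is only a minimal sketch: the corpus is a made-up stand-in (the real documents are not shown in the question), and both classes are used with their default settings, which is an assumption.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

# Hypothetical toy corpus; the actual documents are not part of the question.
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

# Step 1: term counts per document (rows = documents, columns = vocabulary ids).
counts = CountVectorizer().fit_transform(corpus)

# Step 2: reweight the counts into tf-idf scores; the result is a sparse matrix.
tfidf = TfidfTransformer().fit_transform(counts)

# Printing a sparse matrix lists its stored entries as (row, column) value.
print(type(tfidf))
print(tfidf)
```

In the printed entries, the first index is the document id and the second is the vocabulary id.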

I want to convert this sparse.csr.csr_matrix into a list of lists, so that I can drop the document id from the csr_matrix above and get vocabularyId:tfidf pairs, for example

47:0.104275891915 383:0.084129133023
.
.
.
.
308:0.0285015996586 
199:0.0285015996586

Is there any way to convert it to a list of lists, or some other way to change the format so that I get vocabularyId:tfidf pairs?
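
One possible approach, sketched here under assumptions rather than taken from an accepted answer: a CSR matrix keeps its stored entries in three arrays (`data`, `indices`, `indptr`), and slicing them row by row gives the column id and value pairs without the document id. The helper name `csr_to_pairs` and the toy matrix are hypothetical and stand in for the real tf-idf matrix.

```python
import numpy as np
from scipy.sparse import csr_matrix


def csr_to_pairs(matrix):
    """Return one list per row holding (column_index, value) for each stored entry."""
    matrix = csr_matrix(matrix)  # make sure we are working with the CSR layout
    rows = []
    for i in range(matrix.shape[0]):
        # indptr[i]:indptr[i + 1] is the slice of indices/data belonging to row i.
        start, end = matrix.indptr[i], matrix.indptr[i + 1]
        cols = matrix.indices[start:end]
        vals = matrix.data[start:end]
        rows.append([(int(c), float(v)) for c, v in zip(cols, vals)])
    return rows


# Toy stand-in for the tf-idf matrix (3 documents, 4 vocabulary ids).
tfidf = csr_matrix(np.array([
    [0.0, 0.5, 0.0, 0.25],
    [0.0, 0.0, 0.0, 0.0],
    [0.75, 0.0, 0.125, 0.0],
]))

pairs = csr_to_pairs(tfidf)

# Format each document as "vocabularyId:tfidf" pairs on one line.
for row in pairs:
    print(" ".join(f"{col}:{val}" for col, val in row))
```

A document with no stored entries simply yields an empty list. Converting with `tolil()` and reading the `rows` and `data` attributes of the resulting LIL matrix should give the same information, at the cost of a format conversion.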

python machine-learning scipy tf-idf scikit-learn

Sau*_*abh

2016-11-20

4
votes

1
answer

6251
views
