Posted by kha*_*san

Performing PCA on a large sparse matrix using sklearn

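I am trying to apply PCA to a huge sparse matrix. The linked question below says that sklearn's RandomizedPCA can handle sparse matrices in scipy sparse format: "Apply PCA on very large sparse matrix".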

However, I always get an error. Can someone point out what I am doing wrong?

The input matrix 'X_train' contains float64 values:

>>>type(X_train)
<class 'scipy.sparse.csr.csr_matrix'>
>>>X_train.shape
(2365436, 1617899)
>>>X_train.ndim 
2
>>>X_train[0]     
<1x1617899 sparse matrix of type '<type 'numpy.float64'>'
    with 81 stored elements in Compressed Sparse Row format>
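For context, a rough back-of-the-envelope check of what a dense copy of a matrix with this shape would need in memory. This is only an arithmetic sketch based on the shape and dtype shown above; the variable names are illustrative and not part of the original post.

```python
# Shape reported by X_train.shape in the session above.
n_rows, n_cols = 2365436, 1617899

# float64 values occupy 8 bytes each.
bytes_per_value = 8

# Memory a fully dense array of this shape would require.
dense_bytes = n_rows * n_cols * bytes_per_value
dense_terabytes = dense_bytes / 1e12

print("dense size: about %.1f TB" % dense_terabytes)
```

At roughly 30 TB, converting the matrix to a dense array is not realistic on ordinary hardware, so any estimator that requires dense input is effectively ruled out for data of this size.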

This is what I am trying to do:

>>>from sklearn.decomposition import RandomizedPCA
>>>pca = RandomizedPCA()
>>>pca.fit(X_train)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/RT11/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sklearn/decomposition/pca.py", line 567, in fit
    self._fit(check_array(X))
  File "/home/RT11/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sklearn/utils/validation.py", line 334, in check_array
    copy, force_all_finite)
  File "/home/RT11/.pyenv/versions/2.7.9/lib/python2.7/site-packages/sklearn/utils/validation.py", line 239, in _ensure_sparse_format
    raise TypeError('A sparse matrix was passed, but dense '
TypeError: …
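One possible direction, given that the traceback shows the sparse input being rejected by `check_array`, is scikit-learn's `TruncatedSVD`, which accepts scipy sparse matrices directly. The sketch below is not taken from the original post: it uses a small random matrix (`X_demo`, a hypothetical stand-in for `X_train`) and an arbitrary `n_components=10`. Note that `TruncatedSVD` does not center the data, so its output is not identical to PCA.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.decomposition import TruncatedSVD

# Hypothetical small stand-in for X_train: a random CSR matrix of float64 values.
X_demo = sp.random(1000, 500, density=0.01, format="csr",
                   dtype=np.float64, random_state=0)

# TruncatedSVD works on sparse input without densifying it.
# n_components=10 is an arbitrary choice for this illustration.
svd = TruncatedSVD(n_components=10, algorithm="randomized", random_state=0)
X_reduced = svd.fit_transform(X_demo)

print(X_reduced.shape)
```

Whether this scales to a matrix of the size in the question depends on the number of components requested and the available memory, so it is worth trying on a subset of rows first.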

python sparse-matrix svd pca scikit-learn

