I am doing a sparse-matrix SVD on some large data with scipy. The matrix is about 200,000 × 8,000,000 with 1.19% non-zero entries. The machine I am using has 160 GB of memory, so I think memory should not be a problem.
Here is some of the code I am using:
from scipy import *
from scipy.sparse import *
import scipy.sparse.linalg as slin
from numpy import *

K = 1500
# value, row, col and the shape (M, N) are already loaded from my data at this point
coom = coo_matrix((value, (row, col)), shape=(M, N))
coom = coom.astype('float32')
u, s, v = slin.svds(coom, K, ncv=8*K)
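For reference, here is a minimal self-contained sketch of the same call on tiny synthetic data (the sizes, density, K and the use of scipy.sparse.rand are placeholders I picked just so the snippet runs on its own; in the real run value, row, col, M and N come from my data):

import scipy.sparse as sp
import scipy.sparse.linalg as slin

# placeholder sizes and density, roughly mimicking the real setup but tiny
M, N, density = 1000, 2000, 0.0119
K = 50

# random sparse matrix in COO format, cast to float32 like the real one
coom = sp.rand(M, N, density=density, format='coo', random_state=0).astype('float32')

# same call pattern as above; ncv is left at its default here
u, s, v = slin.svds(coom, K)
print(u.shape, s.shape, v.shape)   # (1000, 50) (50,) (50, 2000)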
The error message is as follows:
Traceback (most recent call last):
File "sparse_svd.py", line 35, in <module>
u,s,v=slin.svds(coom,K,ncv=2*K+1)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 731, in svds
eigvals, eigvec = eigensolver(XH_X, k=k, tol=tol**2)
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 680, in eigsh
params.iterate()
File "/usr/lib/python2.7/dist-packages/scipy/sparse/linalg/eigen/arpack/arpack.py", line 278, in iterate
raise ArpackError(self.info)
scipy.sparse.linalg.eigen.arpack.arpack.ArpackError: ARPACK error 3: No shifts could be applied during a cycle of the Implicitly restarted Arnoldi iteration. …