use*_*704 20 python matplotlib pca
Code:
import numpy
from matplotlib.mlab import PCA
file_name = "store1_pca_matrix.txt"
ori_data = numpy.loadtxt(file_name,dtype='float', comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0)
result = PCA(ori_data)
This is my code. Although my input matrix contains no nan or inf values, I still get the error shown below.
raise LinAlgError("SVD did not converge")
LinAlgError: SVD did not converge
What is the problem?
jse*_*old 30
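This can happen when the data contain inf or nan values.
If the data are held in a pandas DataFrame (dropna is a pandas method, not a NumPy one), use this to remove rows with nan values: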
ori_data.dropna(inplace=True)
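Since the question loads its matrix with numpy.loadtxt, the data there are a plain NumPy array rather than a DataFrame. A minimal sketch of the same clean-up for an array follows; the helper name drop_nonfinite_rows and the sample values are made up for illustration and do not come from the original answer.

```python
import numpy as np

def drop_nonfinite_rows(data):
    """Return only the rows of a 2-D array whose entries are all finite."""
    data = np.asarray(data, dtype=float)
    # True for rows that contain neither nan nor +/-inf
    keep = np.isfinite(data).all(axis=1)
    return data[keep]

# Illustrative data: rows 1 and 2 contain a nan and an inf respectively
sample = np.array([[1.0, 2.0],
                   [np.nan, 3.0],
                   [4.0, np.inf],
                   [5.0, 6.0]])
cleaned = drop_nonfinite_rows(sample)
```

Unlike dropna, this also discards rows containing inf, which is the other value the answer names as a possible cause.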
小智 15
I know this post is old, but in case someone else runs into the same problem: @jseabold was right that the cause is nan or inf values, and the OP was probably right that the input data contained none. However, if one of the columns in ori_data always has the same value, NaNs appear inside PCA itself, because the mlab implementation standardizes the input data by computing
ori_data = (ori_data - mean(ori_data)) / std(ori_data).
The solution is to do:
result = PCA(ori_data, standardize=False)
This way, only the mean is subtracted and the data are not divided by the standard deviation.
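To see why a constant column is a problem, here is a small sketch in plain NumPy that reproduces the standardization step described above on made-up data. It is not the mlab code itself; the helper constant_columns and the sample array are assumptions for illustration.

```python
import numpy as np

def constant_columns(data):
    """Return a boolean mask marking columns where every entry is identical."""
    data = np.asarray(data, dtype=float)
    return data.max(axis=0) == data.min(axis=0)

# Illustrative data: the second column never changes
sample = np.array([[1.0, 7.0],
                   [2.0, 7.0],
                   [3.0, 7.0]])

# Standardizing divides by a zero standard deviation in the constant
# column, so 0/0 produces nan there (warnings silenced for the demo).
with np.errstate(invalid="ignore", divide="ignore"):
    standardized = (sample - sample.mean(axis=0)) / sample.std(axis=0)

has_nan = bool(np.isnan(standardized).any())

# Alternative to skipping standardization: drop the constant columns first
mask = constant_columns(sample)
reduced = sample[:, ~mask]
```

Dropping constant columns before PCA loses no information for the decomposition, since a column with zero variance contributes nothing to any principal component.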
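I don't have an answer to this question, but I do have a reproduction scenario that contains no nans or infs. Unfortunately, the dataset is rather large (96 MB gzipped).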
import numpy as np
from StringIO import StringIO
from scipy import linalg
import urllib2
import gzip
url = 'http://physics.muni.cz/~vazny/gauss/X.gz'
X = np.loadtxt(gzip.GzipFile(fileobj=StringIO(urllib2.urlopen(url).read())), delimiter=',')
linalg.svd(X, full_matrices=False)
which raises:
LinAlgError: SVD did not converge
on:
>>> np.__version__
'1.8.1'
>>> import scipy
>>> scipy.__version__
'0.10.1'
but raises no exception on:
>>> np.__version__
'1.8.2'
>>> import scipy
>>> scipy.__version__
'0.14.0'
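For data that really are finite but still fail to converge, one commonly suggested workaround is to retry with a different LAPACK routine. The sketch below is written for Python 3 and assumes a SciPy release recent enough to accept the lapack_driver argument of scipy.linalg.svd (it is not available in the 0.10.1 and 0.14.0 versions quoted above). The function name robust_svd is hypothetical, and whether the fallback helps for the dataset in this answer has not been checked.

```python
import numpy as np
from scipy import linalg

def robust_svd(matrix):
    """Thin SVD that first checks for nan/inf, then falls back to 'gesvd'."""
    matrix = np.asarray(matrix, dtype=float)
    if not np.isfinite(matrix).all():
        raise ValueError("matrix contains nan or inf")
    try:
        # Default driver is the divide-and-conquer routine 'gesdd'
        return linalg.svd(matrix, full_matrices=False)
    except np.linalg.LinAlgError:
        # Assumption: the slower 'gesvd' driver may converge where 'gesdd' did not
        return linalg.svd(matrix, full_matrices=False, lapack_driver="gesvd")

# Illustrative usage on a small random matrix (not the dataset from the post)
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))
U, s, Vt = robust_svd(X)
```

The explicit finiteness check separates the two failure modes discussed in this thread: bad values in the input versus a convergence failure on valid input.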