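I am trying to follow Abdi & Williams, Principal Component Analysis (2010), and build the principal components through SVD, using numpy.linalg.svd.
When I display the components_ attribute of a PCA fitted with sklearn, the values have exactly the same magnitude as the ones I computed manually, but some (not all) have the opposite sign. What causes this?
Update: my (partial) answer contains some additional information.
Take the following sample data as an example: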
from pandas_datareader.data import DataReader as dr
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale
# sample data - shape (20, 3), each column standardized to N~(0,1)
rates = scale(dr(['DGS5', 'DGS10', 'DGS30'], 'fred',
                 start='2017-01-01', end='2017-02-01').pct_change().dropna())
# with sklearn PCA:
pca = PCA().fit(rates)
print(pca.components_)
[[-0.58365629 -0.58614003 -0.56194768]
[-0.43328092 -0.36048659 0.82602486]
[-0.68674084 0.72559581 -0.04356302]]
# compare to the manual method via SVD: …
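For context on the sign question: the SVD is only unique up to the sign of each singular-vector pair, so two correct routines can return components that match in magnitude but differ in sign. The sketch below illustrates this with NumPy alone. It uses a synthetic random matrix as a stand-in for the standardized rates data (an assumption, since the FRED data is not reproduced here), and align_signs is a hypothetical helper written for this illustration, not part of NumPy or sklearn.

```python
import numpy as np

# Synthetic stand-in for the standardized (20, 3) rates matrix.
# Assumption: any column-standardized matrix shows the same effect.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Thin SVD: X = U @ diag(s) @ Vt; the rows of Vt are the principal axes.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Flip the sign of the second component in both U and Vt.
signs = np.array([1.0, -1.0, 1.0])
U_flipped = U * signs              # scales the columns of U
Vt_flipped = Vt * signs[:, None]   # scales the rows of Vt

# Both factorizations reconstruct X, so neither sign choice is "wrong".
same_reconstruction = np.allclose(
    U @ np.diag(s) @ Vt,
    U_flipped @ np.diag(s) @ Vt_flipped,
)


def align_signs(components, reference):
    """Hypothetical helper: flip rows of `components` so each row has a
    positive dot product with the matching row of `reference`."""
    dots = np.sum(components * reference, axis=1)
    return components * np.sign(dots)[:, None]


# After alignment the two sets of components agree exactly.
aligned = align_signs(Vt_flipped, Vt)
print(same_reconstruction, np.allclose(aligned, Vt))
```

If two sets of components agree after this kind of alignment, the difference is most likely just a sign convention. sklearn appears to apply its own deterministic sign normalization after the SVD (see sklearn.utils.extmath.svd_flip), which may differ from the raw output of numpy.linalg.svd.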