Using the prcomp() function, I estimated the percentage of variance explained:
prcomp(env, scale=TRUE)
The second row of summary(pca) ("Proportion of Variance") shows these values for all PCs:
                          PC1    PC2     PC3     PC4     PC5     PC6     PC7
Standard deviation     7.3712 5.8731 2.04668 1.42385 1.13276 0.79209 0.74043
Proportion of Variance 0.5488 0.3484 0.04231 0.02048 0.01296 0.00634 0.00554
Cumulative Proportion  0.5488 0.8972 0.93956 0.96004 0.97300 0.97933 0.98487
Now I want to find the eigenvalue of each PC:
pca$sdev^2
[1] 5.433409e+01 3.449329e+01 4.188887e+00 2.027337e+00 1.283144e+00
[6] 6.274083e-01 5.482343e-01
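But these values seem to be just an alternative representation of the PVE itself. So what am I doing wrong?
I'm not sure whether this is what is confusing you.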
pca$sdev^2                    # eigenvalues, i.e. the variance along each PC direction
pca$sdev^2 / sum(pca$sdev^2)  # vector of proportions of variance explained
So the two are related: the proportion of variance is each eigenvalue divided by the sum of all eigenvalues.
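That would also explain why the eigenvalues look like a rescaled copy of the PVE. In the question's output, 54.33 / 0.5488 is about 99, which would be the total variance (and, if scale=TRUE was used, the number of variables). Here is a minimal sketch of that relationship. It uses the built-in USArrests data purely as a stand-in for `env`, and the names `pca_demo`, `eig`, `pve` and `eig_cor` are mine, for illustration only:

```r
# Illustration only: USArrests stands in for the asker's `env` data
pca_demo <- prcomp(USArrests, scale. = TRUE)

eig <- pca_demo$sdev^2   # eigenvalues (variance along each PC)
pve <- eig / sum(eig)    # proportion of variance explained

# With scaling, PCA is equivalent to an eigendecomposition of the
# correlation matrix, so the eigenvalues should match eigen(cor(x))
eig_cor <- eigen(cor(USArrests))$values

print(eig)
print(pve)
print(sum(eig))                 # total variance = number of scaled columns
print(all.equal(eig, eig_cor))  # agreement up to numerical tolerance
```

Since the total variance is a constant, each eigenvalue is just the PVE multiplied by that constant, so nothing is going wrong in the question's code.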
Edit: just an example (to illustrate this relationship), in case it helps.
set.seed(45) # for reproducibility
# build a matrix with each column sampled from a normal distribution
# with the same mean (2) and the same standard deviation (10)
m <- matrix(c(rnorm(200,2, 10), rnorm(200,2,10),
rnorm(200,2,10), rnorm(200,2,10)), ncol=4)
pca <- prcomp(m)
> summary(pca) # note that the standard deviations here are close to the input sd of 10
# all columns are independent of each other, so each should explain
# a roughly equal share of the variance (which is the case here): all are ~25%
                           PC1     PC2     PC3    PC4
Standard deviation     10.9431 10.6003 10.1622 9.3200
Proportion of Variance  0.2836  0.2661  0.2446 0.2057
Cumulative Proportion   0.2836  0.5497  0.7943 1.0000
> pca$sdev^2
# [1] 119.75228 112.36574 103.27063 86.86322
> pca$sdev^2/sum(pca$sdev^2)
# [1] 0.2836039 0.2661107 0.2445712 0.2057142
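If you would rather check this programmatically than read it off the printed summary, something along these lines should work. It is a sketch under the same setup as the example, with `pve_manual` and `pve_summary` being names I made up; note that summary() rounds the proportions, so the comparison allows a small difference:

```r
set.seed(45)
# Same kind of data as in the example: four independent N(2, 10^2) columns
m <- matrix(rnorm(800, mean = 2, sd = 10), ncol = 4)
pca <- prcomp(m)

eig <- pca$sdev^2               # eigenvalues
pve_manual <- eig / sum(eig)    # proportion of variance, computed by hand

# The same row as printed by summary(pca), extracted as a named vector
pve_summary <- summary(pca)$importance["Proportion of Variance", ]

# Differences should be tiny (only the rounding done by summary())
print(max(abs(pve_manual - pve_summary)))

# Without scaling, the eigenvalues are those of the covariance matrix
print(all.equal(eig, eigen(cov(m))$values))
```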