PLS-DA loading plot in Python

sim*_*nes 7 python matplotlib scikit-learn

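How can I make a loading plot with Matplotlib for a PLS-DA model, like the loading plot for PCA?

This answer explains how it can be done for PCA: Plot PCA loadings and loading in biplot in sklearn (like R's autoplot)

However, there are some significant differences between the two methods, which make the implementation different as well. (Some of the relevant differences are explained here: https://learnche.org/pid/latent-variable-modelling/projection-to-latent-structures/interpreting-pls-scores-and-loadings)

To make the PLS-DA score plot, I use the following code: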

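from sklearn.preprocessing import StandardScaler
from sklearn.cross_decomposition import PLSRegression
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# df: samples x features DataFrame; sample_description: one class label
# (0 or 1) per sample. Both come from my own data and are not shown here.
targets = [0, 1]

# Autoscale the features here, so PLSRegression is told not to scale again
x_vals = StandardScaler().fit_transform(df.values)

# Binary response: 1 for the first class, 0 for the other
y = [g == targets[0] for g in sample_description]
y = np.array(y, dtype=int)

plsr = PLSRegression(n_components=2, scale=False)
plsr.fit(x_vals, y)

colormap = {
    targets[0]: '#ff0000',  # Red
    targets[1]: '#0000ff',  # Blue
}

colorlist = [colormap[c] for c in sample_description]

scores = pd.DataFrame(plsr.x_scores_)
scores.index = df.index

x_loadings = plsr.x_loadings_
y_loadings = plsr.y_loadings_

# Score plot on the first two latent variables
fig1, ax = plt.subplots()
ax = scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
                 c=colorlist, ax=ax)
# Label the axes after plotting so pandas' default labels do not replace them
ax.set_xlabel('Scores on LV 1')
ax.set_ylabel('Scores on LV 2')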

Ggj*_*j11 1

I took your code and enhanced it. The biplot is obtained by simply overlaying the score plot and the loading plot. Other, more rigorous plots could be made with truly shared axes, as described in https://blogs.sas.com/content/iml/2019/11/06/what-are-biplots.html

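The code below produces this image for a dataset with about 200 features (hence about 200 red arrows are shown):

[Figure: biplot with overlaid axes. The axes of the loading plot are hidden and are not on the same scale as the axes of the score plot.]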

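import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.cross_decomposition import PLSRegression

# X_train, Y_train: my own training data (Y_train is a DataFrame)
pls2 = PLSRegression(n_components=2)
pls2.fit(X_train, Y_train)

x_loadings = pls2.x_loadings_
y_loadings = pls2.y_loadings_

# constrained_layout is left out: it repositions `ax` at draw time, so
# axes created from ax.get_position() could end up misaligned
fig, ax = plt.subplots()

# Layer 1: score plot, coloured by the first response column
scores = pd.DataFrame(pls2.x_scores_)
scores.plot(x=0, y=1, kind='scatter', s=50, alpha=0.7,
            c=Y_train.values[:, 0], ax=ax)

# Layer 2: transparent axes on top of the score plot for the loadings
newax = fig.add_axes(ax.get_position(), frameon=False)
feature_n = x_loadings.shape[0]
print(x_loadings.shape)  # (n_features, n_components)
comp_1_idx = 0
comp_2_idx = 1
for feature_i in range(feature_n):
    newax.arrow(0, 0, x_loadings[feature_i, comp_1_idx],
                x_loadings[feature_i, comp_2_idx], color='r', alpha=0.5)
# Symmetric limits keep (0, 0) centred and stop arrows from being clipped
lim = 1.1 * np.abs(x_loadings[:, [comp_1_idx, comp_2_idx]]).max()
newax.set_xlim(-lim, lim)
newax.set_ylim(-lim, lim)
newax.get_xaxis().set_visible(False)
newax.get_yaxis().set_visible(False)

plt.show()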