SHAP - instances that have more than one dimension

opp*_*ric 2 python machine-learning shap

I'm new to SHAP and wanted to try it out, but I've run into some difficulty.

The model is already trained and seems to perform well. I then used the training data to test SHAP. It looks like this:

   var_Braeburn  var_Cripps Pink  var_Dazzle  var_Fuji  var_Granny Smith  \
0             1                0           0         0                 0   
1             0                1           0         0                 0   
2             0                1           0         0                 0   
3             0                1           0         0                 0   
4             0                1           0         0                 0   

   var_Other Variety  var_Royal Gala (Tenroy)  root_CG202  root_M793  \
0                  0                        0           0          0   
1                  0                        0           1          0   
2                  0                        0           1          0   
3                  0                        0           0          0   
4                  0                        0           0          0   

   root_MM106  ...  frt_BioRich Organic Compost_single  \
0           1  ...                                   0   
1           0  ...                                   0   
2           0  ...                                   0   
3           1  ...                                   0   
4           1  ...                                   0   

   frt_Biomin Boron_single  frt_Biomin Zinc_single  \
0                        0                       1   
1                        0                       0   
2                        0                       0   
3                        0                       0   
4                        0                       0   

   frt_Fertco Brimstone90 sulphur_single  frt_Fertco Guano _single  \
0                                      0                         0   
1                                      0                         0   
2                                      0                         0   
3                                      0                         0   
4                                      0                         0   

   frt_Gro Mn_multiple  frt_Gro Mn_single  frt_Organic Mag Super_multiple  \
0                    0                  0                               0   
1                    1                  0                               1   
2                    1                  0                               1   
3                    1                  0                               1   
4                    1                  0                               1   

   frt_Organic Mag Super_single  frt_Other Fertiliser  
0                             0                     0  
1                             0                     0  
2                             0                     0  
3                             0                     0  
4                             0                     0 

Then I run explainer = shap.Explainer(model) and shap_values = explainer(X_train).

This runs without error, and shap_values gives me this:

.values =
array([[[ 0.00775555, -0.00775555],
        [-0.03221035,  0.03221035],
        [-0.0027203 ,  0.0027203 ],
        ...,
        [ 0.00259787, -0.00259787],
        [-0.00459262,  0.00459262],
        [-0.0303394 ,  0.0303394 ]],

       [[-0.00068313,  0.00068313],
        [-0.03006355,  0.03006355],
        [-0.00245706,  0.00245706],
        ...,
        [-0.00418809,  0.00418809],
        [-0.00088372,  0.00088372],
        [-0.00030019,  0.00030019]],

       [[-0.00068313,  0.00068313],
        [-0.03006355,  0.03006355],
        [-0.00245706,  0.00245706],
        ...,
        [-0.00418809,  0.00418809],
        [-0.00088372,  0.00088372],
        [-0.00030019,  0.00030019]],

       ...,

However, when I run shap.plots.beeswarm(shap_values), I get the following error:

ValueError: The beeswarm plot does not support plotting explanations with instances that have more than one dimension!

What am I doing wrong here?

Ser*_*nov 6

Try this:

from shap import Explainer
from shap.plots import beeswarm
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer

# Binary classification dataset, loaded as a DataFrame so feature names are kept
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier().fit(X, y)

# Compute SHAP values for every row of X
explainer = Explainer(model)
sv = explainer(X)

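Then, since RF is a bit special (for a classifier the SHAP values come back per class, indexed as sample, feature, class, as in the array shown in the question), retrieve the SHAP values for class 1 only: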

beeswarm(sv[:,:,1])
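To see why the slice removes the error, it may help to look at the shapes alone. The sketch below does not call SHAP at all: it builds a synthetic NumPy array (the names `class1` and `values` and the 5 x 4 size are made up for illustration) that mimics the layout of the `.values` array in the question, where each feature carries one contribution per class and the two classes mirror each other.

```python
import numpy as np

# Synthetic stand-in for shap_values.values from a binary classifier.
# Assumed layout: (n_samples, n_features, n_classes), as in the question's output.
rng = np.random.default_rng(0)
class1 = rng.normal(scale=0.01, size=(5, 4))   # hypothetical class-1 contributions
values = np.stack([-class1, class1], axis=-1)  # class 0 mirrors class 1

print(values.shape)    # (5, 4, 2)
print(values[0].ndim)  # 2: each instance is a matrix, not a vector

# Selecting a single class leaves one vector of feature contributions per instance.
class1_values = values[:, :, 1]
print(class1_values.shape)    # (5, 4)
print(class1_values[0].ndim)  # 1
```

With the class axis still present, every instance is two-dimensional, which is what the error message complains about. After indexing with `[:, :, 1]`, each instance is a plain vector of per-feature values, which is the shape beeswarm expects. Because the two classes mirror each other here, plotting class 0 instead would only flip the sign of every point.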
