SHAP exception: Additivity check failed in TreeExplainer

bho*_*sad 5 python machine-learning shap

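I am trying to compute SHAP values for a single row, as a local explanation, but I keep getting this error. I have tried several approaches and still cannot fix it.

What I have done so far:

Created a randomized decision tree model: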

from sklearn.ensemble import ExtraTreesRegressor
extra_tree = ExtraTreesRegressor(random_state=42)
extra_tree.fit(X_train, y_train)

Then I tried to compute the SHAP values:

import pandas as pd
import shap

# create an explainer object
explainer = shap.Explainer(extra_tree)
explainer.expected_value
# array([15981.25812347])

# calculate SHAP values for a single row
shap_values = explainer.shap_values(pd.DataFrame(X_train.iloc[9274]).T)

This gives me the following error:

Exception: Additivity check failed in TreeExplainer! Please ensure the data matrix you passed to the explainer is the same shape that the model was trained on. If your data shape is correct then please report this on GitHub. Consider retrying with the feature_perturbation='interventional' option. This check failed because for one of the samples the sum of the SHAP values was 25687017588058.968750, while the model output was 106205.580000. If this difference is acceptable you can set check_additivity=False to disable this check.
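
For readers unfamiliar with the check named in this message: SHAP values for a row are expected to add up, together with the explainer's expected value, to the model's output for that row. The sketch below illustrates that comparison in plain Python. The function names and the `rel_tol` threshold are hypothetical and chosen for illustration; this is not shap's internal implementation.

```python
import math


def additivity_gap(base_value, shap_row, model_output):
    """Absolute gap between (base_value + sum of SHAP values) and the model output."""
    reconstructed = base_value + math.fsum(shap_row)
    return abs(reconstructed - model_output)


def passes_additivity(base_value, shap_row, model_output, rel_tol=1e-2):
    # rel_tol is an illustrative threshold, not necessarily the one shap applies
    reconstructed = base_value + math.fsum(shap_row)
    return math.isclose(reconstructed, model_output, rel_tol=rel_tol)


# Toy numbers, chosen only to show a row that satisfies the property
base = 100.0
phi = [12.5, -4.0, 1.5]
prediction = 110.0
gap = additivity_gap(base, phi, prediction)
ok = passes_additivity(base, phi, prediction)

# The two figures reported in the error message
reported_sum = 25687017588058.968750
reported_output = 106205.580000
reported_ok = math.isclose(reported_sum, reported_output, rel_tol=1e-2)
```

With the figures from the message, the two sides differ by many orders of magnitude, so the failure here is not a small rounding discrepancy.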

The training data and the single row I passed have the same number of columns:

X_train.shape
(421570, 164)
(pd.DataFrame(X_train.iloc[9274]).T).shape
(1, 164)
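
Matching shapes do not necessarily mean matching dtypes. The sketch below uses a small hypothetical frame (not the data from this question) to show that building a row via `pd.DataFrame(series).T` can change the column dtypes, while `iloc[[i]]` keeps them. Whether `X_train` actually has mixed dtypes, and whether this is related to the error, is not established here.

```python
import pandas as pd

# Hypothetical frame with mixed column dtypes (illustration only)
df = pd.DataFrame(
    {
        "store": [1, 2],
        "temperature": [42.3, 38.5],
        "is_holiday": [False, True],
    }
)

# Approach used in the question: Series -> DataFrame -> transpose
row_transposed = pd.DataFrame(df.iloc[0]).T

# Alternative: index with a list of positions to get a one-row DataFrame
row_selected = df.iloc[[0]]

same_shape = row_transposed.shape == row_selected.shape
transposed_dtypes = row_transposed.dtypes.astype(str).tolist()
selected_dtypes = row_selected.dtypes.astype(str).tolist()

print(same_shape)
print(transposed_dtypes)
print(selected_dtypes)
```

Comparing `X_train.dtypes` with the dtypes of the row being explained is a cheap way to rule this out.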

I did not expect this to cause any problem. To be sure, I also tried reshaping the row to get the correct shape:

shap_values = explainer.shap_values(X_train.iloc[9274].values.reshape(1, -1))

X_train.iloc[9274].values.reshape(1, -1).shape
(1, 164)

That did not solve the problem either. So I thought that perhaps the number of rows also needed to match, and I created a small DataFrame to test this:

train = pd.concat([X_train, y_train], axis="columns")
train_small = train.sample(n=500, random_state=42)
X_train_small = train_small.drop("Weekly_Sales", axis=1).copy()
y_train_small = train_small["Weekly_Sales"].copy()

# train a randomized decision tree model
from sklearn.ensemble import ExtraTreesRegressor
extra_tree_small = ExtraTreesRegressor(random_state=42)
extra_tree_small.fit(X_train_small, y_train_small)

# create an explainer object
explainer = shap.Explainer(extra_tree_small)
shap_values = explainer.shap_values(X_train_small)

# I also tried to add the y value like this 
shap_values = explainer.shap_values(X_train_small, y_train_small)

But nothing worked.

Someone on GitHub suggested uninstalling shap and reinstalling the latest version from GitHub:

pip install git+https://github.com/slundberg/shap.git

I tried that as well, and it still does not work.

How can I fix this?

Abh*_*pat 0

Try calling the explainer directly:

explainer = shap.Explainer(model)
shap_values = explainer(X)

Here X is your row.