小编Joe*_*Joe的帖子

生成 shap 值后使用 shap.plots.waterfall 时出现错误

对于下面给出的代码,如果我只使用命令,shap.plots.waterfall(shap_values[6])我会收到错误

“numpy.ndarray”对象没有属性“base_values”

我必须首先运行这两个命令:

explainer2 = shap.Explainer(clf.best_estimator_.predict, X_train)
shap_values = explainer2(X_train)
Run Code Online (Sandbox Code Playgroud)

然后运行waterfall命令以获得正确的绘图。下面是发生错误的示例:

from sklearn.datasets import make_classification
import seaborn as sns
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import pickle
import joblib
import warnings
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

f, (ax1,ax2) = plt.subplots(nrows=1, ncols=2,figsize=(20,8))
# Generate noisy Data
X_train,y_train = make_classification(n_samples=1000, 
                          n_features=50, 
                          n_informative=9, 
                          n_redundant=0, 
                          n_repeated=0, 
                          n_classes=10, 
                          n_clusters_per_class=1,
                          class_sep=9,
                          flip_y=0.2,
                          #weights=[0.5,0.5], 
                          random_state=17)

X_test,y_test = make_classification(n_samples=500, 
                          n_features=50, …
Run Code Online (Sandbox Code Playgroud)

waterfall python-3.x shap

8
推荐指数
1
解决办法
7222
查看次数

shap.TreeExplainer 和 shap.Explainer 条形图之间的区别

对于下面给出的代码,我得到了不同的值条形图shap

在此示例中,我有一个train包含 1000 个样本、9 个类别和 500 个test样本的数据集。然后,我使用随机森林作为分类器并生成模型。当我开始生成条形图时,shap我在这两种情况下得到不同的结果:

shap_values_Tree_tr = shap.TreeExplainer(clf.best_estimator_).shap_values(X_train)
shap.summary_plot(shap_values_Tree_tr, X_train)
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

进而:

explainer2 = shap.Explainer(clf.best_estimator_.predict, X_test)
shap_values = explainer2(X_test)
Run Code Online (Sandbox Code Playgroud)

在此输入图像描述

您能解释一下这两个图之间有什么区别以及使用哪一个吗feature importance

这是我的代码:

from sklearn.datasets import make_classification
import seaborn as sns
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import pickle
import joblib
import warnings
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, GridSearchCV

f, (ax1,ax2) = plt.subplots(nrows=1, ncols=2,figsize=(20,8))
# Generate noisy …
Run Code Online (Sandbox Code Playgroud)

feature-selection python-3.x random-forest shap

4
推荐指数
1
解决办法
8637
查看次数

Python 中多个类的混淆矩阵

我通过 python 代码生成了多类混淆矩阵:

import seaborn as sns
import matplotlib.pyplot as plt

### Confusion Matrix
from sklearn.metrics import confusion_matrix
predictions = model.predict(x_test, steps=len(x_test), verbose=0)
#y_pred=model.predict(x_test)
#y_pred = np.round(y_pred)
y_pred = np.argmax(predictions, axis=-1)

y_true=np.argmax(y_test, axis=-1)

cm = confusion_matrix(y_true, y_pred)

## Get Class Labels
labels = le.classes_
class_names = labels

# Plot confusion matrix in a beautiful manner
fig = plt.figure(figsize=(16, 14))
ax= plt.subplot()
sns.heatmap(cm, annot=True, ax = ax, fmt = 'g'); #annot=True to annotate cells
# labels, title and ticks
ax.set_xlabel('Predicted', fontsize=20) …
Run Code Online (Sandbox Code Playgroud)

python confusion-matrix

3
推荐指数
1
解决办法
2万
查看次数

从列表转换为元组时从文本文件中删除双引号

我有这些行列表:

lines = (
    f"Begin Test 1",
    f"Begin Test 2",
    f"Begin Test 3",
    f"Begin Test 4",
    f"Begin Test 5",
    f"Begin Test 6",
    )
Run Code Online (Sandbox Code Playgroud)

我只想从包含上面显示的行的第一个文件中获取前两行,并将其连接到另一组行,然后保存新的文本文件。

下面给出的代码使用文本编辑器查看时保存了带有双引号的前两行和不带任何引号的其余行。如何去掉前两行中的双引号并正确保存数据?

这是我当前的代码:

代码

import pandas as pd
import os
import contextlib

#
#%%#######################     SAVE SCORES TO CSV FILE     #####################

lines = (
    f"Begin Test 1",
    f"Begin Test 2",
    f"Begin Test 3",
    f"Begin Test 4",
    f"Begin Test 5",
    f"Begin Test 6",
    )

df = pd.DataFrame(lines)

txtfilename = f'Check1.txt'

# Remove csv file if it exists
with contextlib.suppress(FileNotFoundError): …
Run Code Online (Sandbox Code Playgroud)

tuples list python-3.x

0
推荐指数
1
解决办法
724
查看次数

如何根据列值对 pandas 行进行排序

在此数据框中:

Feat1   Feat2   Feat3   Feat4   Labels
-46.220314  22.862856   -6.1573067  5.6060414   2
-23.80669   20.536781   -5.015675   4.2216353   2
-42.092365  25.680704   -5.0092897  5.665794    2
-35.29639   21.709473   -4.160352   5.578346    2
-37.075096  22.347767   -3.860426   5.6953945   2
-42.8849    28.03802    -7.8572545  3.3361  2
-32.3057    26.568039   -9.47018    3.4532788   2
-24.469942  27.005375   -9.301921   4.3995037   2
-97.89892   -0.38156664 6.4163384   7.234347    1
-81.96325   0.1821717   -1.2870358  4.703838    1
-78.41986   -6.766374   0.8001185   0.83444935  1
-100.68544  -4.5810957  1.6977689   1.8801615   1
-87.05412   -2.9231584  6.817379    5.4460077   1
-64.121056  -3.7892206  -0.283514   6.3084154   1
-94.504845  -0.9999217  3.2884297   6.881124 …
Run Code Online (Sandbox Code Playgroud)

dataframe python-3.x pandas

0
推荐指数
1
解决办法
54
查看次数

检查单元格是Excel VBA中的数字

在下面的代码中,我正在对单元格进行数值计算.但有时,其中一个单元格(例如"LosLimit")可能包含非数字值,如"---".如何检查单元格是否为数字.

如果是数字,则进行计算,否则从"MeasValue"返回值

Sub ReturnMarginal()

    'UpdatebySUPERtoolsforExcel2016
    Dim xOut As Worksheet
    Dim xWb As Workbook
    Dim xWks As Worksheet
    Dim InterSectRange As Range
    Dim lowLimCol As Integer
    Dim hiLimCol As Integer
    Dim measCol As Integer

    Application.ScreenUpdating = False
    Set xWb = ActiveWorkbook
    For Each xWks In xWb.Sheets
        xRow = 1
        With xWks
           FindString = "LowLimit"
           If Not xWks.Rows(1).Find(FindString) Is Nothing Then
               .Cells(xRow, 16) = "Meas-LO"
               .Cells(xRow, 17) = "Meas-Hi"
               .Cells(xRow, 18) = "Min Value"
               .Cells(xRow, 19) = "Marginal"
               LastRow = .UsedRange.Rows.Count
               lowLimCol …
Run Code Online (Sandbox Code Playgroud)

excel vba excel-vba

-2
推荐指数
1
解决办法
1万
查看次数