Classification report using YellowBrick

Question


gen*_*dry 0 python machine-learning neural-network yellowbrick

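I recently implemented a probabilistic neural network on the iris dataset. I tried to print a classification report with the YellowBrick classifier visualizer, but running the code below raises an error.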

from neupy import algorithms
model = algorithms.PNN(std=0.1, verbose=True, batch_size = 500)
model.train(X_train, Y_train)
predictions = model.predict(X_test)


from yellowbrick.classifier import ClassificationReport
visualizer = ClassificationReport(model, support=True)

visualizer.fit(X_train, Y_train)  # Fit the visualizer and the model
visualizer.score(X_test, Y_test)  # Evaluate the model on the test data
visualizer.show()


This code raises the following error:

YellowbrickTypeError: This estimator is not a classifier; try a regression or clustering score visualizer instead!


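The same classification report code works when I try it with other classification models. I don't understand why this happens. Can anyone help me solve this?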

Answer 1

bbe*_*ort 5

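Yellowbrick is designed to work with scikit-learn and uses sklearn's type checking system to detect whether a model fits a particular class of machine learning problem. If the neupy PNN model implements the scikit-learn estimator API (e.g. fit() and predict()), you can use the model directly and bypass the type check with the force_model=True parameter, as follows: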

visualizer = ClassificationReport(model, support=True, force_model=True)


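However, after a quick look at the neupy documentation, it appears this will not necessarily work, because the neupy method is named train rather than fit, and because the PNN model neither implements a score() method nor supports learned parameters with a trailing _ suffix.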
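The gaps described above can be checked before handing a model to a visualizer. The following is only a minimal sketch, not code from Yellowbrick, scikit-learn, or neupy: `TrainOnlyModel` is a hypothetical stand-in for a model that exposes train() instead of fit(), and `missing_sklearn_members` and `EXPECTED_MEMBERS` are illustrative names invented for this example.

```python
# Members a scikit-learn-style classifier is conventionally expected to expose.
# This particular list is an assumption chosen for illustration.
EXPECTED_MEMBERS = ("fit", "predict", "score", "classes_")


class TrainOnlyModel:
    """Hypothetical stand-in for a model that exposes train() instead of fit()."""

    def train(self, X, y):
        # Stored without the trailing underscore that sklearn conventions use.
        self.classes = sorted(set(y))
        return self

    def predict(self, X):
        # Trivial prediction: always return the first known class.
        return [self.classes[0] for _ in X]


def missing_sklearn_members(model, expected=EXPECTED_MEMBERS):
    """Return the expected member names that the model does not expose."""
    return [name for name in expected if not hasattr(model, name)]


model = TrainOnlyModel()
model.train([[0.0], [1.0]], ["a", "b"])
print(missing_sklearn_members(model))  # prints ['fit', 'score', 'classes_']
```

Anything reported as missing has to be supplied by a wrapper before a visualizer can fit, score, or label the model.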

The solution is to create a lightweight wrapper around the PNN model that exposes it as an sklearn estimator. Testing on a Yellowbrick dataset, this appears to work:

from sklearn import metrics
from neupy import algorithms
from sklearn.base import BaseEstimator
from yellowbrick.datasets import load_occupancy
from yellowbrick.classifier import ClassificationReport
from sklearn.model_selection import train_test_split


class PNNWrapper(algorithms.PNN, BaseEstimator):
    """
    The PNN wrapper implements BaseEstimator and allows the classification
    report to score the model and understand the learned classes.
    """

    @property
    def classes_(self):
        return self.classes

    def score(self, X_test, y_test):
        y_hat = self.predict(X_test)
        return metrics.accuracy_score(y_test, y_hat)


# Load the binary classification dataset 
X, y = load_occupancy()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Create and train the PNN model using the sklearn wrapper
model = PNNWrapper(std=0.1, verbose=True, batch_size=500)
model.train(X_train, y_train)

# Create the classification report
viz = ClassificationReport(
    model, 
    support=True, 
    classes=["not occupied", "occupied"], 
    is_fitted=True, 
    force_model=True, 
    title="PNN"
)

# Score the report and show it
viz.score(X_test, y_test)
viz.show()

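The wrapper above relies on inheritance. An alternative is composition, where an adapter object holds the model and forwards calls to it. The sketch below illustrates that idea using only the standard library. `SklearnStyleAdapter` and `NearestMeanStub` are hypothetical names, the stub is a toy one-dimensional model standing in for PNN, and the code has not been tested against Yellowbrick, so force_model=True may still be required.

```python
class SklearnStyleAdapter:
    """Wrap a train()/predict() model behind fit(), predict(), score(), classes_."""

    def __init__(self, model):
        self.model = model

    def fit(self, X, y):
        # Delegate to the wrapped model's train() method.
        self.model.train(X, y)
        # The trailing underscore marks an attribute learned during fitting.
        self.classes_ = sorted(set(y))
        return self

    def predict(self, X):
        return self.model.predict(X)

    def score(self, X, y):
        """Mean accuracy: the fraction of predictions that match the labels."""
        predictions = self.predict(X)
        correct = sum(1 for p, t in zip(predictions, y) if p == t)
        return correct / len(y)


class NearestMeanStub:
    """Hypothetical 1-D model standing in for PNN; exposes train(), not fit()."""

    def train(self, X, y):
        sums, counts = {}, {}
        for (x,), label in zip(X, y):
            sums[label] = sums.get(label, 0.0) + x
            counts[label] = counts.get(label, 0) + 1
        # Per-class mean of the single feature.
        self.means = {label: sums[label] / counts[label] for label in sums}

    def predict(self, X):
        # Assign each sample to the class whose mean is closest.
        return [min(self.means, key=lambda c: abs(self.means[c] - x)) for (x,) in X]


adapter = SklearnStyleAdapter(NearestMeanStub()).fit(
    [[0.0], [0.2], [1.0], [1.2]], [0, 0, 1, 1]
)
print(adapter.classes_)                       # [0, 1]
print(adapter.score([[0.1], [1.1]], [0, 1]))  # 1.0
```

Composition keeps the adapter independent of the wrapped library's class hierarchy, at the cost of forwarding each method by hand.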

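Although Yellowbrick does not currently support neupy, if you are interested, it may be worth submitting an issue suggesting that neupy be added to contrib, similar to how statsmodels is implemented in Yellowbrick.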

Archived:	6 years, 7 months ago
Views:	1054
Last activity:	6 years, 7 months ago