What is the best way to test whether an sklearn model has been fitted?

Question

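What is the most elegant way to check whether an sklearn model has been fitted, i.e. whether its fit() function was called after it was instantiated?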

Answer 1

You can do something like this:

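from sklearn.exceptions import NotFittedError

# `models` is an iterable of estimators; `some_test_data` is valid input for them.
for model in models:
    try:
        model.predict(some_test_data)
    except NotFittedError as e:
        # predict() raises NotFittedError when fit() has not been called yet.
        print(repr(e))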

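Ideally you would compare the output of model.predict against expected results, but if all you want to know is whether every model has been fitted, this is enough.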
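Here is a self-contained sketch of the same idea wrapped in a helper that returns a boolean. The helper name `is_fitted_via_predict`, the toy data and the two estimators are my own choices for illustration, not part of the question.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up dataset, only here so the example runs on its own.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])


def is_fitted_via_predict(model, sample):
    """Return False if model.predict(sample) raises NotFittedError, True otherwise.

    Hypothetical helper: any other exception (e.g. malformed input) propagates.
    """
    try:
        model.predict(sample)
    except NotFittedError:
        return False
    return True


fitted = LogisticRegression().fit(X, y)
unfitted = DecisionTreeClassifier()  # fit() is never called on this one

results = [is_fitted_via_predict(m, X) for m in (fitted, unfitted)]
print(results)
```

Note that the helper needs a sample the model can actually predict on, which is the main cost of this approach.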

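Update:

Some commenters have suggested using check_is_fitted. I consider check_is_fitted an internal method. Most algorithms call check_is_fitted inside their predict method, which in turn raises NotFittedError when needed. The problem with using check_is_fitted directly is that it is model specific, i.e. you need to know which members to check depending on your algorithm. For example: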

+----------------+--------------------------------------------+
| Tree models    | check_is_fitted(self, 'tree_')             |
| Linear models  | check_is_fitted(self, 'coef_')             |
| KMeans         | check_is_fitted(self, 'cluster_centers_')  |
| SVM            | check_is_fitted(self, 'support_')          |
+----------------+--------------------------------------------+

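And so on. So in general I would recommend calling model.predict() and letting the specific algorithm decide the best way to check whether it has been fitted.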
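To illustrate how model specific these members are, the sketch below fits one estimator from each family and checks for its attribute before and after fit(). The toy data and the particular estimators are assumptions of mine. The attribute names are the ones I know these estimators set during fit() (for linear models that is `coef_`).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up dataset, only here so the example runs on its own.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

# Each estimator paired with an attribute that its fit() sets.
candidates = [
    (DecisionTreeClassifier(), "tree_"),
    (LinearRegression(), "coef_"),
    (KMeans(n_clusters=2, n_init=10, random_state=0), "cluster_centers_"),
    (SVC(), "support_"),
]

before = [hasattr(est, attr) for est, attr in candidates]
for est, _ in candidates:
    est.fit(X, y)  # KMeans accepts y and ignores it
after = [hasattr(est, attr) for est, attr in candidates]

print(before)
print(after)
```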

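This is not a good solution, because your `try` may fail because of a problem with `some_test_data`. For example, if `some_test_data = "duck"`, the model may be perfectly fine but the call will still fail, and you will then report the wrong error. Use [check_is_fitted](http://scikit-learn.org/stable/modules/generated/sklearn.utils.validation.check_is_fitted.html) instead. (2 upvotes)
@sapo_cosmico, the except clause only catches `NotFittedError`, so in that case the error would be reported correctly. Given that `.predict()` calls `check_is_fitted`, which then raises `NotFittedError`, I can't think of a case where the wrong error would be shown, but feel free to correct me if I'm missing something. (2 upvotes)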
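For what it's worth, a check that needs no test data at all can be sketched with `check_is_fitted` itself. This assumes scikit-learn 0.22 or later, where (as far as I know) the `attributes` argument is optional and the function looks for any fitted attribute ending in an underscore. Older releases require the attribute name explicitly. The helper name `is_fitted` and the toy data are mine.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LogisticRegression
from sklearn.utils.validation import check_is_fitted


def is_fitted(estimator):
    """Hypothetical helper: True if check_is_fitted passes, False otherwise.

    Assumes scikit-learn >= 0.22, where `attributes` may be omitted.
    """
    try:
        check_is_fitted(estimator)
    except NotFittedError:
        return False
    return True


# Tiny made-up dataset, only here so the example runs on its own.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression()
state_before = is_fitted(model)
model.fit(X, y)
state_after = is_fitted(model)
print(state_before, state_after)
```

Since no data is passed in, a malformed `some_test_data` cannot interfere with the result.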

Answer 2

O.r*_*rka 5

I do this for classifiers:

def check_fitted(clf):
    # Classifiers set classes_ during fit(), so its presence signals a fitted model.
    return hasattr(clf, "classes_")
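
A short usage sketch, with toy data and estimators picked by me for illustration. It also shows the limitation: a fitted regressor has no `classes_`, so this check really is for classifiers only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier


def check_fitted(clf):
    # Classifiers set classes_ during fit().
    return hasattr(clf, "classes_")


# Tiny made-up dataset, only here so the example runs on its own.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = DecisionTreeClassifier()
clf_before = check_fitted(clf)
clf.fit(X, y)
clf_after = check_fitted(clf)

# A fitted regressor still reports False because it never sets classes_.
reg_after = check_fitted(LinearRegression().fit(X, y))

print(clf_before, clf_after, reg_after)
```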
