Scikit-learn RandomForestClassifier does not evaluate to True

sap*_*ico 3 python machine-learning scikit-learn

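A curious edge-case behavior. In this example, KNN exists is printed, but Random Forest exists is not.

I found it while checking whether a model exists: the body of if model: ... was not executed when the model was a random forest.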

from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

if KNeighborsClassifier(4):
    print('KNN exists')

if RandomForestClassifier(n_estimators=10, max_depth=4):
    print('Random Forest exists')

Why does this happen?

jua*_*aga 5

Aha! This is because RandomForestClassifier implements __len__ (inherited from BaseEnsemble):

In [1]: from sklearn.ensemble import RandomForestClassifier
   ...: from sklearn.neighbors import KNeighborsClassifier
   ...:

In [2]: knn =  KNeighborsClassifier(4)

In [3]: forest = RandomForestClassifier(n_estimators=10, max_depth=4)

In [4]: knn.__bool__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-ef1cfe16be77> in <module>()
----> 1 knn.__bool__

AttributeError: 'KNeighborsClassifier' object has no attribute '__bool__'

In [5]: knn.__len__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-dc98bf8c50e0> in <module>()
----> 1 knn.__len__

AttributeError: 'KNeighborsClassifier' object has no attribute '__len__'

In [6]: forest.__bool__
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-fbdd7f01e843> in <module>()
----> 1 forest.__bool__

AttributeError: 'RandomForestClassifier' object has no attribute '__bool__'

In [7]: forest.__len__
Out[7]:
<bound method BaseEnsemble.__len__ of RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=4, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)>

In [8]: len(forest)
Out[8]: 0

And, according to the Python data model:

object.__bool__(self)

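Called to implement truth value testing and the built-in operation bool(); should return False or True. When this method is not defined, __len__() is called, if it is defined, and the object is considered true if its result is nonzero. If a class defines neither __len__() nor __bool__(), all its instances are considered true.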
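To make the fallback rule concrete, here is a minimal sketch that uses plain Python classes instead of scikit-learn. The class names (NoLenNoBool, OnlyLen, LenAndBool) are hypothetical and exist only for illustration.

```python
class NoLenNoBool:
    """Defines neither __bool__ nor __len__: instances are always truthy."""


class OnlyLen:
    """Defines only __len__: bool() falls back to len(obj) != 0."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class LenAndBool(OnlyLen):
    """Defines both: __bool__ takes precedence over __len__."""

    def __bool__(self):
        return True


plain_truthy = bool(NoLenNoBool())        # no __len__, no __bool__
empty_truthy = bool(OnlyLen())            # len() is 0
filled_truthy = bool(OnlyLen([1, 2]))     # len() is 2
explicit_truthy = bool(LenAndBool())      # len() is 0, but __bool__ wins

print(plain_truthy, empty_truthy, filled_truthy, explicit_truthy)
```

The OnlyLen case corresponds to the unfitted forest in the session above: it has a __len__ that returns 0, so it is falsy even though the object clearly exists.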

As one might expect, the len of a RandomForestClassifier is the number of estimators, but only after .fit:

In [9]: from sklearn.datasets import make_classification
   ...: X, y = make_classification(n_samples=1000, n_features=4,
   ...:             n_informative=2, n_redundant=0,
   ...:             random_state=0, shuffle=False)
   ...:

In [10]: X.shape
Out[10]: (1000, 4)

In [11]: y.shape
Out[11]: (1000,)

In [12]: forest.fit(X,y)
Out[12]:
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=4, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)

In [13]: len(forest)
Out[13]: 10
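If the goal is only to check that a model object exists, comparing against None avoids the truthiness pitfall altogether. The sketch below uses a hypothetical stand-in class, FakeEnsemble, that mimics the behavior shown in the session above (len is 0 before fitting and n_estimators afterwards); it is not scikit-learn code, and model_exists is likewise a made-up helper name.

```python
class FakeEnsemble:
    """Hypothetical stand-in: len() is the number of fitted estimators."""

    def __init__(self, n_estimators=10):
        self.n_estimators = n_estimators
        self.estimators_ = []  # empty until fit() is called

    def __len__(self):
        return len(self.estimators_)

    def fit(self):
        # Pretend to train: create one placeholder per estimator.
        self.estimators_ = [object() for _ in range(self.n_estimators)]
        return self


def model_exists(model):
    """Check identity against None instead of relying on truthiness."""
    return model is not None


model = FakeEnsemble(n_estimators=10)

truthy_before_fit = bool(model)          # falls back to __len__, which is 0
exists_before_fit = model_exists(model)  # the object is not None

model.fit()
truthy_after_fit = bool(model)           # __len__ is now 10
exists_after_fit = model_exists(model)

print(truthy_before_fit, exists_before_fit, truthy_after_fit, exists_after_fit)
```

With a real estimator the same idea applies: write if model is not None: rather than if model: whenever the class may define __len__.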