Python sklearn NotFittedError after calling XGBoost .fit

sbs*_*202 5 python machine-learning scikit-learn

I am trying to use sklearn's plot_partial_dependence function on a fitted XGBoost model, i.e. after calling .fit, but I keep getting this error:

NotFittedError: This XGBRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

Below are the steps I took, shown as a complete example with a dummy dataset:

import numpy as np
# dummy dataset
from sklearn.datasets import make_regression
X_train, y_train = make_regression(n_samples = 1000, n_features = 10)


# Import xgboost
import xgboost as xgb

# Initialize the model 
model_xgb_1 = xgb.XGBRegressor(max_depth = 5, 
                               learning_rate = 0.01, 
                               n_estimators = 100, 
                               objective = 'reg:squarederror', 
                               booster = 'gbtree') 

# Fit the model 
# Not assigning to a new variable 
model_xgb_1.fit(X_train, y_train)

# Just to check that .predict can be called and works
# without error 
print(np.sum(model_xgb_1.predict(X_train)))
# the above works ok and prints the output

#This next step throws an error:
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])

Output:

662.3468

NotFittedError: This XGBRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.

Update

Workaround for the booster = 'gblinear' case:

# CHANGE 1/2: Use booster = 'gblinear'
# as no coef are returned for the case of 'gbtree' 
model_xgb_1 = xgb.XGBRegressor(max_depth = 5, 
                               learning_rate = 0.01, 
                               n_estimators = 100, 
                               objective = 'reg:squarederror', 
                               booster = 'gblinear') 

# Fit the model 
# Not assigning to a new variable 
model_xgb_1.fit(X_train, y_train)

# Just to check that .predict can be called and works
# without error 
print(np.sum(model_xgb_1.predict(X_train)))
# the above works ok and prints the output


#This next step throws an error:
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])

# CHANGE 2/2
# Add the following:
model_xgb_1.coef__ = model_xgb_1.coef_
model_xgb_1.intercept__ = model_xgb_1.intercept_

# Now call plot_partial_dependence --- It works ok
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])
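A possible explanation for why setting coef__ and intercept__ helps: in recent scikit-learn versions, check_is_fitted (when called without an explicit attribute list) treats an estimator as fitted if it defines __sklearn_is_fitted__ or has at least one instance attribute whose name ends with an underscore. That the XGBRegressor wrapper in use here stores no such attribute for 'gbtree' is an assumption on my part, not something confirmed above. The sketch below reproduces the effect with a hypothetical ToyModel instead of XGBoost:

```python
from sklearn.base import BaseEstimator
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


class ToyModel(BaseEstimator):
    """Hypothetical estimator that keeps its fitted state in `_state`,
    i.e. a name WITHOUT a trailing underscore."""

    def fit(self, X, y):
        self._state = sum(y) / len(y)  # mean of the targets
        return self

    def predict(self, X):
        return [self._state for _ in X]


def looks_fitted(estimator):
    """Return True if sklearn's check_is_fitted accepts the estimator."""
    try:
        check_is_fitted(estimator)
        return True
    except NotFittedError:
        return False


model = ToyModel().fit([[0], [1]], [1.0, 3.0])

# predict works, yet sklearn does not see any fitted attribute
before = looks_fitted(model)

# Adding any attribute that ends with "_" satisfies the check
model.dummy_ = True
after = looks_fitted(model)

print(before, after)
```

If this is indeed the cause, the names coef__ and intercept__ are not special; any instance attribute ending in an underscore would have the same effect. Newer xgboost releases may handle the fitted check themselves, so upgrading is worth trying first.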

Ska*_* HR 0

To avoid this error, do not assign the fitted model to a variable.

# Import xgboost
import xgboost as xgb

# Initialize the model 
model_xgb_1 = xgb.XGBRegressor(max_depth = 5, 
                               learning_rate = 0.01, 
                               n_estimators = 100, 
                               objective = 'reg:squarederror', 
                               booster = 'gbtree') 

# Fit the model 
model_xgb_1.fit(X_train, y_train)

# Just to check that .predict can be called and works
# without error 
model_xgb_1.predict(X_train)
# the above works ok and prints the output

# Then call plot_partial_dependence on the fitted model:
from sklearn.inspection import plot_partial_dependence
plot_partial_dependence(model_xgb_1, X_train, [0])
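Since .predict works on the fitted model in every snippet above, another option is to compute the partial dependence of one feature by hand and skip sklearn's fitted check entirely. This is only a minimal sketch: manual_partial_dependence and the stand-in linear_predict are hypothetical names, and the grid here spans min to max of the feature, which is not the same grid sklearn builds by default.

```python
import numpy as np


def manual_partial_dependence(predict, X, feature, grid_resolution=20):
    """Average prediction as one feature is swept over a grid.

    predict : callable taking a 2-D array and returning a 1-D array
    feature : column index of the feature to vary
    """
    X = np.asarray(X, dtype=float)
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_resolution)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # force the feature to this grid value
        averages.append(np.mean(predict(X_mod)))
    return grid, np.array(averages)


# Stand-in for a fitted model's predict method (assumption for the demo)
def linear_predict(X):
    return 2.0 * X[:, 0] + X[:, 1]


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))

grid, pd_values = manual_partial_dependence(linear_predict, X_demo, feature=0)
```

With the model from the question, the call would look like manual_partial_dependence(model_xgb_1.predict, X_train, feature=0), and the returned grid and averages can be passed to any plotting library.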