Tag: scikit-optimize

How do I fix the "numpy.int" attribute error when using skopt.BayesSearchCV with scikit-learn?

When I run the following code from the official documentation, I get an error.

Minimal example

from skopt import BayesSearchCV
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

X, y = load_digits(n_class=10, return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75, test_size=.25, random_state=0)

# log-uniform: understand as search over p = exp(x) by varying x
opt = BayesSearchCV(
    SVC(),
    {
        'C': (1e-6, 1e+6, 'log-uniform'),
        'gamma': (1e-6, 1e+1, 'log-uniform'),
        'degree': (1, 8),  # integer valued parameter
        'kernel': ['linear', 'poly', 'rbf'],  # categorical parameter
    },
    n_iter=32,
    cv=3
)

opt.fit(X_train, y_train)

The last line produces the error:

AttributeError …
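A common cause (an assumption, since the traceback above is truncated): older scikit-optimize releases still reference the `np.int` alias, which NumPy deprecated in 1.20 and removed in 1.24. Upgrading scikit-optimize is the clean fix; as a stopgap, a monkeypatch sketch:

```python
import numpy as np

# Restore the alias that NumPy 1.24 removed, so older scikit-optimize
# code that still references np.int can import and run.
if not hasattr(np, "int"):
    np.int = int  # temporary shim; upgrading scikit-optimize is the real fix

# from skopt import BayesSearchCV  # now imports without the AttributeError
```

The shim must run before the first `skopt` import; it is a workaround, not a substitute for upgrading the library.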

python machine-learning scikit-learn scikit-optimize bayessearchcv

8 votes · 2 answers · 5553 views

XGBoost and scikit-optimize: BayesSearchCV and XGBRegressor are incompatible - why?

I have a very large dataset (7 million rows, 54 features) that I want to fit a regression model to using XGBoost. To train the best model, I want to use BayesSearchCV from scikit-optimize to run the fit repeatedly over different hyperparameter combinations until the best-performing set is found.

For a given set of hyperparameters, XGBoost takes a very long time to train a model, so in order to find the best hyperparameters without spending days on every permutation of training folds, hyperparameters, and so on, I want to multithread both XGBoost and BayesSearchCV at the same time. The relevant part of my code looks like this:

from sklearn.pipeline import Pipeline
from sklearn.model_selection import KFold
from xgboost import XGBRegressor
from skopt import BayesSearchCV

xgb_pipe = Pipeline([('clf', XGBRegressor(random_state=42, objective='reg:squarederror', n_jobs=1))])

xgb_fit_params = {'clf__early_stopping_rounds': 5, 'clf__eval_metric': 'mae', 'clf__eval_set': [[X_val.values, y_val.values]]}

# random_state only takes effect when shuffle=True
xgb_kfold = KFold(n_splits=5, shuffle=True, random_state=42)

xgb_unsm_cv = BayesSearchCV(xgb_pipe, xgb_params, cv=xgb_kfold, n_jobs=2, n_points=1, n_iter=15, random_state=42, verbose=4, scoring='neg_mean_absolute_error', fit_params=xgb_fit_params)

xgb_unsm_cv.fit(X_train.values, y_train.values)

However, I found that with n_jobs > 1, BayesSearchCV …
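One angle worth checking in setups like this: nested parallelism multiplies, because BayesSearchCV's `n_jobs` and XGBRegressor's `n_jobs` are independent worker pools. A quick budget check (`parallel_budget` is a hypothetical helper for illustration, not part of either library):

```python
import os

def parallel_budget(outer_jobs: int, inner_jobs: int) -> int:
    """Total concurrent workers spawned by an outer search running
    outer_jobs fits at once, each using inner_jobs threads."""
    return outer_jobs * inner_jobs

# With BayesSearchCV(n_jobs=2) wrapping XGBRegressor(n_jobs=1), at most
# 2 x 1 = 2 workers compete for the machine's cores at any moment.
print(parallel_budget(2, 1), "workers on", os.cpu_count() or 1, "cores")
```

Keeping the product at or below the core count avoids oversubscription, which is one common reason nested `n_jobs` settings stall or thrash.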

python multithreading xgboost scikit-optimize bayesian-deep-learning

5 votes · 1 answer · 173 views