cd9*_*d98 21 python scikit-learn xgboost
I searched the sklearn docs for TimeSeriesSplit and the cross-validation docs, but I couldn't find a working example.
I'm using sklearn version 0.19.
Here is my setup:
import xgboost as xgb
from sklearn.model_selection import TimeSeriesSplit
from sklearn.grid_search import GridSearchCV
import numpy as np
X = np.array([[4, 5, 6, 1, 0, 2], [3.1, 3.5, 1.0, 2.1, 8.3, 1.1]]).T
y = np.array([1, 6, 7, 1, 2, 3])
tscv = TimeSeriesSplit(n_splits=2)
for train, test in tscv.split(X):
    print(train, test)
This prints:
[0 1] [2 3]
[0 1 2 3] [4 5]
If I then try:
model = xgb.XGBRegressor()
param_search = {'max_depth' : [3, 5]}
my_cv = TimeSeriesSplit(n_splits=2).split(X)
gsearch = GridSearchCV(estimator=model, cv=my_cv,
                       param_grid=param_search)
gsearch.fit(X, y)
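it raises: TypeError: object of type 'generator' has no len()
I understand the problem: GridSearchCV tries to call len(cv), but my_cv is an iterator with no length. However, the GridSearchCV docs state that cv can be
int, cross-validation generator or an iterable, optional
I also tried passing TimeSeriesSplit without .split(X), but it still didn't work.
I'm sure I'm overlooking something simple. Thanks!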
cd9*_*d98 21
It turns out the problem was that I was importing GridSearchCV from sklearn.grid_search, which is deprecated. Importing GridSearchCV from sklearn.model_selection resolved the problem:
import xgboost as xgb
from sklearn.model_selection import TimeSeriesSplit, GridSearchCV
import numpy as np
X = np.array([[4, 5, 6, 1, 0, 2], [3.1, 3.5, 1.0, 2.1, 8.3, 1.1]]).T
y = np.array([1, 6, 7, 1, 2, 3])
model = xgb.XGBRegressor()
param_search = {'max_depth' : [3, 5]}
tscv = TimeSeriesSplit(n_splits=2)
gsearch = GridSearchCV(estimator=model, cv=tscv,
                       param_grid=param_search)
gsearch.fit(X, y)
This gives:
GridSearchCV(cv=<generator object TimeSeriesSplit.split at 0x11ab4abf8>,
error_score='raise',
estimator=XGBRegressor(base_score=0.5, colsample_bylevel=1, colsample_bytree=1, gamma=0,
learning_rate=0.1, max_delta_step=0, max_depth=3,
min_child_weight=1, missing=None, n_estimators=100, nthread=-1,
objective='reg:linear', reg_alpha=0, reg_lambda=1,
scale_pos_weight=1, seed=0, silent=True, subsample=1),
fit_params=None, iid=True, n_jobs=1,
param_grid={'max_depth': [3, 5]}, pre_dispatch='2*n_jobs',
refit=True, return_train_score=True, scoring=None, verbose=0)
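As a follow-up, here is a minimal sketch of how the fitted search could be inspected, and of the two forms of cv mentioned in the question: the splitter object itself, and an iterable of (train, test) index pairs. It assumes a reasonably recent scikit-learn where GridSearchCV lives in sklearn.model_selection. DecisionTreeRegressor is only a stand-in so the snippet runs without xgboost installed; it is not part of the original setup, and xgb.XGBRegressor() should slot in the same way. The names search_a, search_b and splits are illustrative.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.tree import DecisionTreeRegressor  # stand-in for xgb.XGBRegressor (assumption)

# Same toy data as in the question.
X = np.array([[4, 5, 6, 1, 0, 2], [3.1, 3.5, 1.0, 2.1, 8.3, 1.1]]).T
y = np.array([1, 6, 7, 1, 2, 3])

tscv = TimeSeriesSplit(n_splits=2)
param_search = {'max_depth': [3, 5]}

# Option 1: pass the splitter object itself as cv.
search_a = GridSearchCV(estimator=DecisionTreeRegressor(random_state=0),
                        cv=tscv, param_grid=param_search)
search_a.fit(X, y)

# Option 2: materialise the (train, test) index pairs into a list first.
# A list can be iterated more than once, unlike a bare generator.
splits = list(tscv.split(X))
search_b = GridSearchCV(estimator=DecisionTreeRegressor(random_state=0),
                        cv=splits, param_grid=param_search)
search_b.fit(X, y)

# Inspect what the search found.
print(search_a.best_params_)                    # chosen max_depth
print(search_a.cv_results_['mean_test_score'])  # one mean score per candidate
print(search_b.n_splits_)                       # number of CV splits used
```

With only six samples the scores themselves are not meaningful; the point is just that both forms of cv are accepted and each candidate is evaluated on the same two time-ordered splits.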