XGBRegressor: changing random_state has no effect

Lin*_*gxB 2 python-3.x xgboost

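xgboost.XGBRegressor seems to produce the same results even when it is given a new random seed.

According to the xgboost documentation for xgboost.XGBRegressor:

seed : int Random number seed. (Deprecated, please use random_state)

random_state : int Random number seed. (replaces seed)

random_state is the one to use; however, no matter which random_state or seed I pass, the model produces the same results. Is this a bug?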

from xgboost import XGBRegressor
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; needs an older version
import numpy as np
from itertools import product

def xgb_train_predict(random_state=0, seed=None):
    # Fit on the full Boston data and predict on the same rows.
    X, y = load_boston(return_X_y=True)
    xgb = XGBRegressor(random_state=random_state, seed=seed)
    xgb.fit(X, y)
    y_ = xgb.predict(X)
    return y_

# Baseline predictions with the defaults random_state=0, seed=None.
check = xgb_train_predict()

random_state = [1, 42, 58, 69, 72]
seed = [None, 2, 24, 85, 96]

# The assert passes for every (random_state, seed) pair, i.e. predictions never change.
for r, s in product(random_state, seed):
    y_ = xgb_train_predict(r, s)
    assert np.equal(y_, check).all()
    print('CHECK! \t random_state: {} \t seed: {}'.format(r, s))

[Out]:
    CHECK!   random_state: 1     seed: None
    CHECK!   random_state: 1     seed: 2
    CHECK!   random_state: 1     seed: 24
    CHECK!   random_state: 1     seed: 85
    CHECK!   random_state: 1     seed: 96
    CHECK!   random_state: 42    seed: None
    CHECK!   random_state: 42    seed: 2
    CHECK!   random_state: 42    seed: 24
    CHECK!   random_state: 42    seed: 85
    CHECK!   random_state: 42    seed: 96
    CHECK!   random_state: 58    seed: None
    CHECK!   random_state: 58    seed: 2
    CHECK!   random_state: 58    seed: 24
    CHECK!   random_state: 58    seed: 85
    CHECK!   random_state: 58    seed: 96
    CHECK!   random_state: 69    seed: None
    CHECK!   random_state: 69    seed: 2
    CHECK!   random_state: 69    seed: 24
    CHECK!   random_state: 69    seed: 85
    CHECK!   random_state: 69    seed: 96
    CHECK!   random_state: 72    seed: None
    CHECK!   random_state: 72    seed: 2
    CHECK!   random_state: 72    seed: 24
    CHECK!   random_state: 72    seed: 85
    CHECK!   random_state: 72    seed: 96

Myk*_*vyi 5

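It seems (I didn't know this myself before I started digging for an answer :)) that xgboost uses the random generator only for subsampling; see this comment by Laurae on a similar GitHub issue. Otherwise the behavior is deterministic.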

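If you do use sampling, there is an issue with how the current sklearn API in xgboost handles seed/random_state. seed is indeed documented as deprecated, but it seems that if you provide it, it is still used in preference to random_state, as can be seen in the code. This remark only matters when seed is not None.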
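To make the precedence concrete, here is a rough sketch of the behavior described above. `resolve_seed` is a hypothetical helper written for illustration only; it is not xgboost's actual source and the warning text is made up:

```python
import warnings


def resolve_seed(random_state=0, seed=None):
    """Hypothetical helper: return the seed that would effectively be used."""
    if seed is not None:
        # The deprecated argument still wins when it is supplied.
        warnings.warn("seed is deprecated; use random_state instead",
                      DeprecationWarning)
        return seed
    # Only when seed is None does random_state take effect.
    return random_state


# Capture the warning so the precedence and the deprecation are both visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    effective = resolve_seed(random_state=42, seed=7)

print(effective, len(caught))  # prints: 7 1
```

So with subsampling enabled, passing both arguments means that varying `random_state` alone would change nothing as long as `seed` stays fixed.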