How to access individual trees of an xgboost model in Python/R

ash*_*his 11 python r machine-learning scikit-learn xgboost

How can I access the individual trees of an xgboost model in Python/R?

Below is how I get the individual trees from a random forest in sklearn.

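from sklearn.ensemble import RandomForestRegressor

# training_data and training_target are assumed to be defined elsewhere.
# max_features=1.0 stands in for 'auto', which newer scikit-learn releases no longer accept.
estimator = RandomForestRegressor(oob_score=True, n_estimators=10, max_features=1.0)
estimator.fit(training_data, training_target)
tree1 = estimator.estimators_[0]  # first fitted tree in the forest
leftChild = tree1.tree_.children_left
rightChild = tree1.tree_.children_right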

pom*_*ber 6

Do you want to inspect the trees?

In Python, you can dump the trees as a list of strings:

import xgboost as xgb

# X and y are the training features and labels
m = xgb.XGBClassifier(max_depth=2, n_estimators=3).fit(X, y)
m.get_booster().get_dump()

Output:

['0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<18.0417] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0965415\n\t\t4:leaf=-0.0679503\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0992546\n\t\t6:leaf=-0.0984374\n',
 '0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<16.8917] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0928132\n\t\t4:leaf=-0.0676056\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0945284\n\t\t6:leaf=-0.0937463\n',
 '0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<18.175] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0878571\n\t\t4:leaf=-0.0610089\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0904395\n\t\t6:leaf=-0.0896808\n']
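Each string in that list is one tree, with one node per line and tabs marking depth. As a rough sketch of how such a string could be turned into something easier to work with, the snippet below parses the first tree shown above into a dict keyed by node id. `parse_tree` and the two regular expressions are hypothetical helpers, not part of xgboost, and they assume the default dump format shown here with numeric `feature<threshold` splits.

```python
import re

# First tree, copied from the get_dump() output above.
dump = (
    "0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n"
    "\t1:[sincelastrun<18.0417] yes=3,no=4,missing=4\n"
    "\t\t3:leaf=-0.0965415\n"
    "\t\t4:leaf=-0.0679503\n"
    "\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n"
    "\t\t5:leaf=-0.0992546\n"
    "\t\t6:leaf=-0.0984374\n"
)

# Assumed line shapes: "id:[feature<threshold] yes=..,no=..,missing=.." or "id:leaf=value"
SPLIT = re.compile(r"(\d+):\[(.+?)<(.+?)\] yes=(\d+),no=(\d+),missing=(\d+)")
LEAF = re.compile(r"(\d+):leaf=([^,\s]+)")


def parse_tree(text):
    """Return {node_id: node_dict} for one dumped tree (hypothetical helper)."""
    nodes = {}
    for raw in text.splitlines():
        line = raw.strip()  # indentation only encodes depth; ids already link the nodes
        split = SPLIT.match(line)
        if split:
            nid, feature, threshold, yes, no, missing = split.groups()
            nodes[int(nid)] = {
                "feature": feature,
                "threshold": float(threshold),
                "yes": int(yes),
                "no": int(no),
                "missing": int(missing),
            }
            continue
        leaf = LEAF.match(line)
        if leaf:
            nodes[int(leaf.group(1))] = {"leaf": float(leaf.group(2))}
    return nodes


tree = parse_tree(dump)
print(tree[0])  # root split
print(tree[3])  # a leaf
```

The `yes`, `no` and `missing` fields are the ids of the child nodes, so they play a role similar to `children_left` and `children_right` in the sklearn example from the question.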

Or dump them to a file (with nicer formatting):

m.get_booster().dump_model("out.txt")

Contents of out.txt:

booster[0]:
0:[sincelastrun<23.2917] yes=1,no=2,missing=2
    1:[sincelastrun<18.0417] yes=3,no=4,missing=4
        3:leaf=-0.0965415
        4:leaf=-0.0679503
    2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0992546
        6:leaf=-0.0984374
booster[1]:
0:[sincelastrun<23.2917] yes=1,no=2,missing=2
    1:[sincelastrun<16.8917] yes=3,no=4,missing=4
        3:leaf=-0.0928132
        4:leaf=-0.0676056
    2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0945284
        6:leaf=-0.0937463
booster[2]:
0:[sincelastrun<23.2917] yes=1,no=2,missing=2
    1:[sincelastrun<18.175] yes=3,no=4,missing=4
        3:leaf=-0.0878571
        4:leaf=-0.0610089
    2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0904395
        6:leaf=-0.0896808
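If you only have the dumped file, the `booster[i]:` header lines are enough to split it back into one block per tree. The following is a minimal sketch under that assumption; `split_boosters` is a hypothetical helper, and the text is inlined here (first two trees only) so the snippet runs on its own. In practice you would read it with `open("out.txt").read()`.

```python
import re

# First two trees of out.txt as shown above, inlined for a self-contained example.
text = """booster[0]:
0:[sincelastrun<23.2917] yes=1,no=2,missing=2
    1:[sincelastrun<18.0417] yes=3,no=4,missing=4
        3:leaf=-0.0965415
        4:leaf=-0.0679503
    2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0992546
        6:leaf=-0.0984374
booster[1]:
0:[sincelastrun<23.2917] yes=1,no=2,missing=2
    1:[sincelastrun<16.8917] yes=3,no=4,missing=4
        3:leaf=-0.0928132
        4:leaf=-0.0676056
    2:[sincelastrun<695.025] yes=5,no=6,missing=6
        5:leaf=-0.0945284
        6:leaf=-0.0937463
"""

HEADER = re.compile(r"booster\[(\d+)\]:")


def split_boosters(dump_text):
    """Group the node lines of a dump file by tree index (hypothetical helper)."""
    trees = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.fullmatch(line.strip())
        if header:
            current = int(header.group(1))
            trees[current] = []
        elif current is not None and line.strip():
            trees[current].append(line)
    return trees


trees = split_boosters(text)
for index, lines in trees.items():
    print(index, len(lines), "nodes; root:", lines[0])
```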

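  • Something easier to read is model.get_booster().trees_to_dataframe(), which returns the same information as a pandas DataFrame. (9 upvotes)
  • How can I classify with each tree separately and evaluate each tree? (4 upvotes)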
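On the last question, one way to see what each tree contributes on its own is to walk the dumped text by hand. The sketch below does this for the three trees from the answer above and a made-up sample; `leaf_value` and the sample value are hypothetical, not part of xgboost. It assumes numeric `feature<threshold` splits, follows the `missing` branch when a feature is absent, and works with the rounded thresholds from the dump, so results right at a split boundary may differ from the library's own prediction.

```python
import re

# The three trees from the get_dump() output above.
dumps = [
    '0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<18.0417] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0965415\n\t\t4:leaf=-0.0679503\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0992546\n\t\t6:leaf=-0.0984374\n',
    '0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<16.8917] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0928132\n\t\t4:leaf=-0.0676056\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0945284\n\t\t6:leaf=-0.0937463\n',
    '0:[sincelastrun<23.2917] yes=1,no=2,missing=2\n\t1:[sincelastrun<18.175] yes=3,no=4,missing=4\n\t\t3:leaf=-0.0878571\n\t\t4:leaf=-0.0610089\n\t2:[sincelastrun<695.025] yes=5,no=6,missing=6\n\t\t5:leaf=-0.0904395\n\t\t6:leaf=-0.0896808\n',
]

SPLIT = re.compile(r"(\d+):\[(.+?)<(.+?)\] yes=(\d+),no=(\d+),missing=(\d+)")
LEAF = re.compile(r"(\d+):leaf=([^,\s]+)")


def leaf_value(dump, sample):
    """Walk one dumped tree for `sample` and return the leaf it lands in (hypothetical helper)."""
    splits, leaves = {}, {}
    for raw in dump.splitlines():
        line = raw.strip()
        s = SPLIT.match(line)
        if s:
            splits[int(s.group(1))] = s.groups()[1:]
            continue
        l = LEAF.match(line)
        if l:
            leaves[int(l.group(1))] = float(l.group(2))

    node = 0  # the root always has id 0 in the dump
    while node not in leaves:
        feature, threshold, yes, no, missing = splits[node]
        value = sample.get(feature)
        if value is None:
            node = int(missing)
        elif value < float(threshold):
            node = int(yes)
        else:
            node = int(no)
    return leaves[node]


sample = {"sincelastrun": 20.0}  # made-up input for illustration
per_tree = [leaf_value(d, sample) for d in dumps]
raw_sum = sum(per_tree)
print(per_tree)
print(raw_sum)
```

The leaf values are contributions to the raw, untransformed score that the booster sums across trees. Turning that sum into a class probability depends on the objective and the base score, which this sketch does not model.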