Posts by xia*_*hao

Combining GridSearchCV and StackingClassifier

I want to combine several classifiers with StackingClassifier and then tune their parameters with GridSearchCV:

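from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

clf1 = RandomForestClassifier()
clf2 = LogisticRegression()
dt = DecisionTreeClassifier()
sclf = StackingClassifier(estimators=[clf1, clf2], final_estimator=dt)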

params = {'randomforestclassifier__n_estimators': [10, 50],
          'logisticregression__C': [1,2,3]}

grid = GridSearchCV(estimator=sclf, param_grid=params, cv=5)

grid.fit(x, y)

But this results in an error:

'RandomForestClassifier' object has no attribute 'estimators_'

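I did set n_estimators. Why does it complain that there is no estimators_ attribute?

GridSearchCV is usually applied to a single model, in which case I only need to write that model's parameter names in the dict.

I referred to this page, https://groups.google.com/d/topic/mlxtend/5GhZNwgmtSg, but it uses parameter names from an earlier version. Even after switching to the new parameter names, it still does not work.

By the way, where can I find the details of the naming rules for these parameters?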
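
For reference, here is a minimal sketch of how the grid can be wired up, assuming the StackingClassifier in question is scikit-learn's (sklearn.ensemble). That class expects estimators as a list of (name, estimator) tuples, and grid keys take the form <name>__<parameter>. The labels "rf" and "lr" and the synthetic data from make_classification are assumptions made for this illustration and do not come from the post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the x, y of the question (an assumption).
x, y = make_classification(n_samples=150, n_features=8, random_state=0)

# Each base estimator is passed as a (name, estimator) tuple.
# "rf" and "lr" are arbitrary labels chosen for this sketch.
sclf = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=DecisionTreeClassifier(random_state=0),
)

# Every name that GridSearchCV accepts can be listed from the estimator itself.
param_names = sorted(sclf.get_params().keys())

# Grid keys are "<estimator name>__<parameter name>".
params = {
    "rf__n_estimators": [10, 50],
    "lr__C": [1, 2, 3],
}

grid = GridSearchCV(estimator=sclf, param_grid=params, cv=5)
grid.fit(x, y)
print(grid.best_params_)
```

Printing param_names is a quick way to check which keys are valid; parameters of the meta-learner appear there with the final_estimator__ prefix.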

python scikit-learn gridsearchcv

xia*_*hao

2020-05-10

4
Score

1
Answer

1795
Views

How do I get a datediff with groupby in pandas?

I have a DataFrame "df" that stores users' orders:

    user_id order_date
0         a 2018-01-17
1         a 2018-04-29
2         a 2018-05-19
3         a 2018-05-21
4         a 2018-06-15
5         b 2018-09-18
6         b 2019-01-30
7         b 2019-02-01
8         b 2019-07-03
9         c 2019-07-31
10        c 2019-12-10
11        c 2019-12-12
12        c 2019-12-24

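"order_date" is already sorted. I want the number of days between consecutive orders for each user. I need to use "groupby" to separate the users and then compute the datediff. The result should be: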

    user_id   datediff
0         a         NA
1         a        102
2         a         20
3         a          2
4         a         25
5         b         NA
6         b        134
7         b          2
8         b        152
9         c         NA
10        c        132
11 …
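
One possible way to get this result, sketched under the assumption that "order_date" is (or can be converted to) a datetime column: group by "user_id", take the per-group diff of the dates, and convert the resulting timedeltas to days. The DataFrame below is rebuilt from the rows shown in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "user_id": list("aaaaabbbbcccc"),
    "order_date": pd.to_datetime([
        "2018-01-17", "2018-04-29", "2018-05-19", "2018-05-21", "2018-06-15",
        "2018-09-18", "2019-01-30", "2019-02-01", "2019-07-03",
        "2019-07-31", "2019-12-10", "2019-12-12", "2019-12-24",
    ]),
})

# diff() runs inside each user_id group, so a user's first order has no
# predecessor and yields NaT; .dt.days turns the timedeltas into day counts
# (NaN where the diff is NaT).
df["datediff"] = df.groupby("user_id")["order_date"].diff().dt.days

print(df[["user_id", "datediff"]])
```

Because the diff is taken per group, the first order of b is not compared with the last order of a.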

python dataframe pandas

xia*_*hao

2020-04-21

1
Score

1
Answer

1513
Views