Posts by Ale*_*ses

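Why is a scikit-learn component inside a pipeline not fitted even though the whole pipeline is fitted?

I am trying to pick a single component/transformer out of a fitted pipeline to check its behavior. However, when I retrieve the component, it is reported as not fitted, even though using the pipeline as a whole works without any problem. That suggests the pipeline is fitted, and so its components should be fitted too.

Can someone explain why this happens, and suggest how to inspect the components of a fitted pipeline?

Here is a reproducible example: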

import pandas as pd
import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, GridSearchCV

np.random.seed(0)

# Read data from Titanic dataset.
titanic_url = ('https://raw.githubusercontent.com/amueller/'
               'scipy-2017-sklearn/091d371/notebooks/datasets/titanic3.csv')
data = pd.read_csv(titanic_url)

# We create the preprocessing pipelines for both numeric and categorical data.
numeric_features = ['age', 'fare']
numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy='median')),
    ('scaler', StandardScaler())])

categorical_features = ['embarked', 'sex', 'pclass']
categorical_transformer …

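The listing above is truncated, so the exact way the component was retrieved is unknown. One common cause, assuming the full pipeline wraps `numeric_transformer` in a `ColumnTransformer` as the imports suggest, is that `ColumnTransformer` fits clones of the transformers it is given. The original objects (and the `transformers` parameter) stay unfitted, while the fitted copies are exposed through `named_transformers_`. The sketch below uses a tiny made-up data frame and hypothetical step names (`preprocessor`, `classifier`, `num`) that are not taken from the original post:

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the Titanic 'age' and 'fare' columns.
X = pd.DataFrame({"age": [22.0, np.nan, 35.0, 58.0, 41.0, 19.0],
                  "fare": [7.25, 71.3, 8.05, np.nan, 26.55, 13.0]})
y = np.array([0, 1, 0, 1, 1, 0])

numeric_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler())])

# Step names below are assumptions made for this illustration.
preprocessor = ColumnTransformer(
    transformers=[("num", numeric_transformer, ["age", "fare"])])
clf = Pipeline(steps=[("preprocessor", preprocessor),
                      ("classifier", LogisticRegression())])
clf.fit(X, y)

# The object passed in was cloned before fitting, so it is still unfitted.
original_scaler = numeric_transformer.named_steps["scaler"]
print(hasattr(original_scaler, "mean_"))

# The fitted clone lives under the trailing-underscore attribute.
fitted_numeric = clf.named_steps["preprocessor"].named_transformers_["num"]
fitted_scaler = fitted_numeric.named_steps["scaler"]
print(hasattr(fitted_scaler, "mean_"))
print(fitted_scaler.mean_)
```

If this is indeed the cause, reading components through `named_transformers_` (or `transformers_`) rather than through the variables used to build the pipeline should return the fitted objects.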

python pipeline scikit-learn

Ale*_*ses

lucky-day

10
votes

1
answer

831
views

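How can I use GridSearchCV to tune parameters with a train_test_split strategy?

I am trying to fine-tune my sklearn model using a train_test_split strategy. I know that GridSearchCV can perform parameter tuning, but it is tied to a cross-validation strategy. I would like to run the parameter search on a single train/test split instead, because training speed matters in my case and I prefer a simple train_test_split over cross-validation.

I could write my own for loop, but that would be inefficient without the built-in parallelization that GridSearchCV provides.

Does anyone know how to make GridSearchCV do this? Alternatively, can you suggest an approach that is not too slow?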
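One possible approach, sketched here under assumptions rather than taken from the original thread, is to pass a splitter that yields exactly one train/test split as the `cv` argument. `ShuffleSplit(n_splits=1, ...)` does this, so `GridSearchCV` evaluates each candidate once while still using `n_jobs` for parallelism. The synthetic data set, the estimator, and the parameter grid below are placeholders:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, ShuffleSplit

# Placeholder data; substitute the real feature matrix and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A splitter that produces a single 75/25 train/test split.
single_split = ShuffleSplit(n_splits=1, test_size=0.25, random_state=0)

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # illustrative grid
    cv=single_split,   # one fit per candidate instead of k fits
    n_jobs=2,          # candidates are still evaluated in parallel
)
search.fit(X, y)

print(search.best_params_)
print(search.cv_results_["mean_test_score"])
```

With the default `refit=True`, the best candidate is refitted on all of `X` and `y` after the search. For classification with imbalanced classes, `StratifiedShuffleSplit(n_splits=1, ...)` may be a better fit, and `PredefinedSplit` can be used to reuse an exact split made beforehand.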

python scikit-learn train-test-split gridsearchcv

Ale*_*ses

lucky-day

0
votes

1
answer

2006
views