
What is the recommended way to build a complete DataFrame (feature values + names) after using ColumnTransformer/FeatureUnion?

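I have seen this topic come up many times online, but I have never seen a complete, comprehensive solution that works for all use cases with the current version of sklearn. Could someone explain how this should be done, using the following example?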

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

data = pd.read_csv('heart.csv')

# Preparing individual pipelines for numerical and categorical features
pipe_numeric = Pipeline(steps=[
    ('impute_num', SimpleImputer(
        missing_values=np.nan,
        strategy='median',
        copy=False,
        add_indicator=True)
    )
])

pipe_categorical = Pipeline(steps=[
    ('impute_cat', SimpleImputer(
        missing_values=np.nan,
        strategy='constant',
        fill_value=99999,
        copy=False)
    ),
    ('one_hot', OneHotEncoder(handle_unknown='ignore'))
])

# Combining them into a transformer
transformer_union = ColumnTransformer([
    ('feat_numeric', pipe_numeric, ['age']),
    ('feat_categorical', pipe_categorical, ['cp']),
], remainder='passthrough')

# Fitting the …


python pandas scikit-learn

Kon*_*sch

2019-11-22

5
votes

1
answer

89
views