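I've seen this topic come up many times online, but I've never seen a complete, comprehensive solution that works in every use case with the current version of sklearn. Could someone explain how this should be implemented, using the following example?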
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

data = pd.read_csv('heart.csv')

# Preparing individual pipelines for numerical and categorical features
pipe_numeric = Pipeline(steps=[
    ('impute_num', SimpleImputer(
        missing_values=np.nan,
        strategy='median',
        copy=False,
        add_indicator=True)
    )
])

pipe_categorical = Pipeline(steps=[
    ('impute_cat', SimpleImputer(
        missing_values=np.nan,
        strategy='constant',
        fill_value=99999,
        copy=False)
    ),
    ('one_hot', OneHotEncoder(handle_unknown='ignore'))
])

# Combining them into a transformer
transformer_union = ColumnTransformer([
    ('feat_numeric', pipe_numeric, ['age']),
    ('feat_categorical', pipe_categorical, ['cp']),
], remainder='passthrough')

# Fitting the …
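
For anyone who wants to run the setup above without `heart.csv`, here is a minimal sketch that fits the same transformer on a tiny made-up DataFrame. The values and the extra `chol` column are assumptions for illustration only and are not taken from the original data. It only shows how many output columns each branch contributes.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical stand-in for heart.csv: 'age' and 'cp' each contain one NaN,
# 'chol' is an untouched column that goes through remainder='passthrough'.
data = pd.DataFrame({
    'age': [63.0, np.nan, 41.0, 56.0],
    'cp': [3.0, 2.0, np.nan, 1.0],
    'chol': [233, 250, 204, 236],
})

pipe_numeric = Pipeline(steps=[
    ('impute_num', SimpleImputer(missing_values=np.nan, strategy='median',
                                 copy=False, add_indicator=True)),
])

pipe_categorical = Pipeline(steps=[
    ('impute_cat', SimpleImputer(missing_values=np.nan, strategy='constant',
                                 fill_value=99999, copy=False)),
    ('one_hot', OneHotEncoder(handle_unknown='ignore')),
])

transformer_union = ColumnTransformer([
    ('feat_numeric', pipe_numeric, ['age']),
    ('feat_categorical', pipe_categorical, ['cp']),
], remainder='passthrough')

# Fit on the toy data and transform it in one step.
out = transformer_union.fit_transform(data)

# Column count: imputed 'age' + its missing indicator (2),
# one-hot columns for cp values 1, 2, 3 and the fill value 99999 (4),
# and the passed-through 'chol' (1).
n_rows, n_cols = out.shape
print(n_rows, n_cols)
```

With this toy input, the numeric branch yields two columns because `add_indicator=True` appends a missing-value flag for `age`, and the constant fill value becomes its own one-hot category in the categorical branch.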