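I am new to machine learning and have been working with unsupervised learning techniques.
The image shows my sample data (after all cleaning). Screenshot: sample data
I have two pipelines for cleaning the data: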
num_attribs = list(housing_num)
cat_attribs = ["ocean_proximity"]
print(type(num_attribs))
num_pipeline = Pipeline([
    ('selector', DataFrameSelector(num_attribs)),
    ('imputer', Imputer(strategy="median")),
    ('attribs_adder', CombinedAttributesAdder()),
    ('std_scaler', StandardScaler()),
])
cat_pipeline = Pipeline([
    ('selector', DataFrameSelector(cat_attribs)),
    ('label_binarizer', LabelBinarizer())
])
Then I combined the two pipelines with a FeatureUnion; the code is shown below:
from sklearn.pipeline import FeatureUnion
full_pipeline = FeatureUnion(transformer_list=[
    ("num_pipeline", num_pipeline),
    ("cat_pipeline", cat_pipeline),
])
Now I am trying to call fit_transform on the data, but it raises an error.
Transformation code:
housing_prepared = full_pipeline.fit_transform(housing)
housing_prepared
Error message: fit_transform() takes 2 positional arguments but 3 were given
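
One possible reading of this error, sketched with stand-in classes rather than scikit-learn itself: if the pipeline passes both X and y to its last step, while the label binarizer's fit_transform accepts only y, the call ends up with three positional arguments (self, X, y) against a method that takes two. The classes `LabelOnlyTransformer`, `TinyPipeline`, and `FeatureAdapter` below are hypothetical names made up for illustration, and the signatures they mimic are an assumption about how the real classes behave. The last class shows one possible workaround: a thin wrapper that exposes an (X, y=None) signature.

```python
# Stand-in classes (hypothetical, NOT scikit-learn code) that only mimic
# the assumed call signatures involved in the error.

class LabelOnlyTransformer:
    """Mimics a label transformer whose fit_transform accepts only y."""

    def fit_transform(self, y):
        # One-hot encode the values against the sorted set of classes.
        classes = sorted(set(y))
        return [[int(value == c) for c in classes] for value in y]


class TinyPipeline:
    """Mimics a pipeline that passes both X and y to its final step."""

    def __init__(self, last_step):
        self.last_step = last_step

    def fit_transform(self, X, y=None):
        # Two arguments are forwarded here, plus the implicit self.
        return self.last_step.fit_transform(X, y)


class FeatureAdapter:
    """Wrapper that gives the label transformer an (X, y=None) signature."""

    def __init__(self, inner):
        self.inner = inner

    def fit_transform(self, X, y=None):
        # Ignore y and treat the feature column as the values to encode.
        return self.inner.fit_transform(X)


categories = ["INLAND", "NEAR BAY", "INLAND"]

# Using the label-only transformer directly reproduces the TypeError.
try:
    TinyPipeline(LabelOnlyTransformer()).fit_transform(categories)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping it in the adapter lets the same call go through.
encoded = TinyPipeline(FeatureAdapter(LabelOnlyTransformer())).fit_transform(categories)

print(error_message)
print(encoded)
```

If this reading is right, replacing the 'label_binarizer' step in cat_pipeline with a wrapper of this shape, or with an encoder designed for feature columns, would be the thing to try; I have not confirmed this against the real library.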