NameError: name 'DataFrameSelector' is not defined

Isa*_*c A 6 python pipeline scikit-learn

I am currently reading "Hands-On Machine Learning with Scikit-Learn & TensorFlow". I get an error when I try to recreate the Transformation Pipelines code. How can I fix this?

Code:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

num_pipeline = Pipeline([('imputer', Imputer(strategy = "median")),
                        ('attribs_adder', CombinedAttributesAdder()),
                        ('std_scaler', StandardScaler()),
                        ])

housing_num_tr = num_pipeline.fit_transform(housing_num)

from sklearn.pipeline import FeatureUnion

num_attribs = list(housing_num)
cat_attribs = ["ocean_proximity"]

num_pipeline = Pipeline([
                         ('selector', DataFrameSelector(num_attribs)),
                         ('imputer', Imputer(strategy = "median")),
                         ('attribs_adder', CombinedAttributesAdder()),
                         ('std_scaler', StandardScaler()),
                        ])

cat_pipeline = Pipeline([('selector', DataFrameSelector(cat_attribs)), 
                         ('label_binarizer', LabelBinarizer()),
                        ])

full_pipeline = FeatureUnion(transformer_list = [("num_pipeline", num_pipeline), 
                                                 ("cat_pipeline", cat_pipeline),
                                                ])

# And we can now run the whole pipeline simply:

housing_prepared = full_pipeline.fit_transform(housing)
housing_prepared

Error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-350-3a4a39e5bc1c> in <module>()
     43 
     44 num_pipeline = Pipeline([
---> 45                          ('selector', DataFrameSelector(num_attribs)),
     46                          ('imputer', Imputer(strategy = "median")),
     47                          ('attribs_adder', CombinedAttributesAdder()),

NameError: name 'DataFrameSelector' is not defined

Ste*_*uch 16

DataFrameSelector is not found, so it needs to be imported. It is not part of sklearn, but something of the same name is available in sklearn-features:

from sklearn_features.transformers import DataFrameSelector

(DOCS)
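
As a side note, on scikit-learn 0.20 or later a comparable pipeline can probably be built without any custom selector, because ColumnTransformer routes columns to transformers by name. The sketch below is an alternative, not the book's code: it assumes SimpleImputer (which replaced Imputer) and OneHotEncoder in place of LabelBinarizer, leaves out the book's CombinedAttributesAdder step, and uses a made-up toy DataFrame with hypothetical values.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the book's housing DataFrame.
housing = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, np.nan, 1274.0],
    "median_income": [8.3, 8.3, 7.2, 5.6],
    "ocean_proximity": ["NEAR BAY", "NEAR BAY", "INLAND", "INLAND"],
})

num_attribs = ["total_rooms", "median_income"]
cat_attribs = ["ocean_proximity"]

# Numeric columns: fill missing values with the median, then standardize.
num_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("std_scaler", StandardScaler()),
])

# ColumnTransformer selects the columns itself, so no DataFrameSelector is needed.
full_pipeline = ColumnTransformer(
    [
        ("num", num_pipeline, num_attribs),
        ("cat", OneHotEncoder(), cat_attribs),
    ],
    sparse_threshold=0,  # always return a dense array
)

housing_prepared = full_pipeline.fit_transform(housing)
print(housing_prepared.shape)
```

The result has one column per numeric attribute plus one column per category found in ocean_proximity.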


dr2*_*509 7

from sklearn.base import BaseEstimator, TransformerMixin

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def __init__(self, attribute_names):
        self.attribute_names=attribute_names
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X[self.attribute_names].values

This should work.
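
For illustration, a selector of this kind can be tried on a small DataFrame before it goes into a pipeline. This is only a minimal sketch: the class is repeated so the snippet runs on its own, and the toy columns and values are made up to stand in for the book's housing data.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DataFrameSelector(BaseEstimator, TransformerMixin):
    """Select the named DataFrame columns and return them as a NumPy array."""

    def __init__(self, attribute_names):
        self.attribute_names = attribute_names

    def fit(self, X, y=None):
        return self  # nothing to learn from the data

    def transform(self, X):
        return X[self.attribute_names].values


# Hypothetical toy data, not the real housing dataset.
toy = pd.DataFrame({
    "rooms": [3.0, 5.0, 4.0],
    "age": [10.0, 25.0, 40.0],
    "ocean_proximity": ["INLAND", "NEAR BAY", "INLAND"],
})

# Keep only the numeric columns; the result is a plain 2-D array.
selected = DataFrameSelector(["rooms", "age"]).fit_transform(toy)
print(selected.shape)
```

Because the transformer returns a NumPy array rather than a DataFrame, the later steps in the pipeline (imputer, scaler) receive the same kind of input they would get from housing_num.values.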

  • It is sometimes good to cite the source as well: Hands-On Machine Learning with Scikit-Learn and TensorFlow, page 97 (8 upvotes)

Anonymous 6

If you are following Hands-On Machine Learning with Scikit-Learn and TensorFlow, check the next page, which defines the custom DataFrame selector:

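from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def __init__(self, attribute_names):
        self.attribute_names = attribute_names
    def fit(self, X, y=None):
        # Nothing to learn; return self so the step can be chained in a Pipeline.
        return self
    def transform(self, X):
        # Select the named columns and hand them on as a NumPy array.
        return X[self.attribute_names].values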