I usually get the PCA loadings like this:
from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_t = pca.fit(X).transform(X)
loadings = pca.components_
If I run PCA using a scikit-learn pipeline...
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2))
])
X_t = pipeline.fit_transform(X)
...is it possible to get the loadings?
Simply trying loadings = pipeline.components_ fails:
AttributeError: 'Pipeline' object has no attribute 'components_'
Thanks!
(I am also interested in extracting the coef_ attribute from a fitted pipeline.)
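For reference, a minimal sketch of one way to reach a step's fitted attributes: a Pipeline does not expose components_ or coef_ itself, but each fitted step can be looked up by the name it was given in steps, via named_steps. The synthetic data, the 'reg' step name, and the choice of LinearRegression are assumptions made only for illustration; they are not part of the question.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, made up for illustration only.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 4))
y = X[:, 0] - 2.0 * X[:, 1]

pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('pca', PCA(n_components=2)),
])
X_t = pipeline.fit_transform(X)

# Look up the fitted PCA step by the name used in `steps`,
# then read its attribute as on a standalone estimator.
loadings = pipeline.named_steps['pca'].components_
print(loadings.shape)  # (n_components, n_features)

# The same lookup works for coef_ on a fitted final estimator
# (hypothetical 'reg' step, chosen only for this sketch).
reg_pipeline = Pipeline(steps=[
    ('scaling', StandardScaler()),
    ('reg', LinearRegression()),
])
reg_pipeline.fit(X, y)
coefs = reg_pipeline.named_steps['reg'].coef_
print(coefs.shape)
```

Note that these values are expressed in the scaled feature space, because the PCA and regression steps only ever see the output of StandardScaler.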
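I am following the sklearn_pandas walkthrough found in the sklearn_pandas README on GitHub, and I am trying to modify the DateEncoder() custom transformer example to do two additional things:
Here is my attempt (with only a fairly rudimentary understanding of sklearn pipelines):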
import pandas as pd
import numpy as np
from sklearn.base import TransformerMixin, BaseEstimator
from sklearn_pandas import DataFrameMapper
class DateEncoder(TransformerMixin):
    '''
    Specify date format using python strftime formats
    '''
    def __init__(self, date_format='%Y-%m-%d'):
        self.date_format = date_format

    def fit(self, X, y=None):
        self.dt = pd.to_datetime(X, format=self.date_format)
        return self

    def transform(self, X):
        dt = X.dt
        return pd.concat([dt.year, dt.month, dt.day], axis=1)

data = pd.DataFrame({'dates1': ['2001-12-20','2002-10-21','2003-08-22','2004-08-23',
                                '2004-07-20','2007-12-21','2006-12-22','2003-04-23'], …
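One thing worth noting in the attempt above: fit() stores the parsed dates on self.dt, but transform() reads X.dt from its input, which only works if X is already a datetime Series. Since the rest of the question is cut off, the following is only a hedged sketch of the general fit/transform pattern, not a fix for the two elided goals. The class name DateEncoderSketch and the output column names are hypothetical; the sample dates are the first three from the data above.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DateEncoderSketch(BaseEstimator, TransformerMixin):
    """Split date strings into year, month and day columns (illustrative only)."""

    def __init__(self, date_format='%Y-%m-%d'):
        self.date_format = date_format

    def fit(self, X, y=None):
        # Nothing is learned from the data, so fit only returns self.
        return self

    def transform(self, X):
        # Parse inside transform so that unseen data is handled the same way.
        dates = pd.to_datetime(pd.Series(X), format=self.date_format)
        return pd.DataFrame({
            'year': dates.dt.year,
            'month': dates.dt.month,
            'day': dates.dt.day,
        })


dates = pd.Series(['2001-12-20', '2002-10-21', '2003-08-22'])
encoded = DateEncoderSketch().fit_transform(dates)
print(encoded)
```

Parsing in transform() rather than fit() keeps the transformer stateless, so it behaves the same on training data and on new data passed through the pipeline later.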