How do I save a scikit-learn pipeline containing a Keras regressor to disk?

Question

Dro*_*man 12 python machine-learning scikit-learn joblib keras

I have a scikit-learn pipeline that contains a KerasRegressor:

estimators = [
    ('standardize', StandardScaler()),
    ('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=5, batch_size=1000, verbose=1))
    ]
pipeline = Pipeline(estimators)

After training the pipeline, I try to save it to disk using joblib...

joblib.dump(pipeline, filename, compress=9)

But I get an error:

RuntimeError: maximum recursion depth exceeded

How can I save the pipeline to disk?

Answer 1

con*_*stt 15

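I ran into the same problem, since there is no direct way to do this. Here is a hack that worked for me. I save the pipeline as two files: the first stores the pickled sklearn pipeline object, and the second stores the Keras model: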

...
from keras.models import load_model
# Note: sklearn.externals.joblib was removed in scikit-learn 0.23; use `import joblib` there.
from sklearn.externals import joblib

...

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('estimator', KerasRegressor(build_model))
])

pipeline.fit(X_train, y_train)

# Save the Keras model first:
pipeline.named_steps['estimator'].model.save('keras_model.h5')

# This hack allows us to save the sklearn pipeline:
pipeline.named_steps['estimator'].model = None

# Finally, save the pipeline:
joblib.dump(pipeline, 'sklearn_pipeline.pkl')

del pipeline

Here is how the model is loaded back:

# Load the pipeline first:
pipeline = joblib.load('sklearn_pipeline.pkl')

# Then, load the Keras model:
pipeline.named_steps['estimator'].model = load_model('keras_model.h5')

y_pred = pipeline.predict(X_test)

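The two-file approach above can be wrapped in a pair of helpers. The following is only a sketch, not part of the original answer: the names `save_pipeline` and `load_pipeline`, the `step` parameter and the `load_model` argument are hypothetical, and the standard-library `pickle` module stands in for `joblib` to keep the example dependency-free (`joblib.dump` and `joblib.load` could be substituted). It assumes the wrapped step exposes the fitted Keras model as `.model`, as in the answer's code.

```python
import pickle


def save_pipeline(pipeline, pipeline_path, model_path, step="estimator"):
    """Save the Keras model and the rest of the pipeline to two separate files."""
    wrapper = pipeline.named_steps[step]
    keras_model = wrapper.model
    # The Keras model is written with its own save() method.
    keras_model.save(model_path)
    # Detach the model so the remaining pipeline can be pickled.
    wrapper.model = None
    try:
        with open(pipeline_path, "wb") as f:
            pickle.dump(pipeline, f)
    finally:
        # Reattach the model so the in-memory pipeline stays usable.
        wrapper.model = keras_model


def load_pipeline(pipeline_path, model_path, load_model, step="estimator"):
    """Rebuild the pipeline; load_model is assumed to be e.g. keras.models.load_model."""
    with open(pipeline_path, "rb") as f:
        pipeline = pickle.load(f)
    # Put the separately stored Keras model back into the wrapper step.
    pipeline.named_steps[step].model = load_model(model_path)
    return pipeline
```

Unlike the original hack, the `try`/`finally` restores the model after saving, so the pipeline does not have to be deleted and reloaded before it can be used for prediction again.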

Archived:	9 years, 4 months ago
Views:	3,847
Last activity:	7 years, 11 months ago