How to save multiple scikit-learn classifier models using Python's pickle library (or any other efficient library)

udo*_*984 0 python pickle scikit-learn

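Usually we can save a single classifier model with pickle. Is there a way to save multiple classifier models in one pickle file? If so, how do we save the models and retrieve them later?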

For example (a minimal working example):

from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from numpy.random import rand, randint 

models = []
models.append(('LogisticReg', LogisticRegression(random_state=123)))
models.append(('DecisionTree', DecisionTreeClassifier(random_state=123)))
# evaluate each model in turn
results_all = []
names = []
dict_method_score = {}
scoring = 'f1'

X = rand(8, 4)
Y = randint(2, size=8)

print("Method: Average (Standard Deviation)\n")
for name, model in models:
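    # random_state only takes effect when shuffle=True; recent scikit-learn versions raise an error otherwise
    kfold = model_selection.KFold(n_splits=2, shuffle=True, random_state=999)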
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results_all.append(cv_results)
    names.append(name)
    dict_method_score[name] = (cv_results.mean(), cv_results.std())
    print("{:s}: {:.3f} ({:.3f})".format(name, cv_results.mean(), cv_results.std()))

Goal: change some hyperparameters (e.g. n_splits in the cross-validation) while keeping the same setup, and retrieve the models later.

Rya*_*ker 5

You can save multiple objects to the same pickle file:

import pickle

# Write each (name, model) tuple as its own pickle, back to back in one file.
with open("models.pckl", "wb") as f:
    for model in models:
        pickle.dump(model, f)

You can then load the models back into memory one at a time:

import pickle

models = []
with open("models.pckl", "rb") as f:
    while True:
        try:
            models.append(pickle.load(f))
        except EOFError:
            # No more pickled objects left in the file.
            break
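
If the models are always loaded together, another option is to put them in one container and pickle that with a single call. The following is only a rough sketch of that idea, not part of the answer above. It also assumes that you want fitted models: cross_val_score fits copies of each estimator, so the objects in the question's models list remain unfitted unless fit is called on them. The toy data, the dict name fitted, and the file name models_dict.pckl are assumptions made for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Toy data (an assumption for illustration): 20 samples, 4 features,
# alternating labels so that both classes are always present.
rng = np.random.default_rng(0)
X = rng.random((20, 4))
y = np.array([0, 1] * 10)

# Fit each estimator explicitly so the saved objects can predict after loading.
fitted = {
    "LogisticReg": LogisticRegression(random_state=123).fit(X, y),
    "DecisionTree": DecisionTreeClassifier(random_state=123).fit(X, y),
}

# A single dump call stores the whole dict (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "models_dict.pckl")
with open(path, "wb") as f:
    pickle.dump(fitted, f)

# A single load call restores every model, addressable by name.
with open(path, "rb") as f:
    restored = pickle.load(f)

for name, model in restored.items():
    same = np.array_equal(model.predict(X), fitted[name].predict(X))
    print(name, "predictions match:", same)
```

Compared with the one-pickle-per-object loop, this trades incremental loading for lookup by name. In either case, only unpickle files you trust, and load them with the same scikit-learn version that wrote them.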