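There are already many questions about persistence, but I have tried a lot with pickle and joblib.dump without success. When I use either one to save my random forest, I get this: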
ValueError: ("Buffer dtype mismatch, expected 'SIZE_t' but got 'long'", <type 'sklearn.tree._tree.ClassificationCriterion'>, (1, array([10])))
Can anyone tell me why?
Some code for review:
from sklearn.ensemble import RandomForestClassifier
import cPickle

forest = RandomForestClassifier()
forest.fit(data[:n_samples], target[:n_samples])

# Save the fitted forest to disk
with open('rf.pkl', 'wb') as f:
    cPickle.dump(forest, f)

# Load it back
with open('rf.pkl', 'rb') as f:
    forest = cPickle.load(f)
Or:
from sklearn.externals import joblib

# Save the fitted forest to disk
joblib.dump(forest, 'rf.pkl')

# Load it back
forest = joblib.load('rf.pkl')
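An error of this kind is often reported when a pickle is read by an interpreter that differs from the one that wrote it (for example a 32-bit versus a 64-bit build); whether that is the cause here is only an assumption. A minimal sketch for ruling it out, using only the standard library, is to store a description of the writing environment in front of the pickled object and compare it on load. The helper names environment_tag, dump_with_tag and load_with_tag are hypothetical, and the code is written for Python 3, where cPickle is simply pickle.

```python
import pickle
import platform
import sys


def environment_tag():
    """Describe the interpreter that is writing or reading the file."""
    return {
        "bits": platform.architecture()[0],        # e.g. '64bit' or '32bit'
        "python": "%d.%d" % sys.version_info[:2],  # e.g. '3.11'
    }


def dump_with_tag(obj, path):
    """Write the environment description first, then the object itself."""
    with open(path, "wb") as f:
        pickle.dump(environment_tag(), f)
        pickle.dump(obj, f)


def load_with_tag(path):
    """Check the stored environment before attempting to unpickle the object."""
    with open(path, "rb") as f:
        saved_env = pickle.load(f)
        current_env = environment_tag()
        if saved_env != current_env:
            raise RuntimeError(
                "pickle written under %r but read under %r"
                % (saved_env, current_env)
            )
        return pickle.load(f)
```

Because the tag is a separate pickle at the start of the file, it can be read and compared even when the object that follows would fail to load.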
Grid search is always time-consuming, so I would like to see how far it has progressed while it runs. For example, it could output:
paramsXXX processed
paramsYYY processed
...
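scikit-learn's GridSearchCV accepts a verbose argument that prints progress messages while it runs. For output in exactly the shape shown above, a minimal sketch is to loop over the parameter combinations by hand and print a line after each one. The names iter_param_grid, grid_search_with_progress and evaluate are hypothetical; evaluate stands in for whatever fits and scores a model for one parameter set.

```python
from itertools import product


def iter_param_grid(param_grid):
    """Yield one dict per combination of the values in param_grid."""
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search_with_progress(param_grid, evaluate):
    """Evaluate every combination, printing a line as each one finishes.

    Returns the (score, params) pair with the highest score.
    """
    combos = list(iter_param_grid(param_grid))
    results = []
    for i, params in enumerate(combos, start=1):
        score = evaluate(params)  # hypothetical: fit and score one model
        results.append((score, params))
        print("[%d/%d] %r processed, score=%.3f"
              % (i, len(combos), params, score))
    return max(results, key=lambda item: item[0])
```

Printing the running count next to the total makes it easy to estimate how much of the search is left.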