fco*_*col 3 python scipy scikit-learn
How do I pickle or save a scipy KDE for later use?
import scipy.stats as scs
import joblib  # formerly `from sklearn.externals import joblib`, which newer scikit-learn releases no longer provide
kde = scs.gaussian_kde(data, bw_method=.15)
joblib.dump(kde, 'test.pkl')
I tried the code above and got this error:
PicklingError: Can't pickle <function gaussian_kde.set_bandwidth.<locals>.<lambda> at 0x1a5b6fb7b8>: it's not found as scipy.stats.kde.gaussian_kde.set_bandwidth.<locals>.<lambda>
It looks like joblib has trouble with the `set_bandwidth` method. My guess is that this is caused by the lambda function defined inside that method; pickling lambdas has been discussed before.
with open('test.pkl', 'wb') as fo:
    joblib.dump(lambda x,y: x+y, fo)
PicklingError: Can't pickle <function <lambda> at 0x7ff89495d598>: it's not found as __main__.<lambda>
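The failure is not specific to joblib. As a minimal sketch using only the standard library, the snippet below shows that the standard pickler stores functions by reference to an importable name, so a named function such as `operator.add` round-trips while an equivalent lambda is rejected. The variable names (`named_ok`, `lambda_ok`, `lambda_error`) are illustrative only.

```python
import operator
import pickle

# A function with an importable name is pickled by reference to that name.
restored = pickle.loads(pickle.dumps(operator.add))
named_ok = restored(1, 2) == 3

# A lambda has no importable name, so the standard pickler refuses it.
try:
    pickle.dumps(lambda x, y: x + y)
    lambda_ok = True
    lambda_error = None
except (pickle.PicklingError, AttributeError) as exc:
    lambda_ok = False
    lambda_error = exc

print("named function pickled:", named_ok)
print("lambda pickled:", lambda_ok)
```

This is why an object that holds a lambda as an attribute, as `gaussian_kde` does after `set_bandwidth`, cannot be serialized with the default pickler.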
As far as I can tell, both cloudpickle and dill work:
import cloudpickle
import dill
with open('test.cp.pkl', 'wb') as f:
    cloudpickle.dump(kde, f)
with open('test.dill.pkl', 'wb') as f:
    dill.dump(kde, f)
with open('test.cp.pkl', 'rb') as f:
    kde_cp = cloudpickle.load(f)
with open('test.dill.pkl', 'rb') as f:
    kde_dill = dill.load(f)
Checking some data:
import numpy as np
print(np.array_equal(kde.dataset, kde_cp.dataset))
True
print(np.array_equal(kde.dataset, kde_dill.dataset))
True
print(np.array_equal(kde_cp.dataset, kde_dill.dataset))
True
kde.pdf(10) == kde_cp.pdf(10) == kde_dill.pdf(10)
array([ True])
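If adding cloudpickle or dill is not an option, one possible alternative (not part of the answer above) is to save only the plain data that defines the estimator and rebuild it on load. The sketch below assumes an unweighted KDE created with a scalar `bw_method`; the sample data, the `state` dictionary, and the evaluation grid are hypothetical.

```python
import pickle

import numpy as np
from scipy import stats

# Hypothetical sample data standing in for the question's `data`.
rng = np.random.default_rng(0)
data = rng.normal(size=200)
kde = stats.gaussian_kde(data, bw_method=0.15)

# Keep only picklable pieces: the dataset array and the scalar bandwidth factor.
state = {"dataset": kde.dataset, "factor": kde.factor}
blob = pickle.dumps(state)

# Later: rebuild an equivalent estimator from the stored pieces.
restored_state = pickle.loads(blob)
kde_rebuilt = stats.gaussian_kde(
    restored_state["dataset"], bw_method=restored_state["factor"]
)

# Compare both estimators on a grid of points.
grid = np.linspace(-3, 3, 50)
same = np.allclose(kde.pdf(grid), kde_rebuilt.pdf(grid))
print(same)
```

This avoids pickling the estimator object itself, at the cost of recomputing the covariance when the KDE is rebuilt. A KDE built with weights or a callable `bw_method` would need those stored and passed back in as well.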