use*_*827 10 python ensemble-learning mlxtend
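I would like to use the merf (mixed effects random forest) library in an ensemble model, for example through the mlens or mlxtend Python libraries. However, merf's fit and predict methods are structured in a non-standard way, and I cannot figure out how to do this: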
from merf import MERF
merf = MERF()
merf.fit(X_train, Z_train, clusters_train, y_train)
y_hat = merf.predict(X_test, Z_test, clusters_test)
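Is there a way to use the merf library in an ensemble model? The problem is that building an ensemble with mlens or other ensemble libraries assumes the scikit-learn structure, where the fit method takes X, y as input and the predict method takes X as input. merf, however, clearly takes more inputs in both its fit and predict methods. Here is a simplified version of the mlens syntax: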
from mlens.ensemble import SuperLearner
ensemble = SuperLearner()
ensemble.add(estimators)
ensemble.add_meta(meta_estimator)
ensemble.fit(X, y).predict(X)
I am not restricted to mlens or mlxtend. Any other way of building an ensemble model with merf would be fine too.
Dia*_*ost -1
I mean, you can always sneak merf into the data-making process :P. Most of the data generation below comes from the Manifold merf example:
from merf.utils import MERFDataGenerator
import numpy as np
from mlens.ensemble import SuperLearner
from sklearn.svm import SVR
from sklearn.linear_model import Lasso
from mlens.metrics.metrics import rmse
dgm = MERFDataGenerator(m = .6, sigma_b = np.sqrt(4.5), sigma_e = 1)
num_clusters_each_size = 20
train_sizes = [1, 3, 5, 7, 9]
known_sizes = [9, 27, 45, 63, 81]
new_sizes = [10, 30, 50, 70, 90]
train_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(train_sizes, num_clusters_each_size)
known_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(known_sizes, num_clusters_each_size)
new_cluster_sizes = MERFDataGenerator.create_cluster_sizes_array(new_sizes, num_clusters_each_size)
train, test_known, test_new, training_cluster_ids, ptev, prev = dgm.generate_split_samples(train_cluster_sizes, known_cluster_sizes, new_cluster_sizes)
X_train = train[['X_0', 'X_1', 'X_2']]
Z_train = train[['Z']]
clusters_train = train['cluster']
y_train = train['y']
Then fit and predict, adapting the example from Flennerhag's mlens.ensemble superlearner.py (GitHub) with a few modifications:
ensemble = SuperLearner()
ensemble.add([SVR(), Lasso()])
ensemble.add_meta(SVR())
pred = ensemble.fit(X_train, y_train).predict(X_train)
root = rmse(y_train, pred)
print(root)
>>>
2.345318341087564
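The ensemble above is fit on X_train and y_train alone, so merf never takes part in it. One possible way to bring a model with the four-argument fit(X, Z, clusters, y) signature from the question into a fit(X, y) / predict(X) pipeline is to pack Z and the cluster IDs into extra columns of the feature matrix and unpack them inside a thin adapter. The sketch below is only an illustration under assumptions: PackedMixedEffectsAdapter and ClusterMeanStub are hypothetical names, not part of merf or mlens; the stub stands in for MERF() so the snippet runs without merf installed; and cluster IDs are assumed to be numeric so they fit into a float array.

```python
import numpy as np
import pandas as pd


class PackedMixedEffectsAdapter:
    """Hypothetical adapter: exposes a fit(X, Z, clusters, y) model as fit(X, y) / predict(X).

    Assumed column layout of the packed matrix:
    [n_fixed fixed-effect columns | n_random random-effect columns | cluster id]
    """

    def __init__(self, model, n_fixed, n_random):
        self.model = model
        self.n_fixed = n_fixed
        self.n_random = n_random

    def _unpack(self, X_packed):
        X_packed = np.asarray(X_packed, dtype=float)
        X = X_packed[:, :self.n_fixed]
        Z = X_packed[:, self.n_fixed:self.n_fixed + self.n_random]
        clusters = pd.Series(X_packed[:, -1])  # last column holds the cluster id
        return X, Z, clusters

    def fit(self, X_packed, y):
        X, Z, clusters = self._unpack(X_packed)
        self.model.fit(X, Z, clusters, np.asarray(y))
        return self

    def predict(self, X_packed):
        X, Z, clusters = self._unpack(X_packed)
        return self.model.predict(X, Z, clusters)


class ClusterMeanStub:
    """Stand-in with the same call signatures as in the question (NOT merf itself).

    Predicts the training mean of y per cluster, and the global mean for unseen clusters.
    """

    def fit(self, X, Z, clusters, y):
        self.global_mean_ = float(np.mean(y))
        self.cluster_means_ = pd.Series(y).groupby(clusters.to_numpy()).mean().to_dict()
        return self

    def predict(self, X, Z, clusters):
        return np.array([self.cluster_means_.get(c, self.global_mean_) for c in clusters])


# Toy data, columns: X_0, Z, cluster
packed = np.array([
    [0.1, 1.0, 0],
    [0.2, 1.0, 0],
    [0.3, 1.0, 1],
    [0.4, 1.0, 1],
])
y = np.array([1.0, 3.0, 10.0, 20.0])

adapter = PackedMixedEffectsAdapter(ClusterMeanStub(), n_fixed=1, n_random=1)
pred = adapter.fit(packed, y).predict(packed)
print(pred)
```

With merf installed, MERF() could presumably be passed in place of the stub (n_fixed=3, n_random=1 for the data above), since the question shows the same fit and predict signatures. Two caveats: libraries that clone estimators, as scikit-learn style ensembles usually do, also expect get_params and set_params (inheriting from sklearn.base.BaseEstimator is the usual way to get them), and any other base learner fed the same packed matrix would treat the Z and cluster columns as ordinary features.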
But of course, if you don't insist on using merf and an ensemble together specifically, there is always a better way in general.
Keras approach
from keras.models import Sequential
from keras.layers import Dense
from keras import backend
import matplotlib.pyplot as plt
# RMSE metric built from Keras backend ops
def rmse(y_true, y_pred):
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true), axis=-1))
# Flatten all feature values into a single column, one value per sample
X = X_train.to_numpy().reshape(-1, 1)
model = Sequential()
model.add(Dense(2, input_dim=1, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam', metrics=[rmse])
# Note: X is passed as both input and target, so the network learns to reproduce its input
history = model.fit(X, X, epochs=500, batch_size=len(X), verbose=2)
# Plot the RMSE metric recorded at each epoch
plt.plot(history.history['rmse'])
plt.title("keras loss function")
plt.show()
Note that the X_train used here comes from the earlier merf code:
X_train = train[['X_0', 'X_1', 'X_2']]
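Since the question allows any way of building the ensemble, another option is to skip the ensemble library and blend predictions by hand: call each model with whatever arguments it needs, collect the predictions as columns, and fit a small meta-model on top. The sketch below is a minimal illustration using a least-squares blend with an intercept. The helper names fit_blend_weights and blend are hypothetical, and the prediction arrays are synthetic placeholders rather than real merf or SVR output. In practice the blend weights should be fit on predictions for data the base models were not trained on.

```python
import numpy as np


def fit_blend_weights(base_preds, y):
    """Least-squares weights (one per base model, plus an intercept) for blending."""
    A = np.column_stack([base_preds, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def blend(base_preds, coef):
    """Apply previously fitted blend weights to a matrix of base predictions."""
    A = np.column_stack([base_preds, np.ones(base_preds.shape[0])])
    return A @ coef


# Synthetic placeholders for held-out predictions of two base models,
# e.g. merf.predict(X, Z, clusters) and some_other_model.predict(X).
rng = np.random.default_rng(0)
p_model_a = rng.normal(size=50)
p_model_b = rng.normal(size=50)
# Synthetic target built as a known combination, so the fit is easy to check
y_holdout = 0.7 * p_model_a + 0.3 * p_model_b + 1.0

base = np.column_stack([p_model_a, p_model_b])
coef = fit_blend_weights(base, y_holdout)
y_blend = blend(base, coef)
print(coef)
```

Because each base model is called directly, merf's extra Z and clusters arguments pose no problem here; only the resulting prediction vectors need to share the same row order.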