Dan*_*ice 5 scoring scikit-learn cross-validation keras grid-search
Using custom scoring with the multiclass output of a Keras model raises the same error with both cross_val_score and GridSearchCV, as shown below (the example uses Iris, so you can run it directly to test):
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier

iris = datasets.load_iris()
X = iris.data
# One-hot encode the three Iris classes
Y = to_categorical(iris.target)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=1000)

def create_model(optimizer='rmsprop'):
    # Small network: 4 input features -> 8 hidden units -> 3 class probabilities
    model = Sequential()
    model.add(Dense(8, activation='relu', input_shape=(4,)))
    model.add(Dense(3, activation='softmax'))
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Wrap the Keras model so scikit-learn can treat it as an estimator
model = KerasClassifier(build_fn=create_model,
                        epochs=10,
                        batch_size=5,
                        verbose=0)

#results = cross_val_score(model, X_train, Y_train, scoring='precision_macro')

param_grid = {'optimizer': ('rmsprop', 'adam')}
grid = GridSearchCV(model,
                    param_grid=param_grid,
                    return_train_score=True,
                    scoring=['accuracy', 'precision_macro', 'recall_macro'],
                    refit='precision_macro')
grid_results = grid.fit(X_train, Y_train)
This gives me the following error (I have left out the full stack trace, since you can reproduce it by copying the code above):
ValueError: Classification metrics can't handle a mix of multilabel-indicator and binary targets
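The same ValueError can be reproduced without Keras, which helps isolate where it comes from. The sketch below is not the original failing code: it assumes the scorer ends up comparing one-hot targets against integer class predictions, which is what the error message suggests. The arrays are made up for illustration.

```python
import numpy as np
from sklearn.metrics import precision_score

# Integer labels and their one-hot form (np.eye stands in for to_categorical here)
y_true_int = np.array([0, 1, 2, 1, 0, 2])
y_true_onehot = np.eye(3, dtype=int)[y_true_int]

# Illustrative integer predictions in which only two classes appear
y_pred = np.array([0, 1, 1, 1, 0, 1])

# Mixing a one-hot y_true with integer y_pred is rejected by the metric
try:
    precision_score(y_true_onehot, y_pred, average="macro")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With integer labels on both sides the same metric is accepted
score = precision_score(y_true_int, y_pred, average="macro", zero_division=0)
print(score)
```

The exact pair of target types named in the message depends on what the predictions look like, so it may not match the message above word for word.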
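When I remove the scoring parameter, it works.
Is there any way to avoid this and enable f1, precision, or any custom score? Without having to rewrite my own grid search code, of course.
Thanks for your help.
Update: I just found a workaround.
First, this documentation ( http://scikit-learn.org/stable/modules/multiclass.html#multilabel-classification-format ) shows that the one-hot representation used in Keras is interpreted as multilabel in scikit-learn.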
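A quick way to check this interpretation is scikit-learn's type_of_target helper. This is a minimal sketch with made-up labels, again using np.eye as a stand-in for to_categorical:

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# scikit-learn style multiclass labels
labels = np.array([0, 1, 2, 1, 0, 2])
# The same labels in one-hot form
one_hot = np.eye(3, dtype=int)[labels]

label_type = type_of_target(labels)
one_hot_type = type_of_target(one_hot)
print(label_type, one_hot_type)
```

The 1-D integer array is reported as multiclass, while the 2-D one-hot array is reported as multilabel-indicator.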
Then look at scikit_learn.py, which implements the KerasClassifier class: https://github.com/keras-team/keras/blob/master/keras/wrappers/scikit_learn.py
The fit function in the BaseWrapper class includes these lines of code:
if loss_name == 'categorical_crossentropy' and len(y.shape) != 2:
    y = to_categorical(y)
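So the wrapper performs the categorical conversion itself.
It seems that, to sidestep the difference between its own multiclass representation and scikit-learn's, Keras can take scikit-learn-style multiclass targets such as [0,1,2,1,0,2] and convert them to the categorical representation only for fitting the NN model.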
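To make that conversion concrete, here is a rough sketch of the same check in plain NumPy. Both one_hot and prepare_targets are hypothetical names introduced for illustration. They are not part of Keras, and one_hot only approximates what to_categorical does.

```python
import numpy as np

def one_hot(y, num_classes=None):
    """NumPy stand-in for keras.utils.to_categorical (illustration only)."""
    y = np.asarray(y, dtype=int)
    if num_classes is None:
        num_classes = int(y.max()) + 1
    return np.eye(num_classes, dtype="float32")[y]

def prepare_targets(y, loss_name):
    """Hypothetical helper mirroring the check quoted from the wrapper's fit."""
    y = np.asarray(y)
    # Only 1-D label arrays are converted; 2-D (already one-hot) arrays pass through
    if loss_name == "categorical_crossentropy" and len(y.shape) != 2:
        y = one_hot(y)
    return y

y_sklearn = np.array([0, 1, 2, 1, 0, 2])
y_for_network = prepare_targets(y_sklearn, "categorical_crossentropy")
print(y_for_network.shape)

# argmax maps the one-hot rows back to the integer labels
y_recovered = y_for_network.argmax(axis=1)
print(y_recovered)
```

The point is that scikit-learn only ever needs to see the integer labels. The one-hot form exists just for the network's loss.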
So I simply tried removing the categorical conversion before passing the data to the sklearn functions.
It works now:
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier

iris = datasets.load_iris()
X = iris.data
#Y = to_categorical(iris.target,3)
# Keep integer class labels; the wrapper one-hot encodes them during fit
Y = iris.target
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=0.8, random_state=1000)

def create_model(optimizer='rmsprop'):
    # Small network: 4 input features -> 8 hidden units -> 3 class probabilities
    model = Sequential()
    model.add(Dense(8, activation='relu', input_shape=(4,)))
    model.add(Dense(3, activation='softmax'))
    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Wrap the Keras model so scikit-learn can treat it as an estimator
model = KerasClassifier(build_fn=create_model,
                        epochs=10,
                        batch_size=5,
                        verbose=0)

#results = cross_val_score(model, X_train, Y_train, scoring='precision_macro')

param_grid = {'optimizer': ('rmsprop', 'adam')}
grid = GridSearchCV(model,
                    param_grid=param_grid,
                    return_train_score=True,
                    scoring=['precision_macro', 'recall_macro', 'f1_macro'],
                    refit='precision_macro')
grid_results = grid.fit(X_train, Y_train)
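Once the fit succeeds, the per-metric scores can be read from cv_results_. The sketch below uses LogisticRegression as a stand-in estimator (an assumption made so the example runs without Keras) and a made-up parameter grid. The scoring and refit settings are the same as above.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
X, y = iris.data, iris.target  # integer labels, as in the working version

# Stand-in estimator and illustrative grid (not from the original post)
search = GridSearchCV(LogisticRegression(max_iter=1000),
                      param_grid={'C': [0.1, 1.0]},
                      scoring=['precision_macro', 'recall_macro', 'f1_macro'],
                      refit='precision_macro')
search.fit(X, y)

# With multi-metric scoring, each metric gets its own mean_test_<name> column
for name in ('precision_macro', 'recall_macro', 'f1_macro'):
    print(name, search.cv_results_['mean_test_' + name])

# best_params_ refers to the metric named in refit
print(search.best_params_)
```

The same keys should be available on the grid object from the Keras example, since they come from GridSearchCV rather than from the estimator.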