Difference in accuracy computed by Keras and scikit-learn

R t*_*ype 4 machine-learning scikit-learn multilabel-classification conv-neural-network keras

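I am currently working on multi-label image classification with a CNN in Keras. In addition to the accuracy reported by Keras, we re-checked the results with scikit-learn using several evaluation metrics (recall, precision, F1 score, and accuracy).

We found that the accuracy computed by Keras is about 90%, while scikit-learn shows only about 60%.

I don't know why this happens, so please let me know.

Is there something wrong with the Keras calculation?

We use sigmoid as the activation function, binary_crossentropy as the loss function, and adam as the optimizer.

Keras training
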
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Input, GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model

input_tensor = Input(shape=(img_width, img_height, 3))

base_model = MobileNetV2(include_top=False, weights='imagenet')

#model.summary()

x = base_model.output
x = GlobalAveragePooling2D()(x)
#x = Dense(2048, activation='relu')(x)
#x = Dropout(0.5)(x)
x = Dense(1024, activation = 'relu')(x)

x = Dropout(0.5)(x)
# One sigmoid unit per label (6 labels)
predictions = Dense(6, activation = 'sigmoid')(x)

# Freeze the pretrained base model
for layer in base_model.layers:
    layer.trainable = False


model = Model(inputs = base_model.input, outputs = predictions)
print("{}層".format(len(model.layers)))  # "層" means "layers"


# NOTE: `sgd` is not defined in this snippet (the text above mentions adam)
model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["acc"])

history = model.fit(X_train, y_train, epochs=50, validation_data=(X_val, y_val), batch_size=64, verbose=2)

model_evaluate()

Keras reports 90% (accuracy).

scikit-learn check

from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
thresholds=[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9]

y_pred = model.predict(X_test)

for val in thresholds:
    print("For threshold: ", val)
    pred=y_pred.copy()

    pred[pred>=val]=1
    pred[pred<val]=0

    accuracy = accuracy_score(y_test, pred)
    precision = precision_score(y_test, pred, average='micro')
    recall = recall_score(y_test, pred, average='micro')
    f1 = f1_score(y_test, pred, average='micro')

    print("Micro-average quality numbers")
    print("Acc: {:.4f}, Precision: {:.4f}, Recall: {:.4f}, F1-measure: {:.4f}".format(accuracy, precision, recall, f1))

Output (scikit-learn)

For threshold:  0.1
Micro-average quality numbers
Acc: 0.0727, Precision: 0.3776, Recall: 0.8727, F1-measure: 0.5271
For threshold:  0.2
Micro-average quality numbers
Acc: 0.1931, Precision: 0.4550, Recall: 0.8033, F1-measure: 0.5810
For threshold:  0.3
Micro-average quality numbers
Acc: 0.3323, Precision: 0.5227, Recall: 0.7403, F1-measure: 0.6128
For threshold:  0.4
Micro-average quality numbers
Acc: 0.4574, Precision: 0.5842, Recall: 0.6702, F1-measure: 0.6243
For threshold:  0.5
Micro-average quality numbers
Acc: 0.5059, Precision: 0.6359, Recall: 0.5858, F1-measure: 0.6098
For threshold:  0.6
Micro-average quality numbers
Acc: 0.4597, Precision: 0.6993, Recall: 0.4707, F1-measure: 0.5626
For threshold:  0.7
Micro-average quality numbers
Acc: 0.3417, Precision: 0.7520, Recall: 0.3383, F1-measure: 0.4667
For threshold:  0.8
Micro-average quality numbers
Acc: 0.2205, Precision: 0.7863, Recall: 0.2132, F1-measure: 0.3354
For threshold:  0.9
Micro-average quality numbers
Acc: 0.1063, Precision: 0.8987, Recall: 0.1016, F1-measure: 0.1825

Quw*_*Ohi 5

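In multi-label classification, there are two ways to count a prediction as correct.

  1. Count a sample as correct only if every one of its labels is predicted correctly. Example: in the demo dataset below, y_true has 5 samples, and y_pred gets 3 of them entirely right. The accuracy is then 60%.

  2. Count each label separately. Example: the demo dataset y_true contains 15 labels in total (5 samples × 3 labels), and y_pred gets 10 of them right. The accuracy is then 66.7%.

scikit-learn's accuracy_score handles multi-label input as described in point 1, whereas the Keras accuracy metric follows point 2. A code example is given below.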

Code:

import tensorflow as tf
from sklearn.metrics import accuracy_score
import numpy as np

# A demo dataset 
y_true = np.array([[0, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 0, 1]])
y_pred = np.array([[1, 0, 0], [1, 0, 0], [0, 0, 0], [0, 0, 0], [1, 0, 1]])

kacc = tf.keras.metrics.Accuracy()
_ = kacc.update_state(y_true, y_pred)
print(f'Keras Accuracy acc: {kacc.result().numpy()*100:.3}')

kbacc = tf.keras.metrics.BinaryAccuracy()
_ = kbacc.update_state(y_true, y_pred)
print(f'Keras BinaryAccuracy acc: {kbacc.result().numpy()*100:.3}')

print(f'SkLearn acc: {accuracy_score(y_true, y_pred)*100:.3}')

Output:

Keras Accuracy acc: 66.7
Keras BinaryAccuracy acc: 66.7
SkLearn acc: 60.0
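
The gap between the two numbers widens as the number of labels grows. As a rough back-of-the-envelope sketch (assuming, purely for illustration, that errors on the 6 labels in the question are independent and that each label is right about 90% of the time), the chance of getting all 6 labels right at once is far below 90%:

```python
# Rough estimate only: assumes label errors are independent,
# which a real model will not satisfy exactly.
per_label_accuracy = 0.90  # per-label accuracy reported by Keras in the question
num_labels = 6             # size of the final Dense layer in the question

# Probability that all labels of one sample are correct at the same time
exact_match_estimate = per_label_accuracy ** num_labels
print(f"Estimated exact-match accuracy: {exact_match_estimate:.3f}")  # prints 0.531
```

That is in the same range as the 0.5059 that scikit-learn reports at threshold 0.5 in the question, although the independence assumption is only an approximation.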

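So you have to choose one of the two. If you go with approach 1, you must implement the accuracy metric manually in Keras. However, multi-label training is usually done with a sigmoid activation and binary_crossentropy loss, and binary_crossentropy is minimized per label, which corresponds to approach 2. So it makes sense to follow approach 2 as well.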
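
If approach 1 is the number you want to report, a manual version is short. The following is a minimal NumPy sketch, not part of the Keras or scikit-learn API: the function names `per_label_accuracy` and `exact_match_accuracy` and the sample probabilities are made up for illustration, and a threshold of 0.5 applied with `>=` (as in the question's loop) is assumed.

```python
import numpy as np


def per_label_accuracy(y_true, y_prob, threshold=0.5):
    """Approach 2: fraction of individual labels predicted correctly."""
    y_hat = (np.asarray(y_prob) >= threshold).astype(int)
    return float(np.mean(y_hat == np.asarray(y_true)))


def exact_match_accuracy(y_true, y_prob, threshold=0.5):
    """Approach 1: fraction of samples whose labels are all correct."""
    y_hat = (np.asarray(y_prob) >= threshold).astype(int)
    return float(np.mean(np.all(y_hat == np.asarray(y_true), axis=1)))


# Hypothetical sigmoid outputs for 4 samples with 3 labels each
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_prob = np.array([[0.9, 0.2, 0.7],
                   [0.1, 0.8, 0.6],
                   [0.7, 0.4, 0.1],
                   [0.3, 0.2, 0.9]])

print(per_label_accuracy(y_true, y_prob))    # 10 of 12 labels correct
print(exact_match_accuracy(y_true, y_prob))  # 2 of 4 samples fully correct
```

On this made-up data the per-label value is about 0.833 and the exact-match value is 0.5, which shows the same pattern as the Keras and scikit-learn numbers above.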