How do I choose the best threshold for class probabilities?

Question

lem*_*mon 5 python machine-learning neural-network scikit-learn

The output of my neural network is a table of predicted class probabilities for multi-label classification:

print(probabilities)

|   |      1       |      3       | ... |     8354     |     8356     |     8357     |
|---|--------------|--------------|-----|--------------|--------------|--------------|
| 0 | 2.442745e-05 | 5.952136e-06 | ... | 4.254002e-06 | 1.894523e-05 | 1.033957e-05 |
| 1 | 7.685694e-05 | 3.252202e-06 | ... | 3.617730e-06 | 1.613792e-05 | 7.356643e-06 |
| 2 | 2.296657e-06 | 4.859554e-06 | ... | 9.934525e-06 | 9.244772e-06 | 1.377618e-05 |
| 3 | 5.163169e-04 | 1.044035e-04 | ... | 1.435158e-04 | 2.807420e-04 | 2.346930e-04 |
| 4 | 2.484626e-06 | 2.074290e-06 | ... | 9.958628e-06 | 6.002510e-06 | 8.434519e-06 |
| 5 | 1.297477e-03 | 2.211737e-04 | ... | 1.881772e-04 | 3.171079e-04 | 3.228884e-04 |

I convert it to class labels using a threshold (0.2) so that I can measure the accuracy of my predictions:

predictions = (probabilities > 0.2).astype(int)  # np.int was removed in NumPy 1.24; use the builtin int
print(predictions)

|   | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... |    0 |    0 |    0 |
| 1 | 0 | 0 | ... |    0 |    0 |    0 |
| 2 | 0 | 0 | ... |    0 |    0 |    0 |
| 3 | 0 | 0 | ... |    0 |    0 |    0 |
| 4 | 0 | 0 | ... |    0 |    0 |    0 |
| 5 | 0 | 0 | ... |    0 |    0 |    0 |

I also have a test set:

print(Y_test)

|   | 1 | 3 | ... | 8354 | 8356 | 8357 |
|---|---|---|-----|------|------|------|
| 0 | 0 | 0 | ... |    0 |    0 |    0 |
| 1 | 0 | 0 | ... |    0 |    0 |    0 |
| 2 | 0 | 0 | ... |    0 |    0 |    0 |
| 3 | 0 | 0 | ... |    0 |    0 |    0 |
| 4 | 0 | 0 | ... |    0 |    0 |    0 |
| 5 | 0 | 0 | ... |    0 |    0 |    0 |

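Question: how can I build an algorithm in Python that chooses the best threshold, i.e. the one that maximizes roc_auc_score(average = 'micro') or another metric?

Perhaps a hand-written Python function could optimize the threshold for a given accuracy metric.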

Answer 1

syl*_*ong 5

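I assume your ground-truth labels are Y_test and your predictions are predictions.

Optimizing roc_auc_score(average = 'micro') with respect to a prediction threshold does not seem to make sense: AUC is computed from how the predictions are ranked, so it needs predictions as float values in [0,1].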
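
To illustrate the ranking argument, here is a minimal sketch with a hand-rolled pairwise AUC (the function pairwise_auc and the toy data are made up for this example; it is not scikit-learn's implementation). Any strictly increasing transform of the scores leaves the value unchanged, while thresholding to 0/1 labels collapses the ranking and changes it:

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score against every negative score.
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Hypothetical binary labels and predicted probabilities.
y_true = np.array([0, 0, 1, 1, 0, 1])
proba = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.70])

auc_raw = pairwise_auc(y_true, proba)
auc_monotone = pairwise_auc(y_true, proba ** 3)              # same ranking
auc_hard = pairwise_auc(y_true, (proba > 0.5).astype(int))   # ranking collapsed

print(auc_raw, auc_monotone, auc_hard)
```

On this toy data the first two values are identical and the thresholded one is lower, which is why a threshold is not a meaningful thing to tune against AUC.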

So I will discuss accuracy_score instead.

You can use scipy.optimize.fmin:

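import numpy as np
import scipy.optimize
from sklearn.metrics import accuracy_score

def thr_to_accuracy(thr, Y_test, predictions):
    # predictions must hold the predicted probabilities (floats), not 0/1 labels.
    # The accuracy is negated because fmin minimizes.
    return -accuracy_score(Y_test, np.array(predictions > thr, dtype=int))

best_thr = scipy.optimize.fmin(thr_to_accuracy, args=(Y_test, predictions), x0=0.5)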
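
One caveat: accuracy is a step function of the threshold, so a simplex search such as fmin may see no change around its starting point and stop there. A plain grid search is a simple alternative. The sketch below is not part of the original answer; best_global_threshold and the toy arrays are hypothetical, and the score is subset accuracy (every label in a row must match), which is what accuracy_score reports for multi-label indicator input:

```python
import numpy as np

def best_global_threshold(y_true, proba, grid=None):
    """Return (threshold, subset accuracy) for the best threshold on a grid."""
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)
    # Subset accuracy: a row counts only if all of its labels are correct.
    accs = np.array([
        np.mean(np.all((proba > t).astype(int) == y_true, axis=1))
        for t in grid
    ])
    best = int(np.argmax(accs))  # first grid point reaching the maximum
    return float(grid[best]), float(accs[best])

# Hypothetical multi-label data: 4 samples, 3 labels.
y_true = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
proba = np.array([
    [0.30, 0.05, 0.02],
    [0.125, 0.25, 0.04],
    [0.35, 0.28, 0.08],
    [0.03, 0.06, 0.22],
])

thr, acc = best_global_threshold(y_true, proba)
print(thr, acc)
```

The threshold should be chosen on a validation split rather than on the data used for the final evaluation, otherwise the reported score is optimistic.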

Answer 2

all*_*lee 5

Following @cangrejo's answer (https://stats.stackexchange.com/a/310956/194535): suppose the raw output probabilities of your model form the vector v. You can then define a prior distribution:

π = (1/θ1, 1/θ2, ..., 1/θN), for θi ∈ (0,1) and Σθi = 1, where N is the total number of label classes and i is the class index.

Take v' = v ⊙ π as the model's new output probabilities, where ⊙ denotes the element-wise product.

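Your question can now be restated as: find the π that optimizes the metric you specify (e.g. roc_auc_score) for the new output probabilities. Once you have found it, the θs (θ1, θ2, ..., θN) are the optimal thresholds for each class.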
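
The two-class case shows why the θs behave like thresholds. With v = (1 - p, p) and θ = (1 - t, t), class 1 wins the argmax of v ⊙ π exactly when p/t > (1 - p)/(1 - t), which simplifies to p > t. The sketch below checks this numerically; reweighted_argmax and the sample values are made up for this illustration:

```python
import numpy as np

def reweighted_argmax(v, theta):
    """Predict the class with the largest v_i / theta_i in each row."""
    v = np.asarray(v, dtype=float)
    theta = np.asarray(theta, dtype=float)
    return np.argmax(v / theta, axis=1)

# Hypothetical positive-class probabilities for five samples.
p = np.array([0.05, 0.15, 0.25, 0.60, 0.90])
v = np.column_stack([1 - p, p])   # columns: P(class 0), P(class 1)

t = 0.2                           # threshold for class 1
theta = np.array([1 - t, t])      # thetas sum to 1

labels = reweighted_argmax(v, theta)
print(labels)                     # matches (p > t) on this data
```

With more than two classes the θs are relative weights on the classes rather than independent cut-offs, since the prediction is still a single argmax.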

Code:

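Create a proxyModel class that takes the original model object as an argument and returns a proxyModel object. When you call predict_proba() through the proxyModel object, it automatically computes new probabilities based on the thresholds you specify: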

import numpy as np

class proxyModel():
    def __init__(self, origin_model):
        self.origin_model = origin_model

    def predict_proba(self, x, threshold_list=None):
        # get the original probabilities
        ori_proba = self.origin_model.predict_proba(x)

        # default: a threshold of 1 for every class leaves the probabilities unchanged
        if threshold_list is None:
            threshold_list = np.full(ori_proba[0].shape, 1)

        # get the output shape of threshold_list
        output_shape = np.array(threshold_list).shape

        # element-wise divide by the threshold of each class
        new_proba = np.divide(ori_proba, threshold_list)

        # calculate the norm (sum of the new probabilities over the classes)
        norm = np.linalg.norm(new_proba, ord=1, axis=1)

        # reshape the norm to the shape of new_proba
        norm = np.broadcast_to(np.array([norm]).T, (norm.shape[0], output_shape[0]))

        # renormalize the new probabilities so that each row sums to 1
        new_proba = np.divide(new_proba, norm)

        return new_proba

    def predict(self, x, threshold_list=None):
        # pick the single class with the highest rescaled probability
        return np.argmax(self.predict_proba(x, threshold_list), axis=1)
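
The rescaling step can also be tried in isolation, without a model object. This is a minimal sketch with a hypothetical helper rescale_proba and made-up numbers; it uses sum(..., keepdims=True) in place of the norm-and-broadcast steps, which gives the same result for non-negative probabilities:

```python
import numpy as np

def rescale_proba(proba, thresholds):
    """Divide each class probability by its threshold, then renormalize rows."""
    proba = np.asarray(proba, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    scaled = proba / thresholds                      # broadcasts over rows
    return scaled / scaled.sum(axis=1, keepdims=True)

# Hypothetical probabilities for 2 samples and 3 classes.
proba = np.array([[0.70, 0.20, 0.10],
                  [0.40, 0.35, 0.25]])
thresholds = np.array([0.6, 0.3, 0.1])               # sums to 1

new_proba = rescale_proba(proba, thresholds)
before = np.argmax(proba, axis=1)
after = np.argmax(new_proba, axis=1)
print(before, after)
```

A small threshold makes its class easier to predict: the second sample moves from class 0 to class 2 because class 2 has the smallest threshold.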

Implement the scoring function:

from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import average_precision_score
from sklearn.metrics import f1_score

def scoreFunc(model, X, y_true, threshold_list):
    # hard predictions and rescaled probabilities from the proxy model
    y_pred = model.predict(X, threshold_list=threshold_list)
    y_pred_proba = model.predict_proba(X, threshold_list=threshold_list)

    ###### metrics ######
    accuracy = accuracy_score(y_true, y_pred)
    roc_auc = roc_auc_score(y_true, y_pred_proba, average='macro')
    pr_auc = average_precision_score(y_true, y_pred_proba, average='macro')
    f1_value = f1_score(y_true, y_pred, average='macro')

    return accuracy, roc_auc, pr_auc, f1_value

Define the weighted_score_with_threshold() function, which takes the thresholds as input and returns a weighted score:

def weighted_score_with_threshold(threshold, model, X_test, Y_test, metrics='accuracy', delta=5e-5):
    # If the sum of the thresholds is not between 1-delta and 1+delta,
    # return infinity (this only reduces the search space of the minimization algorithm,
    # because the sum of the thresholds should be as close to 1 as possible).
    threshold_sum = np.sum(threshold)

    if threshold_sum > 1+delta:
        return np.inf

    if threshold_sum < 1-delta:
        return np.inf

    # keep the objective function from wandering into a nan solution
    if np.isnan(threshold_sum):
        print("threshold_sum is nan")
        return np.inf

    # renormalize: the sum of the thresholds should be 1
    normalized_threshold = threshold/threshold_sum

    # calculate scores based on the thresholds
    # scoreFunc returns 4 scores in a tuple: (accuracy, roc_auc, pr_auc, f1)
    scores = scoreFunc(model, X_test, Y_test, threshold_list=normalized_threshold)

    scores = np.array(scores)

    # Give the metric you want to maximize a bigger weight:
    if metrics == 'accuracy':
        weight = np.array([10,1,1,1])
    elif metrics == 'roc_auc':
        weight = np.array([1,10,1,1])
    elif metrics == 'pr_auc':
        weight = np.array([1,1,10,1])
    elif metrics == 'f1':
        weight = np.array([1,1,1,10])
    else:
        # 'all' or any other value: equal weights
        # (the original `elif 'all':` was always true, so it behaved like this else)
        weight = np.array([1,1,1,1])

    # return the negative weighted sum (maximizing the sum
    # is equivalent to minimizing the negative sum)
    return -np.dot(weight, scores)
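
The weighting itself is just a dot product, so it can be checked by hand. The sketch below isolates that step; METRIC_NAMES, metric_weights and negative_weighted_score are hypothetical helper names, and unlike the function above this variant raises an error on an unknown metric name instead of silently falling back to equal weights:

```python
import numpy as np

# Order must match the tuple returned by the scoring function.
METRIC_NAMES = ("accuracy", "roc_auc", "pr_auc", "f1")

def metric_weights(metric="all", boost=10.0):
    """Weight 1 for every metric, `boost` for the one being emphasized."""
    weights = np.ones(len(METRIC_NAMES))
    if metric != "all":
        # tuple.index raises ValueError for a name that is not listed
        weights[METRIC_NAMES.index(metric)] = boost
    return weights

def negative_weighted_score(scores, metric="all"):
    """Negative weighted sum, suitable as an objective for a minimizer."""
    scores = np.asarray(scores, dtype=float)
    return -float(np.dot(metric_weights(metric), scores))

# Hypothetical scores: accuracy, roc_auc, pr_auc, f1.
scores = (0.8, 0.9, 0.7, 0.6)
obj_f1 = negative_weighted_score(scores, "f1")    # -(0.8 + 0.9 + 0.7 + 10 * 0.6)
obj_all = negative_weighted_score(scores, "all")  # -(0.8 + 0.9 + 0.7 + 0.6)
print(obj_f1, obj_all)
```

Because the other three metrics still contribute with weight 1, the optimizer trades them off against the emphasized metric rather than maximizing that metric alone.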

Use the optimization algorithm differential_evolution() (a better choice than fmin) to find the best thresholds:

import numpy as np
from scipy import optimize

output_class_num = Y_test.shape[1]
bounds = optimize.Bounds([1e-5]*output_class_num, [1]*output_class_num)

pmodel = proxyModel(model)

result = optimize.differential_evolution(weighted_score_with_threshold, bounds, args=(pmodel, X_test, Y_test, 'accuracy'))

# calculate the normalized thresholds
threshold = result.x/np.sum(result.x)

# print the optimized score (through the proxy model, which accepts threshold_list)
print(scoreFunc(pmodel, X_test, Y_test, threshold_list=threshold))
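
For the multi-label setting in the question, where each label is an independent yes/no decision, a simpler alternative is to pick one threshold per label by grid search. This is a different technique from the argmax-based approach above, shown only as a sketch: f1_binary, per_class_thresholds and the toy data are hypothetical, and F1 is chosen here just as an example metric.

```python
import numpy as np

def f1_binary(y_true, y_pred):
    """F1 for one label column; 0.0 when there are no positives at all."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else float(2 * tp / denom)

def per_class_thresholds(y_true, proba, grid=None):
    """For each label column, return the first grid threshold with the best F1."""
    y_true = np.asarray(y_true)
    proba = np.asarray(proba, dtype=float)
    if grid is None:
        grid = np.linspace(0.05, 0.95, 19)
    best = np.empty(proba.shape[1])
    for j in range(proba.shape[1]):
        f1s = [f1_binary(y_true[:, j], (proba[:, j] > t).astype(int)) for t in grid]
        best[j] = grid[int(np.argmax(f1s))]
    return best

# Hypothetical data: 4 samples, 2 labels.
y_true = np.array([[1, 0], [1, 0], [0, 1], [0, 0]])
proba = np.array([[0.90, 0.12],
                  [0.60, 0.08],
                  [0.32, 0.22],
                  [0.22, 0.13]])

thresholds = per_class_thresholds(y_true, proba)
predictions = (proba > thresholds).astype(int)   # one threshold per column
print(thresholds)
print(predictions)
```

With thousands of labels, as in the question, many columns will have few or no positives in a validation split, so the per-label thresholds for rare labels should be treated with caution.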


Archived:	7 years, 5 months ago
Views:	9408
Last activity:	5 years, 5 months ago