Why do we need to normalize the naive Bayes posterior probability?

Question

Hit*_*ani 0 machine-learning probability naivebayes data-science

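I want to understand why the posterior needs to be normalized. Please correct me if my understanding of the naive Bayes theorem is wrong.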

In the formula

P(B|A) = P(A|B)*P(B) / P(A)

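the probabilities on the right-hand side (RHS) are calculated from the training data: P(A|B), where A is the input features and B is the target class; P(B), the probability of the target class under consideration; and P(A), the probability of the input features.

Once these prior probabilities are calculated, you get the test data and, based on its input features, you calculate the target class probability P(B|A) (I guess this is called the posterior probability).

Now, some videos teach that after this you have to normalize P(B|A) to get the probability of that target class.

Why is that necessary? Isn't P(B|A) itself already the probability of the target class?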

Answer 1

Nik*_*ido 5

The reason is quite simple:

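In naive Bayes, your goal is to find the class that maximizes the posterior probability, so basically you want the Class_j that maximizes this formula:

P(Class_j|x) = P(x|Class_j) * P(Class_j) / P(x)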

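Because we have assumed that the features x_1, ..., x_n of x are independent given the class, we can rewrite the P(x|Class_j) part of the numerator as:

P(x|Class_j) = P(x_1|Class_j) * P(x_2|Class_j) * ... * P(x_n|Class_j)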

Then the numerator in the formula becomes:

P(Class_j) * P(x_1|Class_j) * P(x_2|Class_j) * ... * P(x_n|Class_j)

Because the denominator P(x) is the same for every class, you can omit this term when computing the maximum:

predicted class = argmax over j of P(Class_j) * P(x_1|Class_j) * ... * P(x_n|Class_j)
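
As a minimal sketch of this maximum calculation, the Python snippet below scores each class by its prior times the product of the per-feature likelihoods, without ever computing P(x). The class names, priors, and likelihood values are made-up toy numbers for illustration only.

```python
# Hypothetical toy model: two classes, two binary features.
priors = {"spam": 0.4, "ham": 0.6}

# likelihoods[cls][i] is the assumed value of P(x_i = 1 | cls).
likelihoods = {"spam": [0.8, 0.3], "ham": [0.1, 0.5]}


def unnormalized_score(cls, x):
    """Return P(cls) * prod_i P(x_i | cls), i.e. the numerator only."""
    score = priors[cls]
    for p_one, x_i in zip(likelihoods[cls], x):
        # Use P(x_i = 1 | cls) if the feature is on, else its complement.
        score *= p_one if x_i == 1 else 1.0 - p_one
    return score


x = [1, 0]  # hypothetical test example
scores = {cls: unnormalized_score(cls, x) for cls in priors}

# The argmax does not depend on P(x), so the raw scores are enough.
predicted = max(scores, key=scores.get)
print(predicted, scores)
```

With these toy numbers the scores are about 0.224 for "spam" and 0.03 for "ham". They identify the most likely class, but they sum to about 0.254 rather than 1, so they are not probabilities yet.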

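But the numerator alone, with P(x) omitted, is not an actual probability; to obtain the posterior probability you need to divide by P(x). Since P(x) is the sum of the numerators over all classes, this division is exactly the normalization step, and it makes the posteriors of all classes sum to 1.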
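
A short Python sketch of that division, assuming you already have one unnormalized score per class (the values below are hypothetical): summing the scores gives P(x), and dividing each score by that sum turns them into posteriors.

```python
def normalize(scores):
    """Divide each unnormalized score by their sum to get posteriors."""
    # By the law of total probability, the sum over classes of
    # P(x | cls) * P(cls) equals P(x).
    evidence = sum(scores.values())
    return {cls: score / evidence for cls, score in scores.items()}


# Hypothetical unnormalized scores P(cls) * prod_i P(x_i | cls).
scores = {"spam": 0.224, "ham": 0.030}

posteriors = normalize(scores)
print(posteriors)
```

Here the posteriors come out to roughly 0.88 for "spam" and 0.12 for "ham". The ranking of the classes is unchanged, which is why normalization matters only when you need the probability itself and not just the predicted class.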

References used:

http://shatterline.com/blog/2013/09/12/not-so-naive-classification-with-the-naive-bayes-classifier/
https://www.globalsoftwaresupport.com/naive-bayes-classifier-explained-step-step/
