KL divergence of continuous PDFs

Ame*_*ina 4 python scipy statsmodels pymc

Suppose I have two PDFs, for example:

from scipy import stats
pdf_y = stats.beta(5, 9).pdf
pdf_x = stats.beta(9, 5).pdf

I want to compute their KL divergence. Before I reinvent the wheel, is there anything built into the PyData ecosystem that already does this?
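To make the question concrete, here is a rough sketch of the by-hand approach I would like to avoid, simply integrating p(x) * log(p(x) / q(x)) numerically with scipy.integrate.quad:

import numpy as np
from scipy import stats
from scipy.integrate import quad

pdf_y = stats.beta(5, 9).pdf
pdf_x = stats.beta(9, 5).pdf

def kl_integrand(t):
    # Integrand of KL(pdf_y || pdf_x): p(t) * log(p(t) / q(t)).
    return pdf_y(t) * np.log(pdf_y(t) / pdf_x(t))

# Integrate over (0, 1), the support of both beta distributions.
kl, _ = quad(kl_integrand, 0, 1)
print(kl)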

jse*_*old 5

The KL divergence is available in scipy.stats.entropy. From the docstring:

stats.entropy(pk, qk=None, base=None) 

Calculate the entropy of a distribution for given probability values.           

If only probabilities `pk` are given, the entropy is calculated as              
``S = -sum(pk * log(pk), axis=0)``.                                             

If `qk` is not None, then compute a relative entropy (also known as             
Kullback-Leibler divergence or Kullback-Leibler distance)                       
``S = sum(pk * log(pk / qk), axis=0)``.  
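Note that stats.entropy works on discrete probability vectors, so for the continuous beta densities in the question one workaround is to evaluate both pdfs on a common grid and let entropy normalize them. A minimal sketch, assuming a uniform grid over (0, 1) (the grid size is an arbitrary choice):

import numpy as np
from scipy import stats

# Evaluate both densities on a common grid; stats.entropy normalizes
# pk and qk to sum to 1, so with a uniform grid this approximates the
# continuous KL divergence up to discretization error.
x = np.linspace(1e-6, 1 - 1e-6, 10000)
pk = stats.beta(5, 9).pdf(x)
qk = stats.beta(9, 5).pdf(x)

print(stats.entropy(pk, qk))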


小智 1

It looks like the nimfa package has what you are looking for. http://nimfa.biolab.si

import numpy as np
import nimfa

V = np.matrix([[1, 2, 3], [4, 5, 6], [6, 7, 8]])
fctr = nimfa.mf(V, method="lsnmf", max_iter=10, rank=3)
fctr_res = nimfa.mf_run(fctr)
# Print the loss according to the Kullback-Leibler divergence (by default the Euclidean metric is used).
print("Distance Kullback-Leibler: %5.3e" % fctr_res.distance(metric="kl"))

This isn't exactly what you are looking for, since it seems to take only a single input, but it might be a starting point.

Also, this link may be useful. There seems to be some code there (not using numpy) that computes the same thing. https://code.google.com/p/tackbp2011/source/browse/TAC-KBP2011/src/python-utils/LDA/kullback-leibler-divergence.py?r=100
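I have not dug into that script, but the underlying computation is small enough to write without numpy. A rough sketch for two discrete distributions represented as plain dictionaries (the function name and the dictionary representation are just illustrative choices):

from math import log

def kl_divergence(p, q):
    # KL(p || q) for discrete distributions given as {outcome: probability}
    # dicts. Assumes q has positive mass wherever p does; otherwise the
    # divergence is infinite.
    return sum(p_x * log(p_x / q[outcome])
               for outcome, p_x in p.items() if p_x > 0)

# Toy example with two small hand-made distributions.
p = {"a": 0.5, "b": 0.3, "c": 0.2}
q = {"a": 0.4, "b": 0.4, "c": 0.2}
print(kl_divergence(p, q))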