适合泊松直方图

ROB*_*WNS 21 plot numpy matplotlib scipy

我试图在泊松分布的直方图上拟合曲线,看起来像这样 HISTO

我修改了拟合函数,使其类似于泊松分布,参数t作为变量.但是curve_fit函数无法绘制,我不知道为什么.

def histo(bsize):
    N = bsize
    #binwidth
    bw = (dt.max()-dt.min())/(N-1.)
    bin1 = dt.min()+ bw*np.arange(N)
    #define the array to hold the occurrence count
    bincount= np.array([])
    for bin in bin1:
        count = np.where((dt>=bin)&(dt<bin+bw))[0].size
        bincount = np.append(bincount,count)
    #bin center
    binc = bin1+0.5*bw
    plt.figure()
    plt.plot(binc,bincount,drawstyle= 'steps-mid')
    plt.xlabel("Interval[ticks]")
    plt.ylabel("Frequency")
histo(30)
plt.xlim(0,.5e8)
plt.ylim(0,25000)
import numpy as np
from scipy.optimize import curve_fit
delta_t = 1.42e7
def func(x, t):
    return t * np.exp(- delta_t/t) 
popt, pcov = curve_fit(func, np.arange(0,.5e8),histo(30))
plt.plot(popt)
Run Code Online (Sandbox Code Playgroud)

Max*_*Noe 49

代码的问题在于您不知道返回值curve_fit是什么.它是拟合函数及其协方差矩阵的参数.不是你可以直接绘制的东西.

Binned最小二乘拟合

一般来说,你可以获得更多,更轻松的东西:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.special import factorial


# get poisson deviated random numbers
data = np.random.poisson(2, 1000)

# the bins should be of integer width, because poisson is an integer distribution
entries, bin_edges, patches = plt.hist(data, bins=11, range=[-0.5, 10.5], normed=True)

# calculate binmiddles
bin_middles = 0.5*(bin_edges[1:] + bin_edges[:-1])

# poisson function, parameter lamb is the fit parameter
def poisson(k, lamb):
    return (lamb**k/factorial(k)) * np.exp(-lamb)

# fit with curve_fit
parameters, cov_matrix = curve_fit(poisson, bin_middles, entries) 

# plot poisson-deviation with fitted parameter
x_plot = np.linspace(0, 20, 1000)

plt.plot(x_plot, poisson(x_plot, *parameters), 'r-', lw=2)
plt.show()
Run Code Online (Sandbox Code Playgroud)

这是结果: 泊松适合

未组合的最大似然拟合

更好的可能性是根本不使用直方图,而是进行最大似然拟合.

但是通过仔细研究,即使这是不必要的,因为泊松分布参数的最大似然估计是算术平均.

但是,如果您有其他更复杂的pdf,可以使用它作为示例:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.special import factorial


def poisson(k, lamb):
    """poisson pdf, parameter lamb is the fit parameter"""
    return (lamb**k/factorial(k)) * np.exp(-lamb)


def negLogLikelihood(params, data):
    """ the negative log-Likelohood-Function"""
    lnl = - np.sum(np.log(poisson(data, params[0])))
    return lnl


# get poisson deviated random numbers
data = np.random.poisson(2, 1000)

# minimize the negative log-Likelihood

result = minimize(negLogLikelihood,  # function to minimize
                  x0=np.ones(1),     # start value
                  args=(data,),      # additional arguments for function
                  method='Powell',   # minimization method, see docs
                  )
# result is a scipy optimize result object, the fit parameters 
# are stored in result.x
print(result)

# plot poisson-deviation with fitted parameter
x_plot = np.linspace(0, 20, 1000)

plt.hist(data, bins=np.arange(15) - 0.5, normed=True)
plt.plot(x_plot, poisson(x_plot, result.x), 'r-', lw=2)
plt.show()
Run Code Online (Sandbox Code Playgroud)

  • 我添加了一个未组合的可能性 (9认同)
  • 这非常适合将曲线拟合到数据点,并且从编程的角度来说,这是对所问问题的正确答案。但是,如果您适合泊松数据,从科学/统计角度来说,最好是适合样本本身,而不是直方图! (3认同)