Fitting a Gaussian function

25 python matplotlib histogram curve-fitting scipy

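I have a histogram (see below) and I am trying to find its mean and standard deviation, along with code that fits a curve to the histogram. I think there is something in SciPy or matplotlib that can help, but none of the examples I have tried work.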

import matplotlib.pyplot as plt
import numpy as np

with open('gau_b_g_s.csv') as f:
    v = np.loadtxt(f, delimiter= ',', dtype="float", skiprows=1, usecols=None)

fig, ax = plt.subplots()

plt.hist(v, bins=500, color='#7F38EC', histtype='step')

plt.title("Gaussian")
plt.axis([-1, 2, 0, 20000])

plt.show()

Chr*_*ris 38

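Take a look at this answer on fitting arbitrary curves to data. Basically, you can use scipy.optimize.curve_fit to fit any function you like to your data. The code below shows how to fit a Gaussian to some random data (credit to this SciPy-User mailing list post).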

import numpy
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Define some test data which is close to Gaussian
data = numpy.random.normal(size=10000)

hist, bin_edges = numpy.histogram(data, density=True)
bin_centres = (bin_edges[:-1] + bin_edges[1:])/2

# Define model function to be used to fit to the data above:
def gauss(x, *p):
    A, mu, sigma = p
    return A*numpy.exp(-(x-mu)**2/(2.*sigma**2))

# p0 is the initial guess for the fitting coefficients (A, mu and sigma above)
p0 = [1., 0., 1.]

coeff, var_matrix = curve_fit(gauss, bin_centres, hist, p0=p0)

# Get the fitted curve
hist_fit = gauss(bin_centres, *coeff)

plt.plot(bin_centres, hist, label='Test data')
plt.plot(bin_centres, hist_fit, label='Fitted data')

# Finally, let's get the fitting parameters, i.e. the mean and standard deviation:
print('Fitted mean = ', coeff[1])
print('Fitted standard deviation = ', coeff[2])

plt.show()

  • I suspect @user1496646 meant that in his case there are not many bin_centres, so plotting (bin_centres, hist_fit) gives a poorly sampled Gaussian (a "carrot"). He should evaluate the fit on a finer grid instead: new_bin_centres = numpy.linspace(bin_centres[0], bin_centres[-1], 200), new_hist_fit = gauss(new_bin_centres, *coeff), and then plot(new_bin_centres, new_hist_fit). (3 upvotes)
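The suggestion in the comment above can be sketched roughly as follows. This is only an illustration under a few assumptions: the data are synthetic, the histogram deliberately uses just 10 bins to reproduce the coarse "carrot" situation, and the fixed seed and the explicit (A, mu, sigma) signature are choices made here, not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit


def gauss(x, A, mu, sigma):
    """Gaussian with amplitude A, mean mu and standard deviation sigma."""
    return A * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))


# Synthetic test data (assumption: stands in for the real measurements)
rng = np.random.default_rng(0)
data = rng.normal(size=10000)

# A coarse histogram: only 10 bin centres are available for the fit
hist, bin_edges = np.histogram(data, bins=10, density=True)
bin_centres = (bin_edges[:-1] + bin_edges[1:]) / 2

coeff, var_matrix = curve_fit(gauss, bin_centres, hist, p0=[1.0, 0.0, 1.0])

# Evaluate the fitted model on a much finer grid than the bin centres,
# so the plotted curve looks smooth instead of piecewise linear
new_bin_centres = np.linspace(bin_centres[0], bin_centres[-1], 200)
new_hist_fit = gauss(new_bin_centres, *coeff)
```

Passing new_bin_centres and new_hist_fit to plt.plot then draws the smooth curve, while the fit itself still uses only the original bin centres.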

Nic*_*bey 14

You can try sklearn's Gaussian mixture model estimation as follows:

import numpy as np
from sklearn.mixture import GaussianMixture

# sklearn.mixture.GMM is no longer available in current scikit-learn
# releases; GaussianMixture is its replacement
gmm = GaussianMixture(n_components=1)

# sample data
a = np.random.randn(1000)

# result
r = gmm.fit(a[:, np.newaxis])  # fit() expects 2D data of shape (n_samples, n_features)
print("mean : %f, var : %f" % (r.means_[0, 0], r.covariances_[0, 0, 0]))

Reference: http://scikit-learn.org/stable/modules/mixture.html#mixture

Note that with this approach you do not need to estimate the sample distribution with a histogram.
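For a single Gaussian, the same histogram-free idea can also be sketched with scipy.stats.norm.fit, which returns maximum-likelihood estimates of the mean and standard deviation straight from the raw samples. This is a swapped-in alternative, not the method from the answer above, and the synthetic data (mean 0.5, standard deviation 0.2) and the seed are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.stats import norm

# Synthetic samples (assumption: stands in for the values loaded from the CSV)
rng = np.random.default_rng(0)
a = rng.normal(loc=0.5, scale=0.2, size=10000)

# Maximum-likelihood fit of a normal distribution to the raw samples;
# no binning is involved, so the result does not depend on a bin count
mu, sigma = norm.fit(a)

print("mean : %f, std : %f" % (mu, sigma))
```

For the normal distribution these estimates coincide with a.mean() and a.std(), so the fit is mainly convenient when the fitted density is needed afterwards, for example via norm.pdf(x, mu, sigma) to overlay on a histogram drawn with density=True.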