Sim*_*ity 5 statistics matlab distribution histogram
在MATLAB中有没有办法检查直方图分布是单峰还是双峰?
编辑
你认为Hartigan的Dip统计会起作用吗?我尝试将图像传递给它,并获得值0.那是什么意思?
并且,当传递图像时,它是否测试灰度级图像直方图的分布?
谢谢.
这是一个使用Nic Price实施Hartigan's Dip测试来识别单峰分布的脚本.棘手的一点是计算xpdf,这不是概率密度函数,而是一个有序的样本.
p_value假设零假设为真,则获得测试统计量的概率至少与实际观察到的一样极端.在这种情况下,零假设是分布是单峰的.
close all; clear all;
function [x2, n, b] = compute_xpdf(x)
x2 = reshape(x, 1, prod(size(x)));
[n, b] = hist(x2, 40);
% This is definitely not probability density function
x2 = sort(x2);
% downsampling to speed up computations
x2 = interp1 (1:length(x2), x2, 1:1000:length(x2));
end
nboot = 500;
sample_size = [256 256];
% Unimodal
sample2d = normrnd(0.0, 10.0, sample_size);
[xpdf, n, b] = compute_xpdf(sample2d);
[dip, p_value, xlow, xup] = HartigansDipSignifTest(xpdf, nboot);
figure;
subplot(1,2,1);
bar(n, b)
title(sprintf('Probability of unimodal %.2f', p_value))
% Bimodal
sample2d = sign(sample2d) .* (abs(sample2d) .^ 0.5);
[xpdf, n, b] = compute_xpdf(sample2d);
[dip, p_value, xlow, xup] = HartigansDipSignifTest(xpdf, nboot);
subplot(1,2,2);
bar(n, b)
title(sprintf('Probability of unimodal %.2f', p_value))
print -dpng modality.png
Run Code Online (Sandbox Code Playgroud)

有很多不同的方法可以满足您的要求。从最字面意义上讲,“双峰”意味着有两个峰。但通常情况下,您希望“两个峰值”分开一定的合理距离,并且您希望它们各自包含总计数的合理比例。只有您知道什么对于您的情况是“合理的”,但以下方法可能会有所帮助。
\n\ncumsum您必须决定该数量的大小对您来说代表“双峰”。这是一些代码,演示了我正在谈论的内容。它生成不同严重程度的双峰分布 - 两个高斯分布,它们之间的增量不断增加(步长 = 标准差的大小)。我计算上述数量,并将其绘制为一系列不同值delta。然后,我在对应于整个分布的 ± 1 sigma 的范围内通过该曲线拟合抛物线。正如您所看到的,当分布变得更加双峰时,会发生两件事:
您可以查看您自己的某些分布的这些数量,并决定将截止值放在哪里。
\n\n% test for bimodal distribution\nclose all\nfor delta = 0:10:50\n a1 = randn(100,100) * 10 + 25;\n a2 = randn(100,100) * 10 + 25 + delta;\n a3 = [a1(:); a2(:)];\n [h hb] = hist(a3, 0:100);\n cs = cumsum(h);\n llimi = find(cs < 0.2 * max(cs(:)));\n ulimi = find(cs > 0.8 * max(cs(:)));\n llim = hb(llimi(end));\n ulim = hb(ulimi(1));\n cuts = linspace(llim, ulim, 20);\n dmean = mean(a3);\n dstd = std(a3);\n for ci = 1:numel(cuts)\n d1 = a3(a3<cuts(ci));\n d2 = a3(a3>=cuts(ci));\n m(ci,1) = mean(d1);\n m(ci, 2) = mean(d2);\n s(ci, 1) = std(d1);\n s(ci, 2) = std(d2);\n end\n q = (m(:, 2) - m(:, 1)) ./ sum(s, 2);\n figure; \n plot(cuts, q);\n title(sprintf(\'delta = %d\', delta))\n % compute curvature of plot around mean:\n xlims = dmean + [-1 1] * dstd;\n indx = find(cuts < xlims(2) && cuts > xlims(1));\n pf = polyfit(cuts(indx), q(indx), 2);\n m = polyval(pf, dmean);\n fprintf(1, \'coefficients: a = %.2e, peak = %.2f\\n\', pf(1), m);\nend\nRun Code Online (Sandbox Code Playgroud)\n\n输出值:
\n\ncoefficients: a = 1.37e-03, peak = 1.32\ncoefficients: a = 1.01e-03, peak = 1.34\ncoefficients: a = 2.85e-04, peak = 1.45\ncoefficients: a = -5.78e-04, peak = 1.70\ncoefficients: a = -1.29e-03, peak = 2.08\ncoefficients: a = -1.58e-03, peak = 2.48\nRun Code Online (Sandbox Code Playgroud)\n\n示例图:
\n\n

以及 delta = 40 的直方图:
\n\n