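Suppose I roll a six-sided die 60 times and get 16, 5, 9, 7, 6, 17 for the numbers 1 through 6 respectively. The numbers 1 and 6 show up too often, and there is only about a 1.8% chance of that happening at random. If I use Statistics::ChiSquare, it prints: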
There's a >1% chance, and a <5% chance, that this data is random.
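So not only is it a poor interface (I can't get those numbers out directly), but the rounding error is large.
Worse, what if I'm rolling two six-sided dice? The odds of getting any particular sum are: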
    Sum    Frequency    Relative Frequency
    2      1            1/36
    3      2            2/36
    4      3            3/36
    5      4            4/36
    6      5            5/36
    7      6            6/36
    8      5            5/36
    9      4            4/36
    10     3            3/36
    11     2            2/36
    12     1            1/36
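The frequencies in this table can be reproduced by enumerating all 36 equally likely outcomes of the two dice. The following is a minimal sketch using core Perl only; the sample size of 360 rolls is an assumption chosen for illustration (it keeps the expected counts whole), and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Count how many of the 36 equally likely (first, second) outcomes give each sum.
my %ways;
for my $first (1 .. 6) {
    for my $second (1 .. 6) {
        $ways{ $first + $second }++;
    }
}

# Assumed sample size for illustration only.
my $rolls = 360;

# Expected count for each sum 2..12 is rolls * (ways / 36).
my @expected = map { $rolls * $ways{$_} / 36 } 2 .. 12;

# Print sum, relative frequency, and expected count.
printf "%2d  %d/36  %5.1f\n", $_, $ways{$_}, $rolls * $ways{$_} / 36 for 2 .. 12;
```

A list like `@expected` is the kind of "expected frequencies" input that a non-uniform chi-squared test would need.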
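Statistics::ChiSquare used to have a chisquare_nonuniform() function, but it was removed.
So the numbers are imprecise and I can't use it for a non-uniform distribution. Given a list of actual frequencies and a list of expected frequencies, what is the best way to compute a chi-squared test in Perl? None of the modules I found on CPAN helped, so I'm guessing I've missed something obvious.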
amo*_*mon 15
This is so trivial to implement yourself that I don't want to upload Yet Another Statistics Module just for this.
    use strict;
    use warnings;
    use feature qw< say >;    # say() is not enabled by default

    use Carp qw< croak >;
    use List::Util qw< sum >;
    use Statistics::Distributions qw< chisqrprob >;

    # Returns the probability of a chi-squared value at least this large
    # arising by chance, given observed and expected frequency lists.
    sub chi_squared_test {
        my %args = @_;
        my $observed = delete $args{observed} // croak q(Argument "observed" required);
        my $expected = delete $args{expected} // croak q(Argument "expected" required);
        @$observed == @$expected or croak q(Input arrays must have same length);

        # Pearson's statistic: sum of (observed - expected)^2 / expected
        my $chi_squared = sum map {
            ($observed->[$_] - $expected->[$_])**2 / $expected->[$_];
        } 0 .. $#$observed;

        my $degrees_of_freedom = @$observed - 1;
        my $probability = chisqrprob($degrees_of_freedom, $chi_squared);
        return $probability;
    }

    say chi_squared_test(
        observed => [16, 5, 9, 7, 6, 17],
        expected => [(10) x 6],
    );
Output: 0.018360
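The same calculation also covers the non-uniform case from the question, since only the expected list changes. Below is a rough sketch for the two-dice sums that avoids any CPAN dependency. The helper names `chi_squared_statistic` and `chi_squared_upper_tail_even_df` are hypothetical, the observed counts are made up for illustration, and the closed-form tail probability used here is valid only when the degrees of freedom are even (10 in this case). For the general case, `chisqrprob` from Statistics::Distributions, as used above, is the safer choice.

```perl
use strict;
use warnings;
use List::Util qw< sum >;

# Pearson's statistic: sum over categories of (observed - expected)^2 / expected.
sub chi_squared_statistic {
    my ($observed, $expected) = @_;
    @$observed == @$expected or die "Input arrays must have same length";
    return sum map {
        ($observed->[$_] - $expected->[$_])**2 / $expected->[$_]
    } 0 .. $#$observed;
}

# Upper-tail probability for an EVEN number of degrees of freedom only:
#   Q = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
sub chi_squared_upper_tail_even_df {
    my ($df, $chi_squared) = @_;
    $df > 0 && $df % 2 == 0 or die "This shortcut only handles even df";
    my $half = $chi_squared / 2;
    my ($term, $total) = (1, 1);
    for my $k (1 .. $df / 2 - 1) {
        $term *= $half / $k;    # builds (x/2)^k / k! incrementally
        $total += $term;
    }
    return exp(-$half) * $total;
}

# Expected counts for the sums 2..12 over an assumed 360 rolls (n/36 each).
my $rolls    = 360;
my @expected = map { $_ * $rolls / 36 } 1 .. 6, reverse 1 .. 5;

# Made-up observed counts for illustration; they also total 360.
my @observed = (8, 22, 27, 45, 48, 66, 47, 41, 28, 17, 11);

my $chi_squared = chi_squared_statistic(\@observed, \@expected);
my $df          = @observed - 1;
printf "chi-squared = %.4f, df = %d, p = %.4f\n",
    $chi_squared, $df, chi_squared_upper_tail_even_df($df, $chi_squared);
```

As a sanity check, feeding the single-die data from above into `chi_squared_statistic` gives (36 + 25 + 1 + 9 + 16 + 49) / 10 = 13.6 with 5 degrees of freedom, which is the statistic behind the 0.018360 result.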