Ran*_*ter 2 javascript testing statistics google-bigquery
我试图使用JavaScript statistics.js库在bigquery中运行配对t检验,我得到的数字不一致.
我的代码如下;
CREATE TEMPORARY FUNCTION ttest(a ARRAY<FLOAT64>, b ARRAY<FLOAT64>)
RETURNS FLOAT64
LANGUAGE js AS
"""
return statistical.methods.tTestTwoSample(a, b);
//statistical.methods.tTestTwoSample(sample1: Array, sample2: Array)
"""
OPTIONS (
library="gs://my_bucket/statistical.js"
);
SELECT
ttest([9.96, 3.76, 1.17, 8.66, 5.25, 7.61, 5.8, 1.84, 7.06, 9.4, 2.99, 9.3, 9.01, 4.24, 3.52, 9.6, 7.59, 6.99, 9.62, 2.18] ,
[3.96, 5.76, 7.17, 7.66, 9.25, 3.61, 4.8, 8.84, 6.06, 4.4, 1.99, 8.3, 9.01, 1.24, 5.52, 8.6, 5.59, 1.99, 7.62, 3.18] ) ttest
Run Code Online (Sandbox Code Playgroud)
随着gs://my_bucket/statistical.jsJS包保存在谷歌存储中的位置.
这个包可以在这里找到 https://github.com/adrien2p/statistical-js
两个样本t检验statistical.methods.tTestTwoSample(sample1: Array, sample2: Array)应返回p值.我无法确定测试是否配对或其他方式.当我运行代码时,我得到4.099,据我所知,这不是一个p值.我已经在Excel中运行测试以验证结果使用,=T.TEST()并且在配对的双面测试中得到p值为0.4871.我还尝试了除配对之外的其他t检验,并尝试将4.099结果视为T统计,而没有数字匹配的运气.
所以我的问题是,如何在BigQuery中运行配对t检验得到的p值?我应该使用其他套餐吗?
在此先感谢您的帮助.
没有太多工作,我决定手动计算t统计量并使用jStat库(可用github.com/jstat/jstat)将统计量转换为p值.
我的代码如下;
CREATE TEMPORARY FUNCTION tscore_to_p(a FLOAT64, b FLOAT64, c FLOAT64)
RETURNS FLOAT64
LANGUAGE js AS
"""
return jStat.ttest(a,b,c); //jStat.ttest( tscore, n, sides)
"""
OPTIONS (
library="gs://my_bucket/jstat.min.js"
);
WITH data AS (
SELECT * FROM
(SELECT 9.96 AS A, 3.96 AS B) UNION ALL
(SELECT 3.76 AS A, 5.76 AS B) UNION ALL
(SELECT 1.17 AS A, 7.17 AS B) UNION ALL
(SELECT 8.66 AS A, 7.66 AS B) UNION ALL
(SELECT 5.25 AS A, 9.25 AS B) UNION ALL
(SELECT 7.61 AS A, 3.61 AS B) UNION ALL
(SELECT 5.80 AS A, 4.80 AS B) UNION ALL
(SELECT 1.84 AS A, 8.84 AS B) UNION ALL
(SELECT 7.06 AS A, 6.06 AS B) UNION ALL
(SELECT 9.40 AS A, 4.40 AS B) UNION ALL
(SELECT 2.99 AS A, 1.99 AS B) UNION ALL
(SELECT 9.30 AS A, 8.30 AS B) UNION ALL
(SELECT 9.01 AS A, 9.01 AS B) UNION ALL
(SELECT 4.24 AS A, 1.24 AS B) UNION ALL
(SELECT 3.52 AS A, 5.52 AS B) UNION ALL
(SELECT 9.60 AS A, 8.60 AS B) UNION ALL
(SELECT 7.59 AS A, 5.59 AS B) UNION ALL
(SELECT 6.99 AS A, 1.99 AS B) UNION ALL
(SELECT 9.62 AS A, 7.62 AS B) UNION ALL
(SELECT 2.18 AS A, 3.18 AS B)
)
SELECT
COUNT(*) n
, COUNT(*)-1 dof
, AVG(difference) mean
, STDDEV_SAMP(difference) SD
, STDDEV_SAMP(difference)/SQRT(COUNT(*)) SE
, AVG(difference)/ (STDDEV_SAMP(difference)/SQRT(COUNT(*))) t
, tscore_to_p((AVG(difference)/ (STDDEV_SAMP(difference)/SQRT(COUNT(*)))) COUNT(*), 2) p_value
FROM
(SELECT *, (A-B) difference FROM data)
Run Code Online (Sandbox Code Playgroud)
这导致p值为0.4871,与在Excel中运行的配对t检验的p值相同.
感谢任何考虑过这个问题的人.
| 归档时间: |
|
| 查看次数: |
613 次 |
| 最近记录: |