ali*_*rat 2 amazon-web-services presto amazon-athena
I am using AWS Athena to compute some metrics. I have a dataset like this:
sessionnumber
0
10
-1
10
2
-10
10
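I am trying to compute percentiles of this value, but only over the subset of valid values. A value is valid when sessionnumber >= 1, so I tried the following: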
WITH testfun AS
(SELECT filter(array_agg(sessionnumber), x -> x >= 1) AS validvalues
FROM "mydate")
SELECT approx_percentile(validvalues, 0.25) FROM testfun
But I get the following error:
SYNTAX_ERROR: line 17:10: Unexpected parameters (array(integer), double) for function approx_percentile. Expected: approx_percentile(bigint, double) , approx_percentile(bigint, bigint, double) , approx_percentile(bigint, bigint, double, double) , approx_percentile(bigint, array(double)) , approx_percentile(bigint, bigint, array(double)) , approx_percentile(double, double) , approx_percentile(double, bigint, double, double) , approx_percentile(double, bigint, double) , approx_percentile(double, array(double)) , approx_percentile(double, bigint, array(double)) , approx_percentile(real, double) , approx_percentile(real, bigint, double, double) , approx_percentile(real, bigint, double) , approx_percentile(real, array(double)) , approx_percentile(real, bigint, array(double))
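I understand my mistake (approx_percentile is an aggregate over rows, not a function over an array), but I cannot find a way to fix it in AWS Athena / PrestoDB. Is this possible?
I found a workaround and am sharing it here: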
WITH validValues AS
-- Filter the rows first, then aggregate: approx_percentile works on rows, not arrays.
(SELECT approx_percentile(sessionnumber, ARRAY[0.25, 0.50, 0.75, 0.95, 0.99]) AS percentiles
FROM (SELECT sessionnumber FROM "20180407" WHERE sessionnumber >= 1))
SELECT percentiles FROM validValues
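For comparison, here are two other ways the same filter-then-aggregate idea might be written. This is only a sketch that I have not run against this dataset. It assumes the table is the "mydate" table from the question, that the engine accepts Presto's aggregate FILTER clause, and that UNNEST can turn the array back into rows. The aliases p25, t and x are made up for illustration.

```sql
-- Option 1 (assumes the aggregate FILTER clause is available):
-- only rows matching the condition are fed to approx_percentile.
SELECT approx_percentile(sessionnumber, 0.25)
         FILTER (WHERE sessionnumber >= 1) AS p25
FROM "mydate";

-- Option 2 (keeps the array from the question):
-- UNNEST expands the filtered array back into rows, one value per row,
-- so the aggregate receives integers instead of array(integer).
WITH testfun AS (
  SELECT filter(array_agg(sessionnumber), x -> x >= 1) AS validvalues
  FROM "mydate"
)
SELECT approx_percentile(x, 0.25) AS p25
FROM testfun
CROSS JOIN UNNEST(validvalues) AS t(x);
```

Athena runs one statement per query, so each option would be submitted separately. Option 1 avoids building an array at all, which is likely the simpler choice when the filter is a plain row predicate.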