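In my CENSUS table, I want to group by state and, for each state, get the median county population and the number of counties.
In psql, Redshift, and Snowflake, I can do this: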
psql=> SELECT state, count(county), PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "population2000") AS median FROM CENSUS GROUP BY state;
state | count | median
----------------------+-------+----------
Alabama | 67 | 36583
Alaska | 24 | 7296.5
Arizona | 15 | 116320
Arkansas | 75 | 20229
...
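I'm trying to find a nice way to do this in standard BigQuery. I noticed that an undocumented percentile_cont analytic function is available, but I have to do some major hacks to get it to do what I want.
I'd like to be able to do the same thing, using what I've gathered are the correct arguments: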
SELECT
  state,
  COUNT(county),
  PERCENTILE_CONT(population2000, 0.5) OVER () AS `medPop`
FROM
  CENSUS
GROUP BY
  state;
But this query produces the error
SELECT list expression references column population2000 which is neither grouped nor aggregated at
I can get the answer I want, but I'd be very disappointed if this were the recommended way to do it:
SELECT
  MAX(nCounties) AS nCounties,
  state,
  MAX(medPop) AS medPop
FROM (
  SELECT
    nCounties,
    T1.state,
    (PERCENTILE_CONT(population2000, 0.5) OVER (PARTITION BY T1.state)) AS `medPop`
  FROM
    census T1
  LEFT OUTER JOIN (
    SELECT
      COUNT(county) AS `nCounties`,
      state
    FROM
      census
    GROUP BY
      state) T2
  ON
    T1.state = T2.state) T3
GROUP BY
  state
Is there a better way to do what I want? Also, will the PERCENTILE_CONT function ever be documented?
Thanks for reading!
Min*_*ong 17
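Thanks for your interest. PERCENTILE_CONT is under development, and we will publish the documentation once it reaches GA. We are supporting it as an analytic function first, and we plan to support it as an aggregate function (which allows GROUP BY) later. Between those two releases, a simpler workaround is available: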
SELECT
  state,
  ANY_VALUE(nCounties) AS nCounties,
  ANY_VALUE(medPop) AS medPop
FROM (
  SELECT
    state,
    COUNT(county) OVER (PARTITION BY state) AS nCounties,
    PERCENTILE_CONT(population2000, 0.5) OVER (PARTITION BY state) AS medPop
  FROM
    CENSUS)
GROUP BY
  state
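To sanity-check the result of a query like this on a small sample, here is a minimal Python sketch of what it should return per state: the county count and the continuous (linearly interpolated) median, which is why a state with an even number of counties can show a value such as 7296.5. This is not BigQuery code. The helper names and sample rows are made up for illustration, and the sketch assumes there are no NULL values in the input.

```python
from collections import defaultdict


def percentile_cont(values, fraction):
    """Continuous percentile: linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    if not ordered:
        return None
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def summarize_by_state(rows):
    """Return {state: (county_count, median_population)} for (state, county, population) rows."""
    populations = defaultdict(list)
    for state, _county, population in rows:
        populations[state].append(population)
    return {
        state: (len(values), percentile_cont(values, 0.5))
        for state, values in populations.items()
    }


# Hypothetical sample data, not taken from the real CENSUS table.
sample_rows = [
    ("A", "county1", 10),
    ("A", "county2", 20),
    ("A", "county3", 40),
    ("B", "county4", 5),
    ("B", "county5", 8),
]

result = summarize_by_state(sample_rows)
for state, (n_counties, median) in sorted(result.items()):
    print(state, n_counties, median)
```

State B has two counties, so its median falls halfway between the two populations, the same way an even-sized group behaves in the psql output in the question.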