719*_*016 -1 sql group-by google-bigquery
I have a table in BigQuery with age and sex fields, and I can group by them like this:
bq query --max_rows=9999 --format=csv --use_legacy_sql=false 'SELECT COUNT(*) AS COUNT, age, sex FROM `project.dataset.table` GROUP BY age, sex ORDER BY age, sex' 2>/dev/null | head -n 11 | csvtk pretty
COUNT age sex
143 50.0 Female
77 50.0 Male
28 51.0 Female
78 51.0 Male
30 52.0 Female
22 52.0 Male
79 53.0 Female
81 53.0 Male
111 54.0 Female
[...]
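I would like to group by specific age ranges: 50-59, 60-69, 70-79, and 80 or older.
How can I change the query above so that it groups by these age ranges?
Also, as a slight complication, my sex field can contain F or Female, and M or Male. How can I merge the two spellings into one when grouping?
Edit: I was thinking the output could look like this: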
COUNT,agegroup,sex
10,50-59,Female
[...]
You can use a CASE expression:
-- Each range has an explicit lower bound, so ages under 50 are not counted as '60-69'.
SELECT (CASE WHEN age >= 50 AND age < 60 THEN '50-59'
             WHEN age >= 60 AND age < 70 THEN '60-69'
             WHEN age >= 70 AND age < 80 THEN '70-79'
             WHEN age >= 80 THEN '80+'
        END) AS agegrp,
       sex,
       COUNT(*) AS nt
FROM `project.dataset.table`
GROUP BY agegrp, sex
ORDER BY sex, MIN(age);
If you also need to group F with Female and M with Male, and need to cast age to FLOAT64, the full answer would be:
-- Each range has an explicit lower bound, so ages under 50 are not counted as "60-69".
SELECT (CASE WHEN CAST(age AS FLOAT64) >= 50 AND CAST(age AS FLOAT64) < 60 THEN "50-59"
             WHEN CAST(age AS FLOAT64) >= 60 AND CAST(age AS FLOAT64) < 70 THEN "60-69"
             WHEN CAST(age AS FLOAT64) >= 70 AND CAST(age AS FLOAT64) < 80 THEN "70-79"
             WHEN CAST(age AS FLOAT64) >= 80 THEN "80+"
        END) AS agegrp,
       -- Collapse both spellings of each sex into a single label.
       (CASE WHEN sex IN ("F", "Female") THEN "F"
             WHEN sex IN ("M", "Male") THEN "M"
        END) AS sexgrp,
       COUNT(*) AS nt
FROM `project.dataset.table`
GROUP BY agegrp, sexgrp
ORDER BY sexgrp, MIN(CAST(age AS FLOAT64))
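If you want to sanity-check the bucketing logic without a BigQuery project, here is a minimal sketch that runs the same kind of CASE-based grouping against an in-memory SQLite table from Python. SQLite is only a stand-in here, not BigQuery, so dialect details may differ. The table name `people`, the sample rows, and the helper `grouped_counts` are all made up for illustration. Every range in this sketch has an explicit lower bound, so an age under 50 lands in a NULL group instead of being counted in one of the named ranges.

```python
import sqlite3

# Hypothetical sample rows (age, sex); not data from the original table.
ROWS = [
    (49.0, "F"),
    (50.0, "Female"),
    (55.0, "F"),
    (59.0, "Male"),
    (60.0, "M"),
    (69.0, "Male"),
    (70.0, "Female"),
    (85.0, "F"),
    (90.0, "Male"),
]

# Same CASE-based grouping idea as the answer, written for SQLite.
QUERY = """
SELECT CASE WHEN age >= 50 AND age < 60 THEN '50-59'
            WHEN age >= 60 AND age < 70 THEN '60-69'
            WHEN age >= 70 AND age < 80 THEN '70-79'
            WHEN age >= 80 THEN '80+'
       END AS agegrp,
       CASE WHEN sex IN ('F', 'Female') THEN 'F'
            WHEN sex IN ('M', 'Male') THEN 'M'
       END AS sexgrp,
       COUNT(*) AS nt
FROM people
GROUP BY agegrp, sexgrp
ORDER BY sexgrp, MIN(age)
"""


def grouped_counts(rows):
    """Load rows into an in-memory table and return (agegrp, sexgrp, nt) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (age REAL, sex TEXT)")
        conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for agegrp, sexgrp, nt in grouped_counts(ROWS):
        print(agegrp, sexgrp, nt)
```

If you would rather drop the under-50 rows than see a NULL group, adding a `WHERE age >= 50` filter before the `GROUP BY` should do it.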