mdi*_*ivk 9 bucket google-bigquery
这是我在 BigQuery 中对公共数据集的查询:
SELECT RANGE_BUCKET(reputation, [400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000]) AS reputation_group, COUNT(*) AS count
FROM `bigquery-public-data.stackoverflow.users`
Where reputation > 200000
GROUP BY 1
ORDER By 1
Run Code Online (Sandbox Code Playgroud)
结果如下:
我如何显示存储桶的范围,而不是将信誉组显示为整数:
0: [0-400000]
1: [400001-500000]
2: [500001-600000]
....
Run Code Online (Sandbox Code Playgroud)
非常感谢。
更新: 非常感谢米哈伊尔的回答,下面有一个小改动:
SELECT bucket,
FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group,
COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([200000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket
Run Code Online (Sandbox Code Playgroud)
请注意,在 STRUCT 中添加了一个额外的项目 200000,这使得结果显示
200001 - 400000而不是0 - 400000
下面是 BigQuery 标准 SQL
#standardSQL
SELECT bucket,
FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group,
COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket
Run Code Online (Sandbox Code Playgroud)
结果
Row bucket reputation_group COUNT
1 0 0 - 400000 198
2 1 400001 - 500000 23
3 2 500001 - 600000 13
4 3 600001 - 700000 12
5 4 700001 - 800000 4
6 5 800001 - 900000 5
7 6 900001 - 1000000 2
8 8 1100001 - 1200000 1
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
677 次 |
| 最近记录: |