Ali*_*Deg 7 sql google-bigquery
不幸的是,在BQ中重塑它并不像在R中那么容易,我无法导出这个项目的数据.
这是输入
date country A B C D
20170928 CH 3000.3 121 13 3200
20170929 CH 2800.31 137 23 1614.31
Run Code Online (Sandbox Code Playgroud)
预期产出
date country Metric Value
20170928 CH A 3000.3
20170928 CH B 121
20170928 CH C 13
20170928 CH D 3200
20170929 CH A 2800.31
20170929 CH B 137
20170929 CH C 23
20170929 CH D 1614.31
Run Code Online (Sandbox Code Playgroud)
我的表还有更多的列和行(但我假设需要很多手册)
以下是适用于BigQuery标准SQL的内容,不需要重复选择,具体取决于列数。它会选择您所拥有的数量并将其转换为指标和值
#standardSQL
SELECT DATE, country,
metric, SAFE_CAST(value AS FLOAT64) value
FROM (
SELECT DATE, country,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(0)], r'^"|"$', '') metric,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(1)], r'^"|"$', '') value
FROM `project.dataset.yourtable` t,
UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'{|}', ''))) pair
)
WHERE NOT LOWER(metric) IN ('date', 'country')
Run Code Online (Sandbox Code Playgroud)
您可以使用虚拟数据在上面的问题中进行测试/操作
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT '20170928' DATE, 'CH' country, 3000.3 A, 121 B, 13 C, 3200 D UNION ALL
SELECT '20170929', 'CH', 2800.31, 137, 23, 1614.31
)
SELECT DATE, country,
metric, SAFE_CAST(value AS FLOAT64) value
FROM (
SELECT DATE, country,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(0)], r'^"|"$', '') metric,
REGEXP_REPLACE(SPLIT(pair, ':')[OFFSET(1)], r'^"|"$', '') value
FROM `project.dataset.yourtable` t,
UNNEST(SPLIT(REGEXP_REPLACE(to_json_string(t), r'{|}', ''))) pair
)
WHERE NOT LOWER(metric) IN ('date', 'country')
Run Code Online (Sandbox Code Playgroud)
结果如预期
DATE country metric value
20170928 CH A 3000.3
20170928 CH B 121.0
20170928 CH C 13.0
20170928 CH D 3200.0
20170929 CH A 2800.31
20170929 CH B 137.0
20170929 CH C 23.0
20170929 CH D 1614.31
Run Code Online (Sandbox Code Playgroud)