Tho*_*ain 1 transpose google-bigquery
早上好,
我正在尝试在大查询中转置一些数据。我看过其他一些人在 stackoverflow 上问过这个问题,但这样做的方法似乎是使用旧版 sql(使用 group_concat_unquoted)而不是标准 sql。我会使用旧版本,但过去我遇到过嵌套数据问题,因此此后仅使用标准数据。
这是我的例子,为了提供一些背景信息,我试图绘制出一些客户旅程,如下所示:
uniqueid | page_flag | order_of_pages
A | Collection| 1
A | Product | 2
A | Product | 3
A | Login | 4
A | Delivery | 5
B | Clearance | 1
B | Search | 2
B | Product | 3
C | Search | 1
C | Collection| 2
C | Product | 3
Run Code Online (Sandbox Code Playgroud)
不过我想转置数据,使其看起来像这样:
uniqueid | 1 | 2 | 3 | 4 | 5
A | Collection | Product | Product | Login | Delivery
B | Clearance | Search | Product | NULL | NULL
C | Search | Collection | Product | NULL | NULL
Run Code Online (Sandbox Code Playgroud)
我尝试使用多个左连接,但出现以下错误:
select a.uniqueid,
b.page_flag as page1,
c.page_flag as page2,
d.page_flag as page3,
e.page_flag as page4,
f.page_flag as page5
from
(select distinct uniqueid,
(case when uniqueid is not null then 1 end) as page_hit1,
(case when uniqueid is not null then 2 end) as page_hit2,
(case when uniqueid is not null then 3 end) as page_hit3,
(case when uniqueid is not null then 4 end) as page_hit4,
(case when uniqueid is not null then 5 end) as page_hit5
from `mytable`) a
LEFT JOIN (
SELECT *
from `mytable`) b on a.uniqueid = b.uniqueid
and a.page_hit1 = b.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) c on a.uniqueid = c.uniqueid
and a.page_hit2 = c.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) d on a.uniqueid = d.uniqueid
and a.page_hit3 = d.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) e on a.uniqueid = e.uniqueid
and a.page_hit4 = e.order_of_pages
LEFT JOIN (
SELECT *
from `mytable`) f on a.uniqueid = f.uniqueid
and a.page_hit5 = f.order_of_pages
Error: Query exceeded resource limits for tier 1. Tier 13 or higher required.
Run Code Online (Sandbox Code Playgroud)
我也研究过使用数组函数,但我以前从未使用过它,并且我不确定这是否只是为了相反的转置。任何建议都会很棒。
谢谢
对于 BigQuery 标准 SQL
#standardSQL
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
Run Code Online (Sandbox Code Playgroud)
您可以使用问题中的以下虚拟数据进行播放/测试
#standardSQL
WITH `mytable` AS (
SELECT 'A' AS uniqueid, 'Collection' AS page_flag, 1 AS order_of_pages UNION ALL
SELECT 'A', 'Product', 2 UNION ALL
SELECT 'A', 'Product', 3 UNION ALL
SELECT 'A', 'Login', 4 UNION ALL
SELECT 'A', 'Delivery', 5 UNION ALL
SELECT 'B', 'Clearance', 1 UNION ALL
SELECT 'B', 'Search', 2 UNION ALL
SELECT 'B', 'Product', 3 UNION ALL
SELECT 'C', 'Search', 1 UNION ALL
SELECT 'C', 'Collection', 2 UNION ALL
SELECT 'C', 'Product', 3
)
SELECT
uniqueid,
MAX(IF(order_of_pages = 1, page_flag, NULL)) AS p1,
MAX(IF(order_of_pages = 2, page_flag, NULL)) AS p2,
MAX(IF(order_of_pages = 3, page_flag, NULL)) AS p3,
MAX(IF(order_of_pages = 4, page_flag, NULL)) AS p4,
MAX(IF(order_of_pages = 5, page_flag, NULL)) AS p5
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
Run Code Online (Sandbox Code Playgroud)
结果是
uniqueid p1 p2 p3 p4 p5
A Collection Product Product Login Delivery
B Clearance Search Product null null
C Search Collection Product null null
Run Code Online (Sandbox Code Playgroud)
根据您的需求,您也可以考虑以下方法(尽管不是枢轴)
#standardSQL
SELECT uniqueid,
STRING_AGG(page_flag, '>' ORDER BY order_of_pages) AS journey
FROM `mytable`
GROUP BY uniqueid
ORDER BY uniqueid
Run Code Online (Sandbox Code Playgroud)
如果使用与上面相同的虚拟数据运行 - 结果是
uniqueid journey
A Collection>Product>Product>Login>Delivery
B Clearance>Search>Product
C Search>Collection>Product
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
11150 次 |
| 最近记录: |