How do I create channel paths with BigQuery?

gvk*_*eef 2 google-analytics google-bigquery

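I want to build a channel path for each user ID, based on the order in which the user visited the website, and then sum the total transactions for each path. The idea is to do this in BigQuery.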

I have the following input table:

           user id - date       - hits.time - channelgrouping - transaction
           xxxxxxx - 2017-01-01 - 23234     - google cpc      - 1          
           xxxxxxx - 2017-01-02 - 23234     - email           - 0           

The output table I want is:

           user id - channelgrouping path - transaction
           xxxxxxx - google cpc > email   - 1

Can anyone help me get started by providing code that builds the path?

Thanks in advance!

Mik*_*ant 5

See the example below for a starting point:

#standardSQL
WITH yourTable AS (
  SELECT 1 AS user_id, '2017-01-01' AS DATE, 'google cpc' AS channelgrouping, 1 AS transaction UNION ALL
  SELECT 1, '2017-01-02', 'email', 0 UNION ALL
  SELECT 2, '2017-01-01', 'abc', 2 UNION ALL
  SELECT 2, '2017-01-02', 'xyz', 3 
)
SELECT 
  user_id, 
  STRING_AGG(channelgrouping, ' > ') AS channelgrouping_path,
  SUM(transaction) AS transaction
FROM yourTable
GROUP BY user_id
-- ORDER BY USER_ID  

The output is as follows:

user_id channelgrouping_path    transaction  
1       google cpc > email      1    
2       abc > xyz               5    

An example based on your exact query:

#standardSQL
SELECT
  visitorId
  ,STRING_AGG(channelgrouping, ' > ') AS channelgrouping_path
  ,SUM(transactions) AS transaction
FROM (
  SELECT 
    date
    ,visitorId
    ,channelgrouping
    ,SUM(totals.transactions) AS transactions
  FROM `project.dataset.table`
  GROUP BY
    date
    ,visitorId
    ,channelGrouping
)
GROUP BY visitorId  

Make sure to replace project.dataset.table with your own table.

I would have to sort the dataset by date and hits.time, which is very heavy to execute.

See this example of how to control the order within the aggregated string:

STRING_AGG(channelgrouping, ' > ' ORDER BY date) AS channelgrouping_path
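
Putting the pieces together, here is a minimal sketch of one way to go from per-user paths to totals per path, which is what the question ultimately asks for. It is an illustration under assumptions, not a tested production query: the hit_time column stands in for hits.time (in a real Google Analytics export, hits is a repeated field and would need to be unnested or replaced by a session-level timestamp first), and the sample rows and the user_paths name are made up for the example.

```sql
#standardSQL
WITH yourTable AS (
  -- Made-up sample rows; hit_time is a hypothetical stand-in for hits.time
  SELECT 1 AS user_id, '2017-01-01' AS date, 23234 AS hit_time, 'google cpc' AS channelgrouping, 1 AS transactions UNION ALL
  SELECT 1, '2017-01-02', 23234, 'email', 0 UNION ALL
  SELECT 2, '2017-01-01', 100, 'google cpc', 2 UNION ALL
  SELECT 2, '2017-01-02', 200, 'email', 3 UNION ALL
  SELECT 3, '2017-01-01', 50, 'email', 0
),
user_paths AS (
  -- Step 1: one row per user, with channels concatenated in visit order
  SELECT
    user_id,
    STRING_AGG(channelgrouping, ' > ' ORDER BY date, hit_time) AS channelgrouping_path,
    SUM(transactions) AS transactions
  FROM yourTable
  GROUP BY user_id
)
-- Step 2: one row per distinct path, with user count and total transactions
SELECT
  channelgrouping_path,
  COUNT(*) AS users,
  SUM(transactions) AS transactions
FROM user_paths
GROUP BY channelgrouping_path
ORDER BY transactions DESC
```

With the sample rows above, this should return the path google cpc > email with 2 users and 6 transactions, followed by email with 1 user and 0 transactions. As for the performance concern, the ORDER BY inside STRING_AGG only orders the rows within each user's group, so it should not require a global sort of the whole table; whether that is fast enough on your data is something to verify.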