Joining 2 tables: counting impressions before a conversion

Edg*_*aya 1 sql google-bigquery

I don't actually know how to write a query like this. I have 2 tables in Google BigQuery:

First table (impressions):

+-----------+--------+------------+-------+
| Timestamp | UserID | Event_Type | Count |
+-----------+--------+------------+-------+
|       100 |    111 | impression |     2 |
|       105 |    111 | impression |     1 |
|       110 |    111 | impression |     1 |
|       120 |    111 | impression |     2 |
|       100 |    222 | impression |     1 |
|       105 |    222 | impression |     1 |
|       110 |    222 | impression |     1 |
|       120 |    222 | impression |     1 |
+-----------+--------+------------+-------+ 

Second table (conversions):

+-----------+--------+------------+-------+
| Timestamp | UserID | Event_Type | Count |
+-----------+--------+------------+-------+
|       115 |    111 | conversion |     1 |
|       117 |    222 | conversion |     1 |
+-----------+--------+------------+-------+ 
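
What I want to get is the number of impressions each user needed before converting, so I need to count all impressions that occurred before the conversion (by timestamp, which is actually in Unix format).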

+--------+--------------------+
| UserID | Impressions Needed |
+--------+--------------------+
|    111 |                  4 |
|    222 |                  3 |
+--------+--------------------+
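
I can join these tables by UserID and get the total number of impressions and conversions, and I can union them and sort by UserID and Timestamp, but I don't know how to get from there to the final answer, so unfortunately I have nothing to show here. I hope there is a way to do this and that someone here can help me.

The answer (standard SQL):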

SELECT
  t2.User_ID,
  COUNT(t1.User_ID) AS ImpressionsNeeded
FROM (
  -- First conversion per user for the given advertiser and campaign
  SELECT
    MIN(Event_Time) AS Event_Time,
    User_ID,
    Advertiser_ID,
    Campaign_ID,
    COUNT(*) AS Conv_Count
  FROM `db.dcm_account111111.activity_111111_*`
  WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170110'
    AND Advertiser_ID = '888888'
    AND Campaign_ID = '888888'
    AND Event_Sub_Type = 'POSTCLICK'
  GROUP BY User_ID, Advertiser_ID, Campaign_ID
) AS t2
LEFT JOIN (
  -- Impressions, one row per distinct event time
  SELECT
    Event_Time,
    User_ID,
    Advertiser_ID,
    Campaign_ID,
    COUNT(*) AS Imps_Count
  FROM `db.dcm_account111111.impression_111111_*`
  WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170110'
    AND Advertiser_ID = '888888'
    AND Campaign_ID = '888888'
  GROUP BY Event_Time, User_ID, Advertiser_ID, Campaign_ID
) AS t1
  ON t1.User_ID = t2.User_ID
  AND t1.Advertiser_ID = t2.Advertiser_ID
  AND t1.Campaign_ID = t2.Campaign_ID
  AND t1.Event_Time < t2.Event_Time
GROUP BY t2.User_ID
ORDER BY ImpressionsNeeded DESC
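
The query above takes MIN(Event_Time) per user before joining. One likely reason is that a user who converts more than once would otherwise have their impressions counted once per conversion. Here is a minimal sketch of that effect, using Python's built-in sqlite3 module as a local stand-in for BigQuery (an assumption). The table names, the column name `ts`, and the second conversion at 125 are hypothetical, made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE impressions (ts INTEGER, userid INTEGER);
    CREATE TABLE conversions (ts INTEGER, userid INTEGER);
    INSERT INTO impressions VALUES (100, 111), (105, 111), (110, 111), (120, 111);
    -- Hypothetical data: the same user converts twice.
    INSERT INTO conversions VALUES (115, 111), (125, 111);
""")

# Joining every conversion row counts earlier impressions once per conversion.
naive = conn.execute("""
    SELECT c.userid, COUNT(i.userid)
    FROM conversions AS c
    LEFT JOIN impressions AS i
           ON i.userid = c.userid AND i.ts < c.ts
    GROUP BY c.userid
""").fetchall()

# Reducing to the first conversion per user before the join avoids that.
first_only = conn.execute("""
    SELECT c.userid, COUNT(i.userid)
    FROM (SELECT userid, MIN(ts) AS ts FROM conversions GROUP BY userid) AS c
    LEFT JOIN impressions AS i
           ON i.userid = c.userid AND i.ts < c.ts
    GROUP BY c.userid
""").fetchall()

print(naive)       # [(111, 7)]
print(first_only)  # [(111, 3)]
```

In the naive version the conversion at 115 matches three impressions and the conversion at 125 matches four, giving 7. Restricting to the first conversion gives 3.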

Gor*_*off 6

This sounds like a left join with aggregation:

select t2.userid, count(t1.userid)
from table2 t2 left join
     table1 t1
     on t1.userid = t2.userid and
        t1.event_type = 'impression' and
        t1.timestamp < t2.timestamp
group by t2.userid;
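
To check this left join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module as a local stand-in for BigQuery (an assumption; the join itself is plain SQL). The table names `impressions` and `conversions` and the shortened column names `ts` and `cnt` are chosen for illustration only. It computes both the number of matching rows, as count(t1.userid) does, and the sum of the Count column, on the assumption that one row can stand for several impressions.

```python
import sqlite3

# In-memory SQLite database as a local stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE impressions (ts INTEGER, userid INTEGER, event_type TEXT, cnt INTEGER);
    CREATE TABLE conversions (ts INTEGER, userid INTEGER, event_type TEXT, cnt INTEGER);
    -- Sample rows copied from the question.
    INSERT INTO impressions VALUES
        (100, 111, 'impression', 2), (105, 111, 'impression', 1),
        (110, 111, 'impression', 1), (120, 111, 'impression', 2),
        (100, 222, 'impression', 1), (105, 222, 'impression', 1),
        (110, 222, 'impression', 1), (120, 222, 'impression', 1);
    INSERT INTO conversions VALUES
        (115, 111, 'conversion', 1), (117, 222, 'conversion', 1);
""")

rows = conn.execute("""
    SELECT c.userid,
           COUNT(i.userid)         AS impression_rows,     -- rows before the conversion
           COALESCE(SUM(i.cnt), 0) AS impressions_needed   -- impressions before the conversion
    FROM conversions AS c
    LEFT JOIN impressions AS i
           ON i.userid = c.userid
          AND i.event_type = 'impression'
          AND i.ts < c.ts
    GROUP BY c.userid
    ORDER BY c.userid
""").fetchall()

print(rows)  # [(111, 3, 4), (222, 3, 3)]
```

On this data the row count is 3 for both users, while summing the Count column gives 4 and 3, which matches the expected output in the question. If every row in the real table is a single impression, the two are the same and count(t1.userid) is enough.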