Edg*_*aya 1 sql google-bigquery
I don't actually know how to write a query like this. I have 2 tables in Google BigQuery:
First table (impressions):
+-----------+--------+------------+-------+
| Timestamp | UserID | Event_Type | Count |
+-----------+--------+------------+-------+
| 100 | 111 | impression | 2 |
| 105 | 111 | impression | 1 |
| 110 | 111 | impression | 1 |
| 120 | 111 | impression | 2 |
| 100 | 222 | impression | 1 |
| 105 | 222 | impression | 1 |
| 110 | 222 | impression | 1 |
| 120 | 222 | impression | 1 |
+-----------+--------+------------+-------+
Second table (conversions):
+-----------+--------+------------+-------+
| Timestamp | UserID | Event_Type | Count |
+-----------+--------+------------+-------+
| 115 | 111 | conversion | 1 |
| 117 | 222 | conversion | 1 |
+-----------+--------+------------+-------+
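What I want to get is the number of impressions each user needed before converting, so I need to count all impressions that occurred before the conversion (by timestamp, which is actually in Unix format).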
+--------+--------------------+
| UserID | Impressions Needed |
+--------+--------------------+
| 111 | 4 |
| 222 | 3 |
+--------+--------------------+
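I can join these tables on UserID and get the total number of impressions and conversions, and I can union them and sort by UserID and Timestamp, but I don't know how to get to the final answer, so unfortunately I have nothing to show here. I hope there is a way to do this and that someone here can help me.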
The answer is (Standard SQL):
SELECT t2.User_ID, COUNT(t1.User_ID) AS ImpressionsNeeded
FROM
(
  -- Earliest POSTCLICK activity per user for this advertiser and campaign
  SELECT MIN(Event_Time) AS Event_Time, User_ID, Advertiser_ID, Campaign_ID, COUNT(*) AS Conv_Count
  FROM `db.dcm_account111111.activity_111111_*`
  WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170110'
    AND Advertiser_ID = '888888'
    AND Campaign_ID = '888888'
    AND Event_Sub_Type = 'POSTCLICK'
  GROUP BY User_ID, Advertiser_ID, Campaign_ID
) AS t2
LEFT JOIN
(
  -- One row per user and impression time
  SELECT Event_Time, User_ID, Advertiser_ID, Campaign_ID, COUNT(*) AS Imps_Count
  FROM `db.dcm_account111111.impression_111111_*`
  WHERE _TABLE_SUFFIX BETWEEN '20170101' AND '20170110'
    AND Advertiser_ID = '888888'
    AND Campaign_ID = '888888'
  GROUP BY Event_Time, User_ID, Advertiser_ID, Campaign_ID
) AS t1
ON t1.User_ID = t2.User_ID
  AND t1.Advertiser_ID = t2.Advertiser_ID
  AND t1.Campaign_ID = t2.Campaign_ID
  AND t1.Event_Time < t2.Event_Time  -- only impressions before the conversion
GROUP BY t2.User_ID
ORDER BY ImpressionsNeeded DESC
This sounds like a left join with aggregation:
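-- The left join keeps converting users who had no earlier impressions (count 0).
select t2.userid, count(t1.userid) as impressions_needed
from table2 t2 left join
     table1 t1
     on t1.userid = t2.userid and
        t1.event_type = 'impression' and
        t1.timestamp < t2.timestamp  -- only impressions before the conversion
group by t2.userid;
-- Note: count() counts matching rows; to total the Count column instead,
-- use coalesce(sum(t1.count), 0).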
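To try the left join aggregation on the question's sample rows, here is a minimal self-contained sketch in BigQuery Standard SQL. The CTE names (`impressions`, `conversions`) and the short column names (`ts`, `cnt`) are assumptions made for illustration, not names from the original tables. It returns both the number of matching impression rows and the total of the Count column, since the two differ whenever a row has a Count above 1.

```sql
-- Sample rows copied from the question; CTE and column names are illustrative.
WITH impressions AS (
  SELECT 100 AS ts, 111 AS userid, 2 AS cnt UNION ALL
  SELECT 105, 111, 1 UNION ALL
  SELECT 110, 111, 1 UNION ALL
  SELECT 120, 111, 2 UNION ALL
  SELECT 100, 222, 1 UNION ALL
  SELECT 105, 222, 1 UNION ALL
  SELECT 110, 222, 1 UNION ALL
  SELECT 120, 222, 1
),
conversions AS (
  SELECT 115 AS ts, 111 AS userid UNION ALL
  SELECT 117, 222
)
SELECT
  c.userid,
  COUNT(i.userid) AS impression_rows,            -- impression rows before the conversion
  COALESCE(SUM(i.cnt), 0) AS impressions_needed  -- total of the Count column
FROM conversions AS c
LEFT JOIN impressions AS i
  ON i.userid = c.userid
 AND i.ts < c.ts                                 -- strictly before the conversion
GROUP BY c.userid
ORDER BY c.userid;
```

On this sample data, user 111 gets 3 impression rows and an `impressions_needed` of 4, and user 222 gets 3 and 3. The summed column is the one that matches the expected result in the question.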