Total sessions in BigQuery vs. Google Analytics reports

Wil*_*uks 9 google-analytics google-bigquery

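I'm just learning BigQuery, so this may be a dumb question, but we want to pull some statistics from it, one of which is the total number of sessions on a given day.

To get this, I ran the following query in BQ: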

select sum(sessions) as total_sessions from (
  select
    fullvisitorid,
    count(distinct visitid) as sessions
  from (table_query([40663402], 'timestamp(right(table_id,8)) between timestamp("20150519") and timestamp("20150519")'))
  group each by fullvisitorid
)

(I'm using table_query because we may extend the date range later.)

This returned 1,075,137.

But in our Google Analytics reports, in the "Audience Overview" section, the same day shows:

This report is based on 1,026,641 sessions (100% of sessions).

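Regardless of the day, there is always a difference of roughly 5%. So I'm wondering whether, even though the query is quite simple, we have made a mistake somewhere?

Is this difference expected? I read through BigQuery's documentation but couldn't find anything on this issue.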

Thanks in advance,
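For reference, the gap between the two figures quoted in the question can be worked out directly. This is only a quick arithmetic check; the variable names are illustrative and not part of any API.

```python
# Figures quoted in the question for 2015-05-19.
bq_sessions = 1_075_137   # result of the BigQuery query
ga_sessions = 1_026_641   # figure shown in the GA Audience Overview report

extra = bq_sessions - ga_sessions
pct_over_ga = 100 * extra / ga_sessions

print(f"{extra:,} extra sessions, {pct_over_ga:.2f}% above the GA figure")
```

So for this particular day the BigQuery count is a little under 5% higher than the GA report.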

Mar*_*ann 12

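Simply use SUM(totals.visits), or, when using COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))), make sure that totals.visits=1!

If you use visitId and you are not grouping per day, you will combine midnight-split sessions!

Here are all the scenarios: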

SELECT
  COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING) )) allSessionsUniquePerDay,
  COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING) )) allSessionsUniquePerSelectedTimeframe,
  sum(totals.visits) interactiveSessionsUniquePerDay, -- equals GA UI sessions
  COUNT(DISTINCT IF(totals.visits=1, CONCAT(fullVisitorId, CAST(visitId AS STRING)), NULL) ) interactiveSessionsUniquePerSelectedTimeframe,
  SUM(IF(totals.visits=1,0,1)) nonInteractiveSessions
FROM
  `project.dataset.ga_sessions_2017102*`

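Wrapping up:

  • fullVisitorId + visitId: useful for reconnecting midnight splits
  • fullVisitorId + visitStartTime: useful for taking the splits into account
  • totals.visits=1: for interactive sessions
  • fullVisitorId + visitStartTime where totals.visits=1: GA UI sessions (in case you need a session ID)
  • SUM(totals.visits): simple GA UI sessions
  • fullVisitorId + visitId where totals.visits=1 and GROUP BY date: GA UI sessions, with too many chances for errors and misunderstandings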
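To make the differences between these counting schemes concrete, here is a minimal Python sketch that mimics the five columns of the query above on four made-up rows. Everything in it is an assumption for illustration: the rows, their values, the flat `visits` key standing in for totals.visits, and the premise that both halves of a midnight-split session carry totals.visits=1. It is a toy model, not a replacement for running the query.

```python
# Toy rows standing in for ga_sessions_* export rows (all values are made up).
# "visits" plays the role of totals.visits: 1 for sessions with interaction
# events, None otherwise.
rows = [
    # Visitor A: one ordinary interactive session.
    {"fullVisitorId": "A", "visitId": 100, "visitStartTime": 100, "date": "20171020", "visits": 1},
    # Visitor B: one session split at midnight. Two rows, same visitId,
    # different visitStartTime, one row in each daily table.
    {"fullVisitorId": "B", "visitId": 200, "visitStartTime": 200, "date": "20171020", "visits": 1},
    {"fullVisitorId": "B", "visitId": 200, "visitStartTime": 260, "date": "20171021", "visits": 1},
    # Visitor C: a session without interaction events.
    {"fullVisitorId": "C", "visitId": 300, "visitStartTime": 300, "date": "20171021", "visits": None},
]


def session_metrics(rows):
    """Compute the same five counts as the SQL query, on in-memory rows."""
    interactive = [r for r in rows if r["visits"] == 1]
    return {
        "allSessionsUniquePerDay": len(
            {(r["fullVisitorId"], r["visitStartTime"]) for r in rows}),
        "allSessionsUniquePerSelectedTimeframe": len(
            {(r["fullVisitorId"], r["visitId"]) for r in rows}),
        "interactiveSessionsUniquePerDay": sum(r["visits"] or 0 for r in rows),
        "interactiveSessionsUniquePerSelectedTimeframe": len(
            {(r["fullVisitorId"], r["visitId"]) for r in interactive}),
        "nonInteractiveSessions": sum(1 for r in rows if r["visits"] != 1),
    }


metrics = session_metrics(rows)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

In this toy data the midnight-split session of visitor B counts twice under the visitStartTime key and under the sum of visits, but only once under the visitId key, while visitor C's session only appears in the counts that ignore totals.visits.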


Wil*_*uks 9

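After posting the question we got in touch with Google support and found out that Google Analytics only counts sessions in which an "event" was fired.

In BigQuery you will find all sessions, regardless of whether they had an interaction or not.

To get the same result as in GA, you should filter for sessions with totals.visits = 1 in your BQ query (totals.visits is 1 only for sessions in which an event was fired).

That is: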

select sum(sessions) as total_sessions from (
  select
    fullvisitorid,
    count(distinct visitid) as sessions
  from (table_query([40663402], 'timestamp(right(table_id,8)) between timestamp("20150519") and timestamp("20150519")'))
  where totals.visits = 1
  group each by fullvisitorid
)