How to generate a date series to fill in missing dates in Google BigQuery?

Man*_*mal 4 sql google-bigquery

I want to get daily sales totals from a Google BigQuery table. I used the following code.

select Day(InvoiceDate) date, Sum(InvoiceAmount) sales from test_gmail_com.sales 
where year(InvoiceDate) = Year(current_date()) and
Month(InvoiceDate) = Month(current_date())
group by date order by date

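The query above only returns totals for days that actually have sales in the table. Some days may have no sales at all. For those days I still need a row with the date and a sum of 0. For example, every month should produce 30 or 31 rows, each with its sales total. An example is shown below: there were no sales on the 4th of the month, so its sum should be 0.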

date | sales
-----+------
1    |   259
-----+------
2    |   359
-----+------
3    |   45
-----+------
4    |    0
-----+------
5    |  156

Is this possible in BigQuery? Basically, the date column should be a series from 1 to 28, 29, 30, or 31, depending on the month of the year.

小智 41

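Generating a list of dates and then joining it to whatever table you need seems to be the simplest approach. I used generate_date_array + unnest, and it looks quite clean.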

Generate a list of days (one day per row):

  SELECT
  *
  FROM 
    UNNEST(GENERATE_DATE_ARRAY('2018-10-01', '2020-09-30', INTERVAL 1 DAY)) AS example
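To connect this back to the question, here is a minimal sketch of the join for the current month in standard SQL. It assumes that InvoiceDate is a DATE column (wrap it in DATE(...) if it is a TIMESTAMP) and reuses the table and column names from the question. The aliases calendar, calendar_day and day_of_month are made up for illustration.

```sql
-- Sketch only: assumes standard SQL and that InvoiceDate is a DATE column.
WITH calendar AS (
  -- One row per day of the current month.
  SELECT calendar_day
  FROM UNNEST(GENERATE_DATE_ARRAY(
    DATE_TRUNC(CURRENT_DATE(), MONTH),   -- first day of this month
    LAST_DAY(CURRENT_DATE(), MONTH),     -- last day of this month
    INTERVAL 1 DAY)) AS calendar_day
)
SELECT
  EXTRACT(DAY FROM c.calendar_day) AS day_of_month,
  -- SUM over no matching rows is NULL, so turn it into 0.
  IFNULL(SUM(s.InvoiceAmount), 0) AS sales
FROM calendar AS c
LEFT JOIN test_gmail_com.sales AS s
  ON s.InvoiceDate = c.calendar_day
GROUP BY day_of_month
ORDER BY day_of_month
```

Because the calendar is on the left side of the LEFT JOIN, days without any invoices still appear in the result, with sales reported as 0.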


Mik*_*ant 11

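You can use the code below to generate all dates in a given range (in the example below, all dates from 2015-06-01 to CURRENT_DATE(); change those two values to control the generated range).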

SELECT DATE(DATE_ADD(TIMESTAMP("2015-06-01"), pos - 1, "DAY")) AS calendar_day
FROM (
     SELECT ROW_NUMBER() OVER() AS pos, *
     FROM (FLATTEN((
     SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP(CURRENT_DATE()), TIMESTAMP("2015-06-01")), '.'),'') AS h
     FROM (SELECT NULL)),h
)))

Now you can LEFT JOIN this against your table so that every date is accounted for. See the potential example below.

SELECT
  calendar_day,
  IFNULL(sales, 0) AS sales
FROM (
  SELECT DATE(DATE_ADD(TIMESTAMP("2015-06-01"), pos - 1, "DAY")) AS calendar_day
  FROM (
       SELECT ROW_NUMBER() OVER() AS pos, *
       FROM (FLATTEN((
       SELECT SPLIT(RPAD('', 1 + DATEDIFF(TIMESTAMP(CURRENT_DATE()), TIMESTAMP("2015-06-01")), '.'),'') AS h
       FROM (SELECT NULL)),h
  )))
) AS all_dates
LEFT JOIN (
  SELECT DATE(InvoiceDate) AS invoice_day, SUM(InvoiceAmount) AS sales
  FROM test_gmail_com.sales
  WHERE YEAR(InvoiceDate) = YEAR(CURRENT_DATE()) AND
  MONTH(InvoiceDate) = MONTH(CURRENT_DATE())
  GROUP BY invoice_day
) AS daily_sales
ON daily_sales.invoice_day = all_dates.calendar_day

I want the sales for previous months

The query below returns all days of the previous month

SELECT DATE(DATE_ADD(DATE_ADD(DATE_ADD(CURRENT_DATE(), -1, "MONTH"), 1 - DAY(CURRENT_DATE()), "DAY"), pos - 1, "DAY")) AS calendar_day
FROM (
     SELECT ROW_NUMBER() OVER() AS pos, *
     FROM (FLATTEN((
     SELECT SPLIT(RPAD('', 1 + DATEDIFF(DATE_ADD(CURRENT_DATE(), - DAY(CURRENT_DATE()), "DAY"), DATE_ADD(DATE_ADD(CURRENT_DATE(), -1, "MONTH"), 1 - DAY(CURRENT_DATE()), "DAY")), '.'),'') AS h
     FROM (SELECT NULL)),h
)))

  • The solution above is for BigQuery legacy SQL. According to /sf/ask/2681421151/, in the latest version the following query works: SELECT day FROM UNNEST( GENERATE_DATE_ARRAY(DATE('2015-06-01'), CURRENT_DATE(), INTERVAL 1 DAY) ) AS day (2 upvotes)
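
For the previous-month case, a rough standard SQL equivalent of the legacy query might look like the sketch below. It is not taken from the answer itself, and the alias calendar_day is only illustrative.

```sql
-- Sketch only: all days of the previous month, in standard SQL.
SELECT calendar_day
FROM UNNEST(GENERATE_DATE_ARRAY(
  -- First day of the previous month.
  DATE_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL 1 MONTH),
  -- Last day of the previous month (the day before the 1st of this month).
  DATE_SUB(DATE_TRUNC(CURRENT_DATE(), MONTH), INTERVAL 1 DAY),
  INTERVAL 1 DAY)) AS calendar_day
ORDER BY calendar_day
```

Truncating to the first of the current month before subtracting avoids any dependence on how many days the current or previous month has.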