Generating a range of dates between min and max dates: Athena Presto SQL sequence error

Dat*_*ice 5 sql presto amazon-athena trino

I'm trying to generate a series of dates in Presto SQL (Athena) using unnest and sequence, something similar to generate_series in Postgres.

My table looks like this:

job_name | run_date     
A        | '2021-08-21'
A        | '2021-08-25' 
B        | '2021-08-07' 
B        | '2021-08-24' 


SELECT d.job_name, d.run_date
FROM (
     VALUES
        ('A', '2021-08-21'), ('A', '2021-08-25'),
        ('B', '2021-08-07'), ('B', '2021-08-24')
         ) d(job_name, run_date)

My goal is output like the following:

job_name |   run_date
       A | 2021-08-21
       A | 2021-08-22
       A | 2021-08-23
       A | 2021-08-24
       A | 2021-08-25
       B | 2021-08-07
       B | 2021-08-08
       B | 2021-08-09
       B | 2021-08-10
       B | 2021-08-11
       B | 2021-08-12
       B | 2021-08-13
       B | 2021-08-14
       B | 2021-08-15
       B | 2021-08-16
       B | 2021-08-17
       B | 2021-08-18
       B | 2021-08-19
       B | 2021-08-20
       B | 2021-08-21
       B | 2021-08-22
       B | 2021-08-23
       B | 2021-08-24

I tried to achieve this with the following query, but I get an error when trying to unnest the date sequence:

SELECT t.job_name, d.dte
FROM (SELECT job_name
        ,    min(run_date) as mind
        ,    max(run_date) as maxd
        ,    SEQUENCE(min(run_date), max(run_date)) as date_arr
     FROM job_log_table t
     GROUP BY job_name
  )  jd
CROSS JOIN
    UNNEST(jd.date_arr) d(dte)
LEFT JOIN job_log_table t 
    ON t.job_name = jd.job_name
    AND t.latest_date = d.dte;

This produces the following error:

[HY000][100071] [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. [ErrorCategory:USER_ERROR, ErrorCode:SYNTAX_ERROR], Detail:SYNTAX_ERROR: line 5:14: Unexpected parameters (date, date) for function sequence. Expected: sequence(bigint, bigint, bigint) , sequence(bigint, bigint) , sequence(timestamp, timestamp, interval day to second) , sequence(timestamp, timestamp, interval year to month)

Is this a limitation of Athena's flavor of Presto SQL, or have I made a schoolboy error somewhere?

Gur*_*ron 8

You need to provide an interval to generate a date sequence (in this case interval '1' day):

WITH dataset AS (
  SELECT * 
  FROM 
    ( VALUES      
        ('A', DATE '2021-08-21'), ('A', DATE '2021-08-25'),
        ('B', DATE '2021-08-07'), ('B', DATE '2021-08-24')
    ) AS d (job_name, run_date)
) 

select job_name, sequence(min(run_date), max(run_date), interval '1' day) seq
from dataset
group by job_name

Output:

job_name | seq
A        | [2021-08-21 00:00:00.000, 2021-08-22 00:00:00.000, 2021-08-23 00:00:00.000, 2021-08-24 00:00:00.000, 2021-08-25 00:00:00.000]
B        | [2021-08-07 00:00:00.000, 2021-08-08 00:00:00.000, 2021-08-09 00:00:00.000, 2021-08-10 00:00:00.000, 2021-08-11 00:00:00.000, 2021-08-12 00:00:00.000, 2021-08-13 00:00:00.000, 2021-08-14 00:00:00.000, 2021-08-15 00:00:00.000, 2021-08-16 00:00:00.000, 2021-08-17 00:00:00.000, 2021-08-18 00:00:00.000, 2021-08-19 00:00:00.000, 2021-08-20 00:00:00.000, 2021-08-21 00:00:00.000, 2021-08-22 00:00:00.000, 2021-08-23 00:00:00.000, 2021-08-24 00:00:00.000]
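
To get from the array to one row per job and date, as in the desired output from the question, the sequence can then be unnested. The following is a minimal sketch rather than a tested Athena query: it assumes the same inline dataset as in the query above, and the CTE name ranges and the aliases date_arr, t and dte are only illustrative. The cast back to date is there because the sequence elements in the output above are timestamps.

```sql
WITH dataset AS (
  SELECT *
  FROM
    ( VALUES
        ('A', DATE '2021-08-21'), ('A', DATE '2021-08-25'),
        ('B', DATE '2021-08-07'), ('B', DATE '2021-08-24')
    ) AS d (job_name, run_date)
),
ranges AS (
  -- one array of consecutive days per job, from its earliest to its latest run_date
  SELECT job_name,
         sequence(min(run_date), max(run_date), interval '1' day) AS date_arr
  FROM dataset
  GROUP BY job_name
)
-- expand each array into one row per element
SELECT r.job_name,
       CAST(t.dte AS date) AS run_date  -- elements display as timestamps above, so cast back to date
FROM ranges r
CROSS JOIN UNNEST(r.date_arr) AS t (dte)
ORDER BY 1, 2
```

Against a real table, the dataset CTE would be replaced by the table itself (job_log_table in the question), assuming its run_date column is of type date.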