Sum the intervals between dates in the same column

C S*_*ith 10 postgresql datetime sum

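How would you best sum the differences between a series of dates in the same column, across staggered rows? I have a datetime column and want to calculate the difference between rows, in seconds. This question is not about how to get the difference between 2 timestamps; it is about how to do the calculation most efficiently between rows of the same table. In my case, each row has a datetime and an event type, and the event type logically links two rows together.

More detail on how the START and END event types are grouped (Andriy M's question): START and END "should" be consecutive. If a START has no END following it, it should be excluded from the sum. Move on to the next START and see whether it has an END. Only consecutive START - END pairs should be added to the total number of seconds.

Working in PostgreSQL 9.x...

Sample data in the table: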

eventtype, eventdate
START, 2015-01-01 14:00
END, 2015-01-01 14:25
START, 2015-01-01 14:30
END, 2015-01-01 14:43
START, 2015-01-01 14:45
END, 2015-01-01 14:49
START, 2015-01-01 14:52
END, 2015-01-01 14:55

Note that all START and END dates in this sample are consecutive.

Here is my first attempt. It seems to be working.
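For anyone who wants to reproduce this, the sample data could be loaded roughly as follows. This is only a sketch: the question gives column names but no schema, so the table name `atable` and the column types (`text`, `timestamp`) are assumptions.

```sql
-- Assumed schema: only the column names come from the question;
-- the table name and the column types are guesses for illustration.
CREATE TABLE atable (
    eventtype text      NOT NULL,
    eventdate timestamp NOT NULL
);

-- The eight sample rows from the question.
INSERT INTO atable (eventtype, eventdate) VALUES
    ('START', '2015-01-01 14:00'),
    ('END',   '2015-01-01 14:25'),
    ('START', '2015-01-01 14:30'),
    ('END',   '2015-01-01 14:43'),
    ('START', '2015-01-01 14:45'),
    ('END',   '2015-01-01 14:49'),
    ('START', '2015-01-01 14:52'),
    ('END',   '2015-01-01 14:55');
```

The four pairs last 25, 13, 4 and 3 minutes, so a correct query over this data should return 45 minutes, i.e. 2700 seconds.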

SELECT 
-- starts.*
SUM(EXTRACT(EPOCH FROM (eventdate_next - eventdate))) AS duration_seconds
FROM
( 
    WITH x AS (
        SELECT *, dense_rank() OVER (ORDER BY eventdate) AS rnk
        FROM   atable  -- placeholder name; the original "table" is a reserved word
        WHERE  eventdate > '2015-01-01 00:00:00.00'
        AND    eventdate < '2016-01-01 23:59:59.59'
        )
    SELECT x.eventdate, x.eventtype, y.eventdate AS eventdate_next,  y.eventtype AS eventtype_next
    FROM   x
    LEFT   JOIN (SELECT DISTINCT eventdate, eventtype, rnk FROM x) y ON y.rnk = (x.rnk + 1)
    ORDER  BY x.eventdate
) starts
WHERE
eventtype = 'START'   
GROUP BY eventtype 

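My first attempt was based on a good example on Stack Overflow: "Postgres 9.1 - Getting the next value".

Note: you can comment out the GROUP BY and the SUM and uncomment starts.* to get a record of each individual duration that goes into the sum.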

And*_*y M 10

You can use the LEAD analytic function to get the next row's eventtype and eventdate alongside the current row's data:

SELECT
  eventtype,
  eventdate,
  LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
  LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
FROM
  atable
WHERE
      eventdate >= '2015-01-01 00:00:00.00'
  AND eventdate <  '2016-01-01 23:59:59.59'

Using the above query as a derived table, you can further filter the output on eventtype = 'START' AND nexttype = 'END' and get the total of the differences:

SELECT
  SUM(EXTRACT(EPOCH FROM (nextdate - eventdate))) AS duration_seconds
FROM
  (
    SELECT
      eventtype,
      eventdate,
      LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
      LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
    FROM
      atable
    WHERE
          eventdate >= '2015-01-01 00:00:00.00'
      AND eventdate <  '2016-01-01 23:59:59.59'
  ) AS s
WHERE
      eventtype = 'START'
  AND nexttype  = 'END'
;
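To illustrate how the eventtype = 'START' AND nexttype = 'END' filter treats a START that has no END, here is a small self-contained sketch. It runs the same query against three made-up rows supplied through a VALUES list (these rows are not from the question), so no table is needed.

```sql
SELECT
  SUM(EXTRACT(EPOCH FROM (nextdate - eventdate))) AS duration_seconds
FROM
  (
    SELECT
      eventtype,
      eventdate,
      LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
      LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
    FROM
      (VALUES
        -- Made-up rows: the 14:00 START is followed by another START,
        -- so it has no END of its own.
        ('START', timestamp '2015-01-01 14:00'),
        ('START', timestamp '2015-01-01 14:10'),
        ('END',   timestamp '2015-01-01 14:25')
      ) AS v (eventtype, eventdate)
  ) AS s
WHERE
      eventtype = 'START'
  AND nexttype  = 'END'
;
```

The 14:00 row is dropped because its next row is another START, and the END row is dropped because it is not a START. Only the 14:10 to 14:25 pair remains, so the result is 900 seconds.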

As a slight variation, you can implement the subquery as a CTE:

WITH cte AS
  (
    SELECT
      eventtype,
      eventdate,
      LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
      LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
    FROM
      atable
    WHERE
          eventdate >= '2015-01-01 00:00:00.00'
      AND eventdate <  '2016-01-01 23:59:59.59'
  )
SELECT
  SUM(EXTRACT(EPOCH FROM (nextdate - eventdate))) AS duration_seconds
FROM
  cte
WHERE
      eventtype = 'START'
  AND nexttype  = 'END'
;

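Rewriting the subquery as a CTE may have performance implications because, unlike derived tables, CTEs are materialised in PostgreSQL (this holds for the 9.x releases in question). Testing should reveal whether there is a difference and, if there is, which option works better for you.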
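One way to run that test is to put EXPLAIN ANALYZE in front of each form and compare the plans and timings. The following is only a sketch against the placeholder table `atable`; the numbers will depend on your data volume and indexes, so run it on realistic data.

```sql
-- Derived-table form
EXPLAIN ANALYZE
SELECT
  SUM(EXTRACT(EPOCH FROM (nextdate - eventdate))) AS duration_seconds
FROM
  (
    SELECT
      eventtype,
      eventdate,
      LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
      LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
    FROM
      atable
  ) AS s
WHERE
      eventtype = 'START'
  AND nexttype  = 'END'
;

-- CTE form
EXPLAIN ANALYZE
WITH cte AS
  (
    SELECT
      eventtype,
      eventdate,
      LEAD(eventtype) OVER (ORDER BY eventdate) AS nexttype,
      LEAD(eventdate) OVER (ORDER BY eventdate) AS nextdate
    FROM
      atable
  )
SELECT
  SUM(EXTRACT(EPOCH FROM (nextdate - eventdate))) AS duration_seconds
FROM
  cte
WHERE
      eventtype = 'START'
  AND nexttype  = 'END'
;
```

EXPLAIN ANALYZE actually executes the statement, so each form should be run a few times to get timings that are not skewed by caching.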