How to calculate the cumulative difference between rows in Postgres?

Wil*_*m R 7 postgresql postgresql-9.3

I need to calculate, per session_id, the difference from the previous day's cumulative value in Postgres. Sample data:

CREATE TEMP TABLE foo AS
SELECT date::date, session_id, upload_usage, download_usage, total_usage_on_a_day
FROM ( VALUES
  ( '10/21/2014', '0007994b', 37578561   , 6800209   , 44378770 ),
  ( '10/22/2014', '0007994b', 218113296  , 85272007  , 303385303 ),
  ( '10/23/2014', '0007994b', 552228616  , 252390680 , 804619296 ) ,
  ( '10/24/2014', '0007994b', 799772020  , 391196041 , 1190968061 ),
  ( '10/25/2014', '0007994b', 1047233978 , 529908804 , 1577142782 ),
  ( '10/26/2014', '0007994b', 1294608258 , 668515778 , 1963124036 ),
  ( '10/27/2014', '0007994b', 1066656794 , 557318645 , 2573613674 ),
  ( '10/27/2014', '00079e4e', 12949219   , 7265243   , 20214462 ),
  ( '10/28/2014', '00079e4e', 203871297  , 114308478 , 318179775 ),
  ( '10/29/2014', '00079e4e', 445466682  , 251486943 , 696953625 ),
  ( '10/30/2014', '00079e4e', 183499477  , 109643736 , 893143213 )
) AS t( date, session_id, upload_usage, download_usage, total_usage_on_a_day );

Expected result:

Date        session_id  upload_usage    download_usage  total_usage_on_a_day    Expected_difference
10/21/2014  0007994b    37578561        6800209         44378770                44378770
10/22/2014  0007994b    218113296       85272007        303385303               259006533
10/23/2014  0007994b    552228616       252390680       804619296               501233993
10/24/2014  0007994b    799772020       391196041       1190968061              386348765
10/25/2014  0007994b    1047233978      529908804       1577142782              386174721
10/26/2014  0007994b    1294608258      668515778       1963124036              385981254
10/27/2014  0007994b    1066656794      557318645       2573613674              610489638
10/27/2014  00079e4e    12949219        7265243         20214462                20214462
10/28/2014  00079e4e    203871297       114308478       318179775               297965313
10/29/2014  00079e4e    445466682       251486943       696953625               378773850
10/30/2014  00079e4e    183499477       109643736       893143213               196189588

Basically, I have to calculate the usage for each day from total_usage_on_a_day for a particular session.

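I'm not good with window functions. I tried to find the difference between dates, but how do I find the difference between the data values?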

select date,sesion_id, SUM(upload_usage)AS UPLOAD,
SUM(download_usage)AS DOWNLOAD,
max(total_usage_on_a_day)AS TOTAL_AS_CUMM,
lag(daTE) over (PARTITION BY sesion_id ORDER BY daTE ASC) as previous_id,
lead(daTE) over (PARTITION BY sesion_id ORDER BY daTE ASC) as present_id
from jiodba.s_crc_zda_mon_conn_usage where GPART= '1100043958' AND zzaccess_ntwk_id = 'FTTH'
GROUP BY sesion_id,date
ORDER BY 1 limit 100;

ype*_*eᵀᴹ 6

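You can use the LAG() function in a query with GROUP BY, just as in any other query. The only difference is that the columns allowed in the window (the OVER clause) and inside LAG() are those allowed in the SELECT list after a GROUP BY: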

select 
    date,
    session_id, 
    sum(upload_usage) as upload,
    sum(download_usage) as download,
    sum(total_usage_on_a_day) as total_as_cumm,
    sum(total_usage_on_a_day) 
    - coalesce(lag(sum(total_usage_on_a_day)) over (partition by session_id order by date), 0)
        as expected_difference
 from jiodba.s_crc_zda_mon_conn_usage 
 where gpart = '1100043958' 
   and zzaccess_ntwk_id = 'FTTH' 
 group by session_id, date
 order by session_id, date 
 limit 100 ;
  • I'm not sure why you used max(usage) when the column is named total_usage. I changed it to sum(). If you do need max(usage) there, choose a more appropriate name for that column.
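
To check the idea against the sample data, here is a minimal sketch that runs on the temp table foo from the question rather than on the real table. It assumes foo holds exactly one row per (session_id, date), so the GROUP BY and the aggregates can be dropped; that assumption may not hold for jiodba.s_crc_zda_mon_conn_usage, where the grouped query above is the safer form.

```sql
-- Sketch against the temp table foo created in the question.
-- Assumption: one row per (session_id, date), so no GROUP BY is needed.
SELECT
    date,
    session_id,
    total_usage_on_a_day,
    total_usage_on_a_day
      - COALESCE(
            -- previous day's cumulative total within the same session;
            -- NULL on the first row of a session, hence the fallback to 0
            LAG(total_usage_on_a_day)
                OVER (PARTITION BY session_id ORDER BY date),
            0
        ) AS expected_difference
FROM foo
ORDER BY session_id, date;
```

On the sample rows this should reproduce the Expected_difference column from the question, for example 303385303 - 44378770 = 259006533 for session 0007994b on 10/22/2014.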