PostgreSQL Group By Sum

bob*_*sie 7 sql postgresql

I've been scratching my head over this problem in PostgreSQL. I have a table test with two columns: id and content. For example:

create table test (id integer, 
                   content varchar(1024));

insert into test (id, content) values 
    (1, 'Lorem Ipsum is simply dummy text of the printing and typesetting industry.'),
    (2, 'Lorem Ipsum has been the industrys standard dummy text '),
    (3, 'ever since the 1500s, when an unknown printer took a galley of type and scrambled it to'),
    (4, 'make a type specimen book.'),
    (5, 'It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.'),
    (6, 'It was popularised in the 1960s with the release of Letraset sheets containing Lorem '),
    (7, 'Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker'),
    (8, ' including versions of Lorem Ipsum.');

If I run the following query...

select id, length(content) as characters from test order by id

...then I get:

id | characters
---+-----------
 1 |         74
 2 |         55
 3 |         87
 4 |         26
 5 |        120
 6 |         85
 7 |         87
 8 |         35

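What I want to do is group the ids into rows such that each group's total content length goes over a threshold. For example, if that threshold is 100, the desired result would look like this: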

ids | characters
----+-----------   
1,2 |        129
3,4 |        113    
5   |        120
6,7 |        172    
8   |         35 

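Note (1): The query doesn't need to produce a characters column, only ids. The counts are shown here to convey that every group is over 100, except the last row, which is 35.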

Note (2): ids can be a comma-delimited string or a PostgreSQL array. The type matters less than the values.
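To make the intended rule concrete, here is a minimal sketch in Python (not SQL) of the accumulate-and-reset grouping that the desired output implies. The function name `group_ids` is hypothetical, and the strict `>` comparison is an assumption about what "goes over the threshold" means. The lengths are the ones from the query output above.

```python
def group_ids(lengths, threshold=100):
    """Walk rows in id order; close a group once its running total exceeds threshold."""
    groups = []
    ids, total = [], 0
    for row_id, length in lengths:
        ids.append(row_id)
        total += length
        if total > threshold:  # assumption: strictly greater than the threshold
            groups.append((ids, total))
            ids, total = [], 0  # the reset depends on the previous group's state
    if ids:  # leftover rows that never crossed the threshold
        groups.append((ids, total))
    return groups


# (id, length(content)) pairs taken from the question
lengths = [(1, 74), (2, 55), (3, 87), (4, 26), (5, 120), (6, 85), (7, 87), (8, 35)]

for ids, total in group_ids(lengths):
    print(",".join(map(str, ids)), total)
```

Because each reset depends on where the previous group ended, a single running-sum window expression over the whole table does not express this rule directly.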

Can I use a window function to do this, or do I need something more complex, like a lateral join?

Gor*_*off 5

This type of problem requires a recursive CTE (or similar functionality). Here is an example:

with recursive t as (
      select id, length(content) as len,
             row_number() over (order by id) as seqnum
      from test 
     ),
     -- column list must name all six columns, including cumelen
     cte(id, len, cumelen, ids, seqnum, grp) as (
      select id, len, len as cumelen, t.id::text, 1::int as seqnum, 1 as grp
      from t
      where seqnum = 1
      union all
      select t.id,
             t.len,
             (case when cte.cumelen >= 100 then t.len else cte.cumelen + t.len end) as cumelen,
             (case when cte.cumelen >= 100 then t.id::text else cte.ids || ',' || t.id::text end) as ids,
             -- row_number() returns bigint; cast to match the int in the anchor row
             t.seqnum::int,
             (case when cte.cumelen >= 100 then cte.grp + 1 else cte.grp end) as grp
      from t join
           cte
           on cte.seqnum = t.seqnum - 1
     )
select grp, max(ids)
from cte
group by grp;

Here is a small working example:

with recursive test as (
      select 1 as id, 'abcd'::text as content union all
      select 2 as id, 'abcd'::text as content union all
      select 3 as id, 'abcd'::text as content 
     ),
     t as (
      select id, length(content) as len,
             row_number() over (order by id) as seqnum
      from test 
     ),
     cte(id, len, cumelen, ids, seqnum, grp) as (
      select id, len, len as cumelen, t.id::text, 1::int as seqnum, 1 as grp
      from t
      where seqnum = 1
      union all
      select t.id,
             t.len,
             (case when cte.cumelen >= 5 then t.len else cte.cumelen + t.len end) as cumelen,
             (case when cte.cumelen >= 5 then t.id::text else cte.ids || ',' || t.id::text end) as ids,
             t.seqnum::int,
             (case when cte.cumelen >= 5 then cte.grp + 1 else cte.grp end)
      from t join
           cte
           on cte.seqnum = t.seqnum - 1
     )
select grp, max(ids)
from cte
group by grp;
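
As a rough illustration of what the recursive member does row by row, the following Python sketch mimics its state transition. The names `cte_trace` and `final_groups` are hypothetical and not part of the query. It assumes the same `>=` test as the SQL above, and it uses Python string comparison in place of `max(ids)`, which picks the same value here because the longest ids string in a group extends all the shorter ones.

```python
def cte_trace(lengths, threshold):
    """Derive (cumelen, ids, grp) for each row from the previous row's state."""
    rows = []
    prev = None
    for row_id, length in lengths:
        if prev is None:
            # anchor member: the row with seqnum = 1
            state = {"id": row_id, "cumelen": length, "ids": str(row_id), "grp": 1}
        elif prev["cumelen"] >= threshold:
            # previous group is full: start a new group with this row
            state = {"id": row_id, "cumelen": length, "ids": str(row_id),
                     "grp": prev["grp"] + 1}
        else:
            # previous group still open: extend it
            state = {"id": row_id, "cumelen": prev["cumelen"] + length,
                     "ids": prev["ids"] + "," + str(row_id), "grp": prev["grp"]}
        rows.append(state)
        prev = state
    return rows


def final_groups(rows):
    """Rough equivalent of: select grp, max(ids) from cte group by grp."""
    out = {}
    for r in rows:
        out[r["grp"]] = max(out.get(r["grp"], ""), r["ids"])
    return out


# three rows of 'abcd' (length 4) with threshold 5, as in the small example
small = [(1, 4), (2, 4), (3, 4)]
print(final_groups(cte_trace(small, 5)))
```

For the three-row example this groups ids 1 and 2 together and leaves 3 on its own, since the running length first reaches 5 at the second row.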


Abe*_*sto 2

Using a stored function can (sometimes) spare you a headache-inducing query.

create or replace function fn_foo(ids out int[], characters out int) returns setof record language plpgsql as $$
declare
  r record;
  threshold int := 100;
begin
  ids := '{}'; characters := 0;
  for r in (
    select id, coalesce(length(content),0) as lng
    from test order by id)
  loop
    characters := characters + r.lng;
    ids := ids || r.id;
    if characters > threshold then
      return next;
      ids := '{}'; characters := 0;
    end if;
  end loop;
  if ids <> '{}' then
    return next;
  end if;
end $$;

select * from fn_foo();

╔═══════╤════════════╗
║  ids  │ characters ║
╠═══════╪════════════╣
║ {1,2} │        129 ║
║ {3,4} │        113 ║
║ {5}   │        120 ║
║ {6,7} │        172 ║
║ {8}   │         35 ║
╚═══════╧════════════╝
(5 rows)