Pri*_*nth 82 sql oracle aggregate-functions listagg
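I am trying to use the LISTAGG function in Oracle. I want only the distinct values for that column. Is there a way to get the distinct values without creating a function or a procedure?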
col1  col2  Created_by
1     2     Smith
1     2     John
1     3     Ajay
1     4     Ram
1     5     Jack
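I need to select col1 and the LISTAGG of col2 (the third column is not considered). When I do that, LISTAGG gives me a result like this: [2,2,3,4,5]
I need to remove the duplicate '2' here; I only need the distinct values of col2 for each col1.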
a_h*_*ame 66
Do you mean something like this (DISTINCT inside LISTAGG requires Oracle 19c or later):
select listagg(distinct the_column, ',') within group (order by the_column)
from the_table
If you need more columns, or are on an older version, you may be looking for something like this:
select listagg(the_column, ',') within group (order by the_column)
from (
select distinct the_column
from the_table
) t
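Applied to the sample data from the question, the subquery approach might look like the sketch below. The CTE name t is only a stand-in for the real table, and the subquery has to keep col1 so the outer query can group on it.

```sql
-- t is a hypothetical stand-in for the table in the question
with t as (
  select 1 as col1, 2 as col2, 'Smith' as created_by from dual union all
  select 1, 2, 'John' from dual union all
  select 1, 3, 'Ajay' from dual union all
  select 1, 4, 'Ram'  from dual union all
  select 1, 5, 'Jack' from dual
)
select col1,
       listagg(col2, ',') within group (order by col2) as col2_list
from (
  select distinct col1, col2   -- drop duplicates before aggregating
  from t
)
group by col1;
-- col1 = 1, col2_list = 2,3,4,5
```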
ozm*_*ike 40
Here is how to solve your problem.
select
regexp_replace(
'2,2,2.1,3,3,3,3,4,4'
,'([^,]+)(,\1)*(,|$)', '\1\3')
from dual
This returns:
2,2.1,3,4
The answer (see the notes below):
select col1,
       regexp_replace(
         listagg(col2, ',') within group (order by col2),  -- the list must be sorted
         '([^,]+)(,\1)*(,|$)', '\1\3') as col2_list
from tableX
where rn = 1  -- assumes tableX has an rn column; drop this filter otherwise
group by col1;
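Note: the above works in most cases. The list must be sorted, and depending on your data you may need to trim all leading and trailing spaces.
If a group has many items (more than 20 or so) or the strings are long, you may hit Oracle's string size limit, 'result of string concatenation is too long' (ORA-01489), so put a maximum on the number of members in each group. This only works if it is acceptable to list only the first members. If you have very long, variable-length strings this may not work; you will have to experiment.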
select col1,
case
when count(col2) < 100 then
regexp_replace(
listagg(col2, ',') within group (order by col2)
,'([^,]+)(,\1)*(,|$)', '\1\3')
else
'Too many entries to list...'
end
from sometable
where rn = 1  -- assumes sometable has an rn column
group by col1;
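On Oracle 12.2 and later, LISTAGG also accepts an ON OVERFLOW TRUNCATE clause, which may be a simpler guard than counting the members yourself. A minimal sketch, reusing the sometable, col1 and col2 names from above: when the concatenated string would exceed the limit, the list is cut off and followed by the indicator and the number of omitted values instead of raising the error. Note that the truncation is decided before the regex removes duplicates.

```sql
select col1,
       regexp_replace(
         -- truncate instead of failing when the result would be too long
         listagg(col2, ',' on overflow truncate '...' with count)
           within group (order by col2),
         '([^,]+)(,\1)*(,|$)', '\1\3') as col2_list
from sometable
group by col1;
```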
Another solution (not so simple) that hopefully avoids Oracle's string size limit, which is 4000 characters. Thanks to this post here by user3465996.
select col1 ,
dbms_xmlgen.convert( -- HTML decode
dbms_lob.substr( -- limit size to 4000 chars
ltrim( -- remove leading commas
REGEXP_REPLACE(REPLACE(
REPLACE(
XMLAGG(
XMLELEMENT("A",col2 )
ORDER BY col2).getClobVal(),
'<A>',','),
'</A>',''),'([^,]+)(,\1)*(,|$)', '\1\3'),
','), -- remove leading XML commas ltrim
4000,1) -- limit to 4000 string size
, 1) -- HTML.decode
as col2
from sometable
where rn = 1  -- assumes sometable has an rn column
group by col1;
Some test cases, for reference only
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1')
-> 2.1,3,4 Fail
regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1')
-> 2 ,2.1,3,4 Success - fixed length items
Items contained within other items, e.g. 2,21
regexp_replace('2.1,1','([^,]+)(,\1)+', '\1')
-> 2.1 Fail
regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2')
-> 2 ,2.1,1 ,3 ,4 -- success - NEW regex
regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2')
-> a,b,b,c fail!
v3 - regex thanks to Igor! Works for all cases.
select
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3') ,
---> 2,2.1,3,4 works
regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'),
--> 2.1,1 works
regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3')
---> a,b,c works
from dual
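The earlier note about sorting and trimming can be checked directly. A small sketch, using made-up literal lists rather than data from the question: the v3 pattern only collapses duplicates that are adjacent and match exactly, so an unsorted list or stray spaces leave duplicates in place.

```sql
select
  -- unsorted input: the two 2s are not adjacent, so nothing is removed
  regexp_replace('2,3,2', '([^,]+)(,\1)*(,|$)', '\1\3') as unsorted_list,
  -- the leading space makes ' 2' a different item from '2'
  regexp_replace('2, 2', '([^,]+)(,\1)*(,|$)', '\1\3') as untrimmed_list
from dual;
-- unsorted_list: 2,3,2    untrimmed_list: 2, 2
```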
Kem*_*rcı 10
You can use the undocumented wm_concat function.
select col1, wm_concat(distinct col2) col2_list
from tab1
group by col1;
This function returns a CLOB column; if you want, you can use dbms_lob.substr to convert the CLOB to VARCHAR2.
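A sketch of that conversion, assuming the same tab1, col1 and col2 names as above. Because wm_concat is undocumented, its return type and availability vary between releases (it is reported to be missing from 12c onward), so treat this as illustrative only.

```sql
select col1,
       -- take the first 4000 characters of the CLOB as a VARCHAR2
       dbms_lob.substr(wm_concat(distinct col2), 4000, 1) as col2_list
from tab1
group by col1;
```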
小智 7
I overcame this issue by grouping on the values first and then doing another aggregation with listagg. Something like this:
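-- some_table is a placeholder name (TABLE itself is a reserved word)
-- note: the outer avg(d) is an average of the per-(a,b,c) averages
select a, b, listagg(c, ',') within group (order by c) as c, avg(d)
from (select a, b, c, avg(d) as d
      from some_table
      group by a, b, c)
group by a, b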
It needs only one full table access and is relatively easy to extend to more complex queries.
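As a sketch of how other aggregates can be carried through the two grouping levels, here is the same idea on the question's sample data (the CTE name t and the cnt alias are placeholders). A row count survives exactly, because the inner counts can simply be summed in the outer query.

```sql
-- t is a hypothetical stand-in for the table in the question
with t as (
  select 1 as col1, 2 as col2, 'Smith' as created_by from dual union all
  select 1, 2, 'John' from dual union all
  select 1, 3, 'Ajay' from dual union all
  select 1, 4, 'Ram'  from dual union all
  select 1, 5, 'Jack' from dual
)
select col1,
       listagg(col2, ',') within group (order by col2) as col2_list,
       sum(cnt) as row_count          -- total rows per col1
from (
  select col1, col2, count(*) as cnt  -- one row per distinct (col1, col2)
  from t
  group by col1, col2
)
group by col1;
-- col1 = 1, col2_list = 2,3,4,5, row_count = 5
```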
小智 6
If the intent is to apply this transformation to multiple columns, I have extended a_horse_with_no_name's solution:
SELECT * FROM
(SELECT LISTAGG(GRADE_LEVEL, ',') within group(order by GRADE_LEVEL) "Grade Levels" FROM (select distinct GRADE_LEVEL FROM Students) t) t1,
(SELECT LISTAGG(ENROLL_STATUS, ',') within group(order by ENROLL_STATUS) "Enrollment Status" FROM (select distinct ENROLL_STATUS FROM Students) t) t2,
(SELECT LISTAGG(GENDER, ',') within group(order by GENDER) "Legal Gender Code" FROM (select distinct GENDER FROM Students) t) t3,
(SELECT LISTAGG(CITY, ',') within group(order by CITY) "City" FROM (select distinct CITY FROM Students) t) t4,
(SELECT LISTAGG(ENTRYCODE, ',') within group(order by ENTRYCODE) "Entry Code" FROM (select distinct ENTRYCODE FROM Students) t) t5,
(SELECT LISTAGG(EXITCODE, ',') within group(order by EXITCODE) "Exit Code" FROM (select distinct EXITCODE FROM Students) t) t6,
(SELECT LISTAGG(LUNCHSTATUS, ',') within group(order by LUNCHSTATUS) "Lunch Status" FROM (select distinct LUNCHSTATUS FROM Students) t) t7,
(SELECT LISTAGG(ETHNICITY, ',') within group(order by ETHNICITY) "Race Code" FROM (select distinct ETHNICITY FROM Students) t) t8,
(SELECT LISTAGG(CLASSOF, ',') within group(order by CLASSOF) "Expected Graduation Year" FROM (select distinct CLASSOF FROM Students) t) t9,
(SELECT LISTAGG(TRACK, ',') within group(order by TRACK) "Track Code" FROM (select distinct TRACK FROM Students) t) t10,
(SELECT LISTAGG(GRADREQSETID, ',') within group(order by GRADREQSETID) "Graduation ID" FROM (select distinct GRADREQSETID FROM Students) t) t11,
(SELECT LISTAGG(ENROLLMENT_SCHOOLID, ',') within group(order by ENROLLMENT_SCHOOLID) "School Key" FROM (select distinct ENROLLMENT_SCHOOLID FROM Students) t) t12,
(SELECT LISTAGG(FEDETHNICITY, ',') within group(order by FEDETHNICITY) "Federal Race Code" FROM (select distinct FEDETHNICITY FROM Students) t) t13,
(SELECT LISTAGG(SUMMERSCHOOLID, ',') within group(order by SUMMERSCHOOLID) "Summer School Key" FROM (select distinct SUMMERSCHOOLID FROM Students) t) t14,
(SELECT LISTAGG(FEDRACEDECLINE, ',') within group(order by FEDRACEDECLINE) "Student Decl to Prov Race Code" FROM (select distinct FEDRACEDECLINE FROM Students) t) t15
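This was on Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production.
I was unable to use STRAGG because there is no way to apply both DISTINCT and ORDER.
Performance scales linearly, which is good, since I am adding all the columns of interest. The query above took 3 seconds on 77K rows. For just one rollup, it took .172 seconds. I do wish there were a way to make multiple columns in a table distinct in one pass.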
小智 5
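If you want distinct values across MULTIPLE columns, want control over the sort order, don't want to use an undocumented function that may disappear, and don't want more than one full table scan, you may find this construct useful: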
with test_data as
(
select 'A' as col1, 'T_a1' as col2, '123' as col3 from dual
union select 'A', 'T_a1', '456' from dual
union select 'A', 'T_a1', '789' from dual
union select 'A', 'T_a2', '123' from dual
union select 'A', 'T_a2', '456' from dual
union select 'A', 'T_a2', '111' from dual
union select 'A', 'T_a3', '999' from dual
union select 'B', 'T_a1', '123' from dual
union select 'B', 'T_b1', '740' from dual
union select 'B', 'T_b1', '846' from dual
)
select col1
, (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col2)) as col2s
, (select listagg(column_value, ',') within group (order by column_value desc) from table(collect_col3)) as col3s
from
(
select col1
, collect(distinct col2) as collect_col2
, collect(distinct col3) as collect_col3
from test_data
group by col1
);
The upcoming Oracle 19c will support DISTINCT with LISTAGG.
This feature ships with 19c:
select deptno, listagg(distinct sal, ', ') within group (order by sal)
from scott.emp
group by deptno;
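Edit:
The LISTAGG aggregate function now supports duplicate elimination by using the new DISTINCT keyword. The LISTAGG aggregate function orders the rows for each group in a query according to the ORDER BY expression and then concatenates the values into a single string. With the new DISTINCT keyword, duplicate values can be removed from the specified expression before concatenation into a single string. This removes the need to create complex query processing to find the distinct values prior to using the aggregate LISTAGG function. With the DISTINCT option, the processing to remove duplicate values can be done directly within the LISTAGG function. The result is simpler, faster, more efficient SQL.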
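For the table in the original question, the 19c form might look like the sketch below (the CTE name t is a stand-in for the real table). It needs no subquery and no regex.

```sql
-- requires Oracle 19c or later; t is a hypothetical stand-in table
with t as (
  select 1 as col1, 2 as col2, 'Smith' as created_by from dual union all
  select 1, 2, 'John' from dual union all
  select 1, 3, 'Ajay' from dual union all
  select 1, 4, 'Ram'  from dual union all
  select 1, 5, 'Jack' from dual
)
select col1,
       listagg(distinct col2, ',') within group (order by col2) as col2_list
from t
group by col1;
-- col1 = 1, col2_list = 2,3,4,5
```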