Multiple array_agg() calls in a single query

use*_*281 6 sql arrays postgresql aggregate-functions

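I'm trying to accomplish something with my query, but it isn't really working. My application used to run on MongoDB, so it is used to getting arrays in a single field. We have now had to switch to Postgres, and I don't want to change my application code, so that v1 keeps working.

To get arrays in a single field in Postgres, I used the array_agg() function, and this has worked fine so far. However, I'm now at a point where I need a second array in another field, coming from a different table.

For example:

I have my employees. An employee has multiple addresses and multiple working days.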

SELECT name, age, array_agg(ad.street) FROM employees e 
JOIN address ad ON e.id = ad.employeeid
GROUP BY name, age

This works fine for me and results in, for example:

| name  | age | array_agg(ad.street)     |
| peter | 25  | {1st street, 2nd street} |

Now I want to join another table for the working days, so I do this:

SELECT name, age, array_agg(ad.street), array_agg(wd.day) FROM employees e 
JOIN address ad ON e.id = ad.employeeid 
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age

This results in:

| peter | 25 | {1st street, 1st street, 1st street, 1st street, 1st street, 2nd street, 2nd street, 2nd street, 2nd street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday,Monday,Tuesday,Wednesday,Thursday,Friday} |

But I need the result to be:

| peter | 25 | {1st street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday} |

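I understand it has to do with my joins, because the multiple joins multiply the rows, but I don't know how to fix it. Can anyone give me the right hint?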

Erw*_*ter 14

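DISTINCT is often applied to repair queries that are rotten from the inside, and that is often slow and/or incorrect. Don't multiply rows to begin with, then you don't have to sort out unwanted duplicates at the end.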
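For illustration only, this is roughly what the DISTINCT repair looks like when applied to the query from the question. It is a sketch of the workaround being advised against, not a recommendation; the table and column names are taken from the question.

```sql
-- De-duplicating after the fact: the join still produces the multiplied rows,
-- and each aggregate has to collapse them again.
SELECT e.name, e.age
     , array_agg(DISTINCT ad.street) AS streets
     , array_agg(DISTINCT wd.day)    AS days
FROM   employees e
JOIN   address     ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age;
```

Besides the extra work, this also changes the result: genuine duplicates (say, two addresses on the same street) collapse into one element, and the elements no longer come back in input order (in practice they are typically returned sorted).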

Joining to multiple n-tables ("has many") at once multiplies the rows in the result set. It works like a CROSS JOIN, a Cartesian product by proxy.
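One way to see the multiplication is to count the joined rows per employee before any array is built. A minimal sketch, assuming the tables and columns from the question:

```sql
-- Each address row is paired with each working-day row of the same employee.
SELECT e.id, e.name, count(*) AS joined_rows
FROM   employees e
JOIN   address     ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name;
```

For an employee with 2 addresses and 5 working days this counts 2 x 5 = 10 rows, which matches the 10 elements in each array of the question's output.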

There are various ways to avoid this mistake.

Aggregate first, join later

Technically, the query works as long as you join to only one table with multiple rows at a time before you aggregate:

SELECT e.id, e.name, e.age, e.streets, array_agg(wd.day) AS days
FROM  (
   SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
   FROM   employees e 
   JOIN   address  ad ON ad.employeeid = e.id
   GROUP  BY e.id    -- id is enough if it is defined as the PK
   ) e
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age, e.streets;  -- subquery columns must all be listed

It's also best to include the primary key id and GROUP BY it, because name and age are not necessarily unique. You might merge two employees by mistake.

But you can also aggregate in a subquery before you join, which is superior unless you have selective WHERE conditions on employees:

SELECT e.id, e.name, e.age, ad.streets, array_agg(wd.day) AS days
FROM   employees e 
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN   workingdays wd ON e.id = wd.employeeid
GROUP  BY e.id, e.name, e.age, ad.streets;

Or aggregate both:

SELECT name, age, ad.streets, wd.days
FROM   employees e 
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN  (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id;

The last one is typically faster if you retrieve all or most of the rows in the base tables.

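Note that using JOIN rather than LEFT JOIN removes employees that have no address or no working days from the result. That may or may not be what you want. Switch to LEFT JOIN to retain all employees in the result.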
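As a sketch of that switch, here is the "aggregate both" variant rewritten with LEFT JOIN. The COALESCE calls are an added assumption: they only matter if the application expects an empty array rather than NULL for employees without addresses or working days.

```sql
-- Every employee is kept; missing child rows yield an empty array instead of NULL.
SELECT e.name, e.age
     , COALESCE(ad.streets, '{}') AS streets
     , COALESCE(wd.days,    '{}') AS days
FROM   employees e
LEFT   JOIN (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY employeeid
   ) ad ON ad.employeeid = e.id
LEFT   JOIN (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY employeeid
   ) wd ON wd.employeeid = e.id;
```

Drop the COALESCE wrappers if NULL is an acceptable way to signal "no rows".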

Correlated subqueries / LATERAL join

For a small selection, I would consider correlated subqueries instead:

SELECT name, age
    , (SELECT array_agg(street) FROM address WHERE employeeid = e.id) AS streets
    , (SELECT array_agg(day) FROM workingdays WHERE employeeid = e.id) AS days
FROM   employees e
WHERE  e.name = 'peter';  -- very selective

Or, with Postgres 9.3 or later, you can use LATERAL joins:

SELECT e.name, e.age, a.streets, w.days
FROM   employees e
LEFT   JOIN LATERAL (
   SELECT array_agg(street) AS streets  -- no GROUP BY: always exactly one row
   FROM   address
   WHERE  employeeid = e.id
   ) a ON true
LEFT   JOIN LATERAL (
   SELECT array_agg(day) AS days
   FROM   workingdays
   WHERE  employeeid = e.id
   ) w ON true
WHERE  e.name = 'peter';  -- very selective

Either query retains all employees in the result.

  • Hi, thank you, a very clear explanation. I could go on thanking you :) (2 upvotes)