PostgreSQL equivalent of Oracle's ANY_VALUE(...) KEEP (DENSE_RANK FIRST/LAST ORDER BY ...)

Question

Wil*_*son 4 postgresql oracle aggregate group-by greatest-n-per-group

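Oracle SQL has a technique that can simplify aggregate queries:

Aggregate on one column, but pull information from a different column using a simple calculated column in the SELECT list.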

--Oracle
--For a given country, what city has the highest population? (where the country has more than one city)
--Include the city name as a column.
select
    country,
    count(*),
    max(population),
    any_value(city) keep (dense_rank first order by population desc)   --<<--
from
    cities
group by
    country
having
    count(*) > 1

db<>fiddle

As shown above, the following column brings in the city name, even though the city name is not in the GROUP BY:

 any_value(city) keep (dense_rank first order by population desc)

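There are several ways to do this kind of thing in SQL. I am looking for a PostgreSQL solution that lets me do it in a calculated column, all within a single SELECT query (no subqueries, joins, WITH, etc.).

Question: Does PostgreSQL have functionality equivalent to Oracle's ANY_VALUE(...) KEEP (DENSE_RANK FIRST/LAST ORDER BY ...)?

Related:

YouTube: The KEEP clause will keep your SQL queries simple (Oracle)
Stack Overflow: Explanation of KEEP in Oracle FIRST/LAST
db-orientation.com: ANY_VALUE and FIRST/LAST (KEEP)
DBA Stack Exchange: How to request an enhancement to PostgreSQL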

Edit:

I changed MAX() to ANY_VALUE() because I think ANY_VALUE() is easier to read.

Ties can be broken by adding , city desc to the order by, which makes the result deterministic:

any_value(city) keep (dense_rank first order by population desc, city desc)

Answer 1

Erw*_*ter 7

`first_last_agg`

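The additional module first_last_agg makes this simple. It is available from apt.postgresql.org (among other sources). Read the instructions in the Postgres Wiki. Install it once per database: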

CREATE EXTENSION first_last_agg;

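It provides two aggregate functions: first() and last().

Most hosted services do not offer the module. If you cannot install it, the next best option is to create the aggregate functions yourself, as shown in the Postgres Wiki and in my fiddle below. Or here:

Get the row with the set of last non-NULL values for each column

But the C implementation in the first_last_agg module is faster.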
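For readers who cannot install the extension, here is a minimal sketch of SQL-level first() and last() aggregates, following the pattern described in the Postgres Wiki. The helper names first_agg and last_agg are illustrative choices, and PARALLEL SAFE assumes PostgreSQL 9.6 or later. Skip it if the first_last_agg extension is already installed, since the extension defines its own first() and last().

```sql
-- Transition function for first(): keep the state ($1), ignore the new value.
-- STRICT makes the aggregate skip NULL inputs and start from the first non-NULL one.
CREATE FUNCTION first_agg(anyelement, anyelement)
  RETURNS anyelement
  LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
AS 'SELECT $1';

CREATE AGGREGATE first(anyelement) (
  SFUNC = first_agg,
  STYPE = anyelement
);

-- Transition function for last(): always replace the state with the new value ($2).
CREATE FUNCTION last_agg(anyelement, anyelement)
  RETURNS anyelement
  LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE
AS 'SELECT $2';

CREATE AGGREGATE last(anyelement) (
  SFUNC = last_agg,
  STYPE = anyelement
);
```

Because the result depends on input order, these aggregates are only meaningful when called with an ORDER BY inside the aggregate call.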

Then:

SELECT country
     , count(*) AS ct_cities
     , max(population) AS highest_population
     , last(city ORDER BY population, city) AS biggest_city  -- !
FROM   cities
GROUP  BY country
HAVING count(*) > 1;

fiddle

Same as:

 , first(city ORDER BY population DESC NULLS LAST, city DESC NULLS LAST) AS biggest_city

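Why NULLS LAST? See:

Sort by column ASC, but NULL values first?

Either variant reports the city with the greatest population, with the alphabetically last name breaking ties, like your original.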

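Without additional module

If you cannot install the additional module, and you still insist on:

all within a single SELECT query (no subqueries, joins, WITH, etc.).

DISTINCT ON combined with a window function can also do it: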

SELECT DISTINCT ON (country)
       country
     , count(*) OVER (PARTITION BY country) AS ct_cities
     , population AS highest_population
     , city AS biggest_city
FROM   cities c
ORDER  BY country, population DESC NULLS LAST, city DESC NULLS LAST;

See:

Calculate follower growth over time for each influencer

To also eliminate countries with only a single entry:

SELECT DISTINCT ON (country)
       country
     , count(*) OVER (PARTITION BY country) AS ct_cities
     , population AS highest_population
     , city AS biggest_city
FROM   cities c
WHERE  EXISTS (SELECT FROM cities c1 WHERE c1.country = c.country AND c1.ctid <> c.ctid)
ORDER  BY country, population DESC NULLS LAST, city DESC NULLS LAST;

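Use your primary key instead of ctid if you have one. See:

Is the system column "ctid" legitimate for identifying rows to delete?

If a subquery is allowed after all: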

SELECT *
FROM  (
   SELECT DISTINCT ON (country)
          country
        , count(*) OVER (PARTITION BY country) AS ct_cities
        , population AS highest_population
        , city AS biggest_city
   FROM   cities c
   ORDER  BY country, population DESC NULLS LAST, city DESC NULLS LAST
   ) sub
WHERE  ct_cities > 1;

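(array_agg(city ORDER BY population DESC NULLS LAST))[1] typically performs poorly with more than a few rows per country. Aggregating a large array only to take its first element is expensive. See the performance benchmarks:

Select first row in each GROUP BY group?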
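
For reference, a sketch of how the array_agg() expression mentioned above would fit into the question's query. It assumes the same cities table and columns (country, city, population), and the city DESC NULLS LAST tiebreaker is added here to match the other queries. It needs no extension or custom aggregate, but carries the performance caveat described above.

```sql
-- Built-in alternative: sort the cities into an array per country, then take element 1.
SELECT country
     , count(*) AS ct_cities
     , max(population) AS highest_population
     , (array_agg(city ORDER BY population DESC NULLS LAST, city DESC NULLS LAST))[1] AS biggest_city
FROM   cities
GROUP  BY country
HAVING count(*) > 1;
```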

Archived:	2 years, 8 months ago
Views:	1707
Last activity:	2 years, 7 months ago