boo*_*biq 19 postgresql window-functions group-by gaps-and-islands
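I have a table of numbers like this (status is either FREE or ASSIGNED):

    id_set  number  status
    -----------------------
    1       000001  ASSIGNED
    1       000002  FREE
    1       000003  ASSIGNED
    1       000004  FREE
    1       000005  FREE
    1       000006  ASSIGNED
    1       000007  ASSIGNED
    1       000008  FREE
    1       000009  FREE
    1       000010  FREE
    1       000011  ASSIGNED
    1       000012  ASSIGNED
    1       000013  ASSIGNED
    1       000014  FREE
    1       000015  ASSIGNED

I need to find "n" consecutive numbers, so for n = 3 the query would return

    1       000008  FREE
    1       000009  FREE
    1       000010  FREE

It should return only the first possible group for each id_set (in fact, it will only ever be executed for one id_set per query).
I was looking at WINDOW functions and tried some queries like COUNT(id_number) OVER (PARTITION BY id_set ROWS UNBOUNDED PRECEDING), but that's all I got :) I can't work out the logic for doing this in Postgres.
I was thinking about creating a virtual column with WINDOW functions that counts the preceding rows for every number where status = 'FREE', and then selecting the first number where the count equals my "n".
Or maybe group the numbers by status, but only from one ASSIGNED to the next ASSIGNED, and select only the groups containing at least "n" numbers.
EDIT
I found this query (and changed it a little bit):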
    WITH q AS
    (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY id_set, status ORDER BY number) AS rnd,
               ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS rn
        FROM numbers
    )
    SELECT id_set,
           MIN(number) AS first_number,
           MAX(number) AS last_number,
           status,
           COUNT(number) AS numbers_count
    FROM q
    GROUP BY id_set,
             rnd - rn,
             status
    ORDER BY
             first_number
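It produces groups of FREE/ASSIGNED numbers, but I would like to get all the numbers from only the first group that meets the condition.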
And*_*y M 20
This is a gaps-and-islands problem. Assuming there are no gaps or duplicates in number within the same id_set:
    WITH partitioned AS (
      SELECT
        *,
        -- ORDER BY number makes the row numbering deterministic
        number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
      FROM atable
      WHERE status = 'FREE'
    ),
    counted AS (
      SELECT
        *,
        COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
      FROM partitioned
    )
    SELECT
      id_set,
      number
    FROM counted
    WHERE cnt >= 3
    ;
Here is a SQL Fiddle demo* for this query: http://sqlfiddle.com/#!1/a2633/1.
UPDATE
To return only one set, you could add one more round of ranking:
    WITH partitioned AS (
      SELECT
        *,
        -- ORDER BY number makes the row numbering deterministic
        number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
      FROM atable
      WHERE status = 'FREE'
    ),
    counted AS (
      SELECT
        *,
        COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
      FROM partitioned
    ),
    ranked AS (
      SELECT
        *,
        RANK() OVER (ORDER BY id_set, grp) AS rnk
      FROM counted
      WHERE cnt >= 3
    )
    SELECT
      id_set,
      number
    FROM ranked
    WHERE rnk = 1
    ;
Here's a demo for this one too: http://sqlfiddle.com/#!1/a2633/2.
If you ever need to make it one set per id_set, change the RANK() call like this:
    RANK() OVER (PARTITION BY id_set ORDER BY grp) AS rnk
Additionally, you could make the query return the smallest matching set (i.e. first try to return the first group of exactly three consecutive numbers if it exists, otherwise four, five, etc.), like this:
    RANK() OVER (ORDER BY cnt, id_set, grp) AS rnk
Or like this (one per id_set):
    RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk
* The SQL Fiddle demos linked in this answer use the 9.1.8 instance, as the 9.2.1 one doesn't appear to be working at the moment.
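To see how the two orderings differ, here is a small Python sketch (not part of the original answer). The islands list is made-up data chosen so that a longer run comes before a shorter one; the keys mirror the grp and cnt columns of the query above, and first_number is a hypothetical helper field.

```python
# Hypothetical islands of FREE numbers for a single id_set:
# 1-5 (grp 0), 9-11 (grp 3) and 14-17 (grp 5).
islands = [
    {"grp": 0, "first_number": 1, "cnt": 5},
    {"grp": 3, "first_number": 9, "cnt": 3},
    {"grp": 5, "first_number": 14, "cnt": 4},
]

n = 3
# WHERE cnt >= n
qualifying = [island for island in islands if island["cnt"] >= n]

# RANK() OVER (ORDER BY grp), rnk = 1: the first run that is long enough
first_match = min(qualifying, key=lambda island: island["grp"])

# RANK() OVER (ORDER BY cnt, grp), rnk = 1: the shortest run that is long enough
smallest_match = min(qualifying, key=lambda island: (island["cnt"], island["grp"]))

print(first_match["first_number"], smallest_match["first_number"])
```

With this data the first ordering picks the run starting at 1 (five numbers long), while the second picks the run starting at 9 (exactly three).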
Erw*_*ter 10
A simple and fast variant:
    SELECT min(number) AS first_number, count(*) AS ct_free
    FROM  (
       SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
       FROM   tbl
       WHERE  status = 'FREE'
       ) x
    GROUP  BY grp
    HAVING count(*) >= 3  -- minimum length of sequence only goes here
    ORDER  BY grp
    LIMIT  1;
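Requires a gapless sequence of numbers in number (as provided in the question).
Works for any number of possible values in status besides 'FREE', even with NULL.
The key step is to subtract row_number() from number after eliminating non-qualifying rows. Consecutive numbers end up in the same grp, and grp is also guaranteed to be in ascending order.
Then you can GROUP BY grp and count the members. Since you seem to want the first occurrence, ORDER BY grp LIMIT 1 and you get the starting position and length of the sequence (which can be >= n).
To get an actual set of numbers, don't look up the table another time. It is much cheaper with generate_series():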
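The grouping step can be traced by hand outside the database. Below is a minimal Python sketch, not part of the original answer, that replays the same logic on the sample data from the question; the function name first_free_run is made up for illustration, and number is assumed to be an integer.

```python
# Sample data from the question for a single id_set: (number, status).
rows = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]


def first_free_run(rows, n):
    """Return (first_number, length) of the first run of >= n FREE numbers."""
    # WHERE status = 'FREE', numbered in ORDER BY number
    free = sorted(number for number, status in rows if status == "FREE")

    groups = {}
    # row_number() starts at 1; number - row_number stays constant within a run
    for row_number, number in enumerate(free, start=1):
        groups.setdefault(number - row_number, []).append(number)

    # GROUP BY grp HAVING count(*) >= n ORDER BY grp LIMIT 1
    for grp in sorted(groups):
        if len(groups[grp]) >= n:
            return min(groups[grp]), len(groups[grp])
    return None


print(first_free_run(rows, 3))  # (8, 3)
```

The FREE numbers 2, 4, 5, 8, 9, 10, 14 get row numbers 1 to 7, so the differences are 1, 2, 2, 4, 4, 4, 7: each run shares one value, and the values only grow from one run to the next.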
    SELECT generate_series(first_number, first_number + ct_free - 1)
        -- generate_series(first_number, first_number + 3 - 1) -- only 3
    FROM  (
       SELECT min(number) AS first_number, count(*) AS ct_free
       FROM  (
          SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
          FROM   tbl
          WHERE  status = 'FREE'
          ) x
       GROUP  BY grp
       HAVING count(*) >= 3
       ORDER  BY grp
       LIMIT  1
       ) y;
If you actually want a string with leading zeros like you display in the example values, use to_char() with the FM (fill mode) modifier:
    SELECT to_char(generate_series(8, 11), 'FM000000')
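SQL Fiddle with an extended test case and both queries.
Closely related answer:
Here's a fairly generic way of doing this.
Bear in mind that it depends on your number column being consecutive. If it isn't, a window function and/or CTE type solution will probably be required: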
    SELECT
        number
    FROM
        mytable m
    CROSS JOIN
        (SELECT 3 AS consec) x
    WHERE
        EXISTS
            (SELECT 1
             FROM mytable
             WHERE number = m.number - x.consec + 1
               AND status = 'FREE')
        AND NOT EXISTS
            (SELECT 1
             FROM mytable
             WHERE number BETWEEN m.number - x.consec + 1 AND m.number
               AND status = 'ASSIGNED')
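As a rough illustration (not from the original answer), the two subqueries can be mirrored in Python on the question's sample data. The function name ends_of_free_runs is made up, number is assumed to be an integer, and only a single id_set is considered.

```python
# Sample data from the question for a single id_set: (number, status).
rows = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]


def ends_of_free_runs(rows, consec):
    """Return every number that closes a window of `consec` non-ASSIGNED rows."""
    status_by_number = dict(rows)
    result = []
    for number in sorted(status_by_number):
        start = number - consec + 1
        # EXISTS: there is a FREE row at number - consec + 1
        starts_free = status_by_number.get(start) == "FREE"
        # NOT EXISTS: no ASSIGNED row between start and number (inclusive)
        none_assigned = all(
            status_by_number.get(k) != "ASSIGNED"
            for k in range(start, number + 1)
        )
        if starts_free and none_assigned:
            result.append(number)
    return result


print(ends_of_free_runs(rows, 3))  # [10]
```

Two things become visible here: the row that qualifies is the last number of the window (10 for the run 8 to 10), and a missing number inside the window is not detected, which is why the approach relies on number being consecutive.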
This will return only the first number of each run of 3. It does not require the values of number to be consecutive. Tested at SQL-Fiddle:
    WITH cte3 AS
    ( SELECT
          *,
          COUNT(CASE WHEN status = 'FREE' THEN 1 END)
              OVER (PARTITION BY id_set ORDER BY number
                    ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
            AS cnt
      FROM atable
    )
    SELECT
        id_set, number
    FROM cte3
    WHERE cnt = 3 ;
This will show all the numbers (where there are 3 or more consecutive 'FREE' positions):
    WITH cte3 AS
    ( SELECT
          *,
          COUNT(CASE WHEN status = 'FREE' THEN 1 END)
              OVER (PARTITION BY id_set ORDER BY number
                    ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
            AS cnt
      FROM atable
    )
    , cte4 AS
    ( SELECT
          *,
          MAX(cnt)
              OVER (PARTITION BY id_set ORDER BY number
                    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
            AS maxcnt
      FROM cte3
    )
    SELECT
        id_set, number
    FROM cte4
    WHERE maxcnt >= 3 ;
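The two window frames are easier to follow when replayed step by step. The sketch below is an illustration only, not the answer's code: the function names are made up, and it handles a single id_set, using the sample data from the question.

```python
# Sample data from the question for a single id_set: (number, status).
rows = [
    (1, "ASSIGNED"), (2, "FREE"), (3, "ASSIGNED"), (4, "FREE"), (5, "FREE"),
    (6, "ASSIGNED"), (7, "ASSIGNED"), (8, "FREE"), (9, "FREE"), (10, "FREE"),
    (11, "ASSIGNED"), (12, "ASSIGNED"), (13, "ASSIGNED"), (14, "FREE"),
    (15, "ASSIGNED"),
]


def window_counts(statuses, size):
    """cnt: FREE rows among the current row and the (size - 1) following rows."""
    return [
        sum(1 for status in statuses[i:i + size] if status == "FREE")
        for i in range(len(statuses))
    ]


def free_runs(rows, size=3):
    """Return (run starts, all numbers covered by a run of `size` FREE rows)."""
    ordered = sorted(rows)  # ORDER BY number
    numbers = [number for number, _ in ordered]
    cnt = window_counts([status for _, status in ordered], size)

    # First query: WHERE cnt = size
    starts = [number for number, c in zip(numbers, cnt) if c == size]

    # Second query: maxcnt is MAX(cnt) over the current row and the
    # (size - 1) preceding rows; keep rows WHERE maxcnt >= size
    covered = [
        number
        for i, number in enumerate(numbers)
        if max(cnt[max(0, i - size + 1):i + 1]) >= size
    ]
    return starts, covered


print(free_runs(rows))  # ([8], [8, 9, 10])
```

Because the frames are counted in rows rather than in values of number, the rows only need to be ordered, not gapless.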