Find "n" consecutive free numbers from a table

boo*_*biq 19 postgresql window-functions group-by gaps-and-islands

I have a table of numbers like this (status is either FREE or ASSIGNED):

id_set  number  status
------------------------
1       000001  ASSIGNED
1       000002  FREE
1       000003  ASSIGNED
1       000004  FREE
1       000005  FREE
1       000006  ASSIGNED
1       000007  ASSIGNED
1       000008  FREE
1       000009  FREE
1       000010  FREE
1       000011  ASSIGNED
1       000012  ASSIGNED
1       000013  ASSIGNED
1       000014  FREE
1       000015  ASSIGNED

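I need to find "n" consecutive numbers, so for n = 3 the query would return:

1       000008  FREE
1       000009  FREE
1       000010  FREE

It should return only the first possible group for each id_set (in practice, each query will only ever be run for a single id_set).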
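For anyone who wants to reproduce this, a minimal setup might look like the sketch below. The table name numbers matches the query further down; the integer type for number, the text type for status, and the primary key are assumptions, not something stated above.

```sql
-- Assumed schema: column types and the primary key are guesses for illustration.
CREATE TABLE numbers (
    id_set integer NOT NULL,
    number integer NOT NULL,
    status text    NOT NULL,
    PRIMARY KEY (id_set, number)
);

-- The 15 sample rows from the table shown in the question.
INSERT INTO numbers (id_set, number, status) VALUES
    (1,  1, 'ASSIGNED'), (1,  2, 'FREE'),     (1,  3, 'ASSIGNED'),
    (1,  4, 'FREE'),     (1,  5, 'FREE'),     (1,  6, 'ASSIGNED'),
    (1,  7, 'ASSIGNED'), (1,  8, 'FREE'),     (1,  9, 'FREE'),
    (1, 10, 'FREE'),     (1, 11, 'ASSIGNED'), (1, 12, 'ASSIGNED'),
    (1, 13, 'ASSIGNED'), (1, 14, 'FREE'),     (1, 15, 'ASSIGNED');

-- Sanity check: seven of the sample rows are FREE.
SELECT count(*) AS free_rows
FROM   numbers
WHERE  status = 'FREE';
```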

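I have been looking at WINDOW functions and tried some queries like COUNT(id_number) OVER (PARTITION BY id_set ROWS UNBOUNDED PRECEDING), but that is as far as I got :) I can't work out the logic for how to do this in Postgres.

I was thinking of using WINDOW functions to create a virtual column that counts the preceding rows for every number where status = 'FREE', and then selecting the first number where that count equals my "n".

Or maybe group the numbers by status, but only from one ASSIGNED to the next ASSIGNED, and select only the groups containing at least "n" numbers.

EDIT

I found this query (and changed it a little):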

WITH q AS
(
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY id_set, status ORDER BY number) AS rnd,
         ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS rn
  FROM numbers
)
SELECT id_set,
       MIN(number) AS first_number,
       MAX(number) AS last_number,
       status,
       COUNT(number) AS numbers_count
FROM q
GROUP BY id_set,
         rnd - rn,
         status
ORDER BY
     first_number

It produces groups of FREE/ASSIGNED numbers, but I want all the numbers from only the first group that meets the condition.

SQL Fiddle

And*_*y M 20

This is a gaps-and-islands problem. Assuming there are no gaps or duplicates within the same id_set:

WITH partitioned AS (
  SELECT
    *,
    number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
)
SELECT
  id_set,
  number
FROM counted
WHERE cnt >= 3
;

Here is a SQL Fiddle demo* for this query: http://sqlfiddle.com/#!1/a2633/1
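To see why subtracting the row number produces a group key, it may help to look at the intermediate values. The sketch below inlines the question's sample data as a CTE named atable (integer values for number are an assumption) and prints rn and grp for each FREE row.

```sql
WITH atable (id_set, number, status) AS (
  VALUES (1,  1, 'ASSIGNED'), (1,  2, 'FREE'),     (1,  3, 'ASSIGNED'),
         (1,  4, 'FREE'),     (1,  5, 'FREE'),     (1,  6, 'ASSIGNED'),
         (1,  7, 'ASSIGNED'), (1,  8, 'FREE'),     (1,  9, 'FREE'),
         (1, 10, 'FREE'),     (1, 11, 'ASSIGNED'), (1, 12, 'ASSIGNED'),
         (1, 13, 'ASSIGNED'), (1, 14, 'FREE'),     (1, 15, 'ASSIGNED')
)
SELECT
  id_set,
  number,
  ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS rn,
  number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
FROM atable
WHERE status = 'FREE'
ORDER BY id_set, number;

-- number | rn | grp
-- -------+----+-----
--      2 |  1 |   1
--      4 |  2 |   2
--      5 |  3 |   2
--      8 |  4 |   4
--      9 |  5 |   4
--     10 |  6 |   4
--     14 |  7 |   7
```

Within a run of consecutive numbers, number and rn both increase by 1, so their difference stays constant. Only grp = 4 has three members, which is the 8, 9, 10 group the question asks for.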

UPDATE

To return only one group, you can add one more round of ranking:

WITH partitioned AS (
  SELECT
    *,
    number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT
    *,
    COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
),
ranked AS (
  SELECT
    *,
    RANK() OVER (ORDER BY id_set, grp) AS rnk
  FROM counted
  WHERE cnt >= 3
)
SELECT
  id_set,
  number
FROM ranked
WHERE rnk = 1
;

Here is a demo for this one too: http://sqlfiddle.com/#!1/a2633/2

If you need one group per id_set, change the RANK() call like this:

RANK() OVER (PARTITION BY id_set ORDER BY grp) AS rnk

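In addition, you can make the query return the smallest matching group (that is, first try to return the first group of exactly three consecutive numbers if one exists, otherwise four, five, and so on), like this: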

RANK() OVER (ORDER BY cnt, id_set, grp) AS rnk

Or like this (one per id_set):

RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk
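Put together, the last variant might look like the following sketch. The data is inlined so the statement runs on its own: id_set 1 is the question's sample, while id_set 2 is invented here to show the "smallest group first" behaviour. The window includes ORDER BY number so the row numbering is deterministic.

```sql
WITH atable (id_set, number, status) AS (
  VALUES (1,  1, 'ASSIGNED'), (1,  2, 'FREE'),     (1,  3, 'ASSIGNED'),
         (1,  4, 'FREE'),     (1,  5, 'FREE'),     (1,  6, 'ASSIGNED'),
         (1,  7, 'ASSIGNED'), (1,  8, 'FREE'),     (1,  9, 'FREE'),
         (1, 10, 'FREE'),     (1, 11, 'ASSIGNED'), (1, 12, 'ASSIGNED'),
         (1, 13, 'ASSIGNED'), (1, 14, 'FREE'),     (1, 15, 'ASSIGNED'),
         -- Invented second set: a run of four (1-4), then a run of three (6-8).
         (2,  1, 'FREE'),     (2,  2, 'FREE'),     (2,  3, 'FREE'),
         (2,  4, 'FREE'),     (2,  5, 'ASSIGNED'), (2,  6, 'FREE'),
         (2,  7, 'FREE'),     (2,  8, 'FREE')
),
partitioned AS (
  SELECT *,
         number - ROW_NUMBER() OVER (PARTITION BY id_set ORDER BY number) AS grp
  FROM atable
  WHERE status = 'FREE'
),
counted AS (
  SELECT *,
         COUNT(*) OVER (PARTITION BY id_set, grp) AS cnt
  FROM partitioned
),
ranked AS (
  -- Smallest qualifying group first, earliest group as the tie-breaker.
  SELECT *,
         RANK() OVER (PARTITION BY id_set ORDER BY cnt, grp) AS rnk
  FROM counted
  WHERE cnt >= 3
)
SELECT id_set, number
FROM ranked
WHERE rnk = 1
ORDER BY id_set, number;

-- id_set 1: 8, 9, 10
-- id_set 2: 6, 7, 8   (the run of three wins over the earlier run of four)
```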

* The SQL Fiddle demos linked in this answer use the 9.1.8 instance because the 9.2.1 instance does not seem to be working at the moment.


Erw*_*ter 10

A simple and fast variant:

SELECT min(number) AS first_number, count(*) AS ct_free
FROM (
    SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
    FROM   tbl
    WHERE  status = 'FREE'
    ) x
GROUP  BY grp
HAVING count(*) >= 3  -- minimum length of sequence only goes here
ORDER  BY grp
LIMIT  1;
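  • Requires a gapless sequence of numbers in number (as provided in the question).

  • Works for any number of possible values in status besides 'FREE', even with NULL.

  • The key step is to subtract row_number() from number after eliminating non-qualifying rows. Consecutive numbers end up in the same grp, and grp is also guaranteed to be in ascending order.

  • Then you can GROUP BY grp and count the members. Since you seem to want the first occurrence, ORDER BY grp LIMIT 1 and you get the starting position and the length of the sequence (which can be >= n).

Set of rows

To get an actual set of numbers, don't query the table a second time. generate_series() is much cheaper: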
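The steps above can be parameterized, for example with a prepared statement. This is only a sketch: the name first_free_run is made up, and the sample data is inlined as a CTE named tbl so the statement runs on its own. Because the query groups by grp alone, the sketch filters on a single id_set first, which matches the question's remark that each query runs for one id_set.

```sql
-- Hypothetical prepared statement: $1 = id_set, $2 = minimum run length n.
PREPARE first_free_run (int, int) AS
WITH tbl (id_set, number, status) AS (
  VALUES (1,  1, 'ASSIGNED'), (1,  2, 'FREE'),     (1,  3, 'ASSIGNED'),
         (1,  4, 'FREE'),     (1,  5, 'FREE'),     (1,  6, 'ASSIGNED'),
         (1,  7, 'ASSIGNED'), (1,  8, 'FREE'),     (1,  9, 'FREE'),
         (1, 10, 'FREE'),     (1, 11, 'ASSIGNED'), (1, 12, 'ASSIGNED'),
         (1, 13, 'ASSIGNED'), (1, 14, 'FREE'),     (1, 15, 'ASSIGNED')
)
SELECT min(number) AS first_number, count(*) AS ct_free
FROM (
    SELECT number, number - row_number() OVER (ORDER BY number) AS grp
    FROM   tbl
    WHERE  status = 'FREE'
    AND    id_set = $1          -- one id_set per call
    ) x
GROUP  BY grp
HAVING count(*) >= $2           -- minimum length of sequence
ORDER  BY grp
LIMIT  1;

EXECUTE first_free_run(1, 3);   -- first_number = 8, ct_free = 3
EXECUTE first_free_run(1, 2);   -- first_number = 4, ct_free = 2
```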

SELECT generate_series(first_number, first_number + ct_free - 1)
    -- generate_series(first_number, first_number + 3 - 1) -- only 3
FROM  (
   SELECT min(number) AS first_number, count(*) AS ct_free
   FROM  (
      SELECT *, number - row_number() OVER (PARTITION BY id_set ORDER BY number) AS grp
      FROM   tbl
      WHERE  status = 'FREE'
      ) x
   GROUP  BY grp
   HAVING count(*) >= 3
   ORDER  BY grp
   LIMIT  1
   ) y;

If you actually want a string with leading zeros, as displayed in your example values, use to_char() with the FM (fill mode) modifier:

SELECT to_char(generate_series(8, 11), 'FM000000')
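If number is actually stored as text with leading zeros (an assumption; the question only shows the display format), the arithmetic would need a cast to integer first. A small sketch with three invented text rows, converting back to the padded form at the end:

```sql
WITH tbl (id_set, number, status) AS (
   -- Text values with leading zeros; this storage format is assumed.
   VALUES (1, '000008', 'FREE'), (1, '000009', 'FREE'), (1, '000010', 'FREE')
)
SELECT to_char(min(number::int), 'FM000000') AS first_number
     , count(*) AS ct_free
FROM  (
   SELECT number
        , number::int - row_number() OVER (ORDER BY number::int) AS grp
   FROM   tbl
   WHERE  status = 'FREE'
   ) x
GROUP  BY grp;

-- first_number = '000008', ct_free = 3
```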

SQL Fiddle with an extended test case and both queries.



JNK*_*JNK 8

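Here is a fairly generic way to do this.

Keep in mind that it relies on your number column being consecutive. If it is not, a window function and/or CTE-based solution will probably be needed: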

SELECT 
    number
FROM
    mytable m
CROSS JOIN
   (SELECT 3 AS consec) x
WHERE 
    EXISTS
       (SELECT 1 
        FROM mytable
        WHERE number = m.number - x.consec + 1
        AND status = 'FREE')
    AND NOT EXISTS
       (SELECT 1 
        FROM mytable
        WHERE number BETWEEN m.number - x.consec + 1 AND m.number
        AND status = 'ASSIGNED')

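  • I took the liberty of fixing the syntax for Postgres. The first `EXISTS` could be simplified. Since we only need to make sure that *any* row n places earlier exists, we can drop `AND status = 'FREE'`. I would also change the condition in the second `EXISTS` to `status <> 'FREE'` to harden it against options added in the future.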
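With both suggestions from this comment applied, the query might look like the sketch below. The sample data from the question is inlined as a CTE named mytable (integer values for number are an assumption) so the statement runs on its own.

```sql
WITH mytable (id_set, number, status) AS (
  VALUES (1,  1, 'ASSIGNED'), (1,  2, 'FREE'),     (1,  3, 'ASSIGNED'),
         (1,  4, 'FREE'),     (1,  5, 'FREE'),     (1,  6, 'ASSIGNED'),
         (1,  7, 'ASSIGNED'), (1,  8, 'FREE'),     (1,  9, 'FREE'),
         (1, 10, 'FREE'),     (1, 11, 'ASSIGNED'), (1, 12, 'ASSIGNED'),
         (1, 13, 'ASSIGNED'), (1, 14, 'FREE'),     (1, 15, 'ASSIGNED')
)
SELECT m.number
FROM   mytable m
CROSS  JOIN (SELECT 3 AS consec) x
WHERE  EXISTS (
          -- A row exactly consec - 1 places earlier must exist, whatever its status.
          SELECT 1
          FROM   mytable
          WHERE  number = m.number - x.consec + 1)
AND    NOT EXISTS (
          -- No row in the window may be anything other than FREE.
          SELECT 1
          FROM   mytable
          WHERE  number BETWEEN m.number - x.consec + 1 AND m.number
          AND    status <> 'FREE');

-- Returns 10 for the sample data.
```

Note that this form returns the last number of each qualifying window (10 for the run 8, 9, 10), not the first.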

ype*_*eᵀᴹ 5

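This returns only the first number of each group of 3 consecutive 'FREE' rows. It does not require the values of number to be consecutive. Tested at SQL-Fiddle: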

WITH cte3 AS
( SELECT
    *,
    COUNT(CASE WHEN status = 'FREE' THEN 1 END) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
      AS cnt
  FROM atable
)
SELECT
  id_set, number
FROM cte3
WHERE cnt = 3 ;

This will show all the numbers (where there are 3 or more consecutive 'FREE' positions):

WITH cte3 AS
( SELECT
    *,
    COUNT(CASE WHEN status = 'FREE' THEN 1 END) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
      AS cnt
  FROM atable
)
, cte4 AS
( SELECT
    *, 
    MAX(cnt) 
        OVER (PARTITION BY id_set ORDER BY number
              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
      AS maxcnt
  FROM cte3
)
SELECT
  id_set, number
FROM cte4
WHERE maxcnt >= 3 ;
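The first query returns one row for every window of three 'FREE' rows, so a run of four yields two starting numbers. To keep only the earliest start per id_set, PostgreSQL's DISTINCT ON could be layered on top, as in this sketch. The sample data is invented, with deliberate gaps in number to show that only row order matters.

```sql
WITH atable (id_set, number, status) AS (
  -- Invented data with gaps: a run of four FREE rows at 30, 40, 50, 60.
  VALUES (1, 10, 'FREE'), (1, 20, 'ASSIGNED'), (1, 30, 'FREE'),
         (1, 40, 'FREE'), (1, 50, 'FREE'),     (1, 60, 'FREE'),
         (1, 70, 'ASSIGNED')
),
cte3 AS (
  SELECT *,
         COUNT(CASE WHEN status = 'FREE' THEN 1 END)
           OVER (PARTITION BY id_set ORDER BY number
                 ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING) AS cnt
  FROM atable
)
-- Without DISTINCT ON, both 30 and 40 qualify as starting points.
SELECT DISTINCT ON (id_set)
       id_set, number
FROM   cte3
WHERE  cnt = 3
ORDER  BY id_set, number;

-- Returns (1, 30).
```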