如何将一组扁平树变成一棵具有多个叶子的树?

Ble*_*der 5 postgresql tree cte json recursive

我们有这个漂亮的 Postgres 树生成器。然而,它会产生一棵树的切口,而不是一次产生整棵树:

item_id jsonb_pretty
1   {
    "title": "PARENT",
    "item_id": 1,
    "children": {
        "title": "LEVEL 2",
        "item_id": 2,
        "children": {
            "title": "LEVEL 3.2",
            "item_id": 6
        }
    }
}
1   {
    "title": "PARENT",
    "item_id": 1,
    "children": {
        "title": "LEVEL 2",
        "item_id": 2,
        "children": {
            "title": "LEVEL 3.1",
            "item_id": 3,
            "children": {
                "title": "LEVEL 4.1",
                "item_id": 4
            }
        }
    }
}
1   {
    "title": "PARENT",
    "item_id": 1,
    "children": {
        "title": "LEVEL 2",
        "item_id": 2,
        "children": {
            "title": "LEVEL 3.1",
            "item_id": 3,
            "children": {
                "title": "LEVEL 4.2",
                "item_id": 5
            }
        }
    }
}
Run Code Online (Sandbox Code Playgroud)

我想用这样的数组从中获取一个树对象:


1   {
    "title": "PARENT",
    "item_id": 1,
    "children": [{
        "title": "LEVEL 2",
        "item_id": 2,
        "children": [{
            "title": "LEVEL 3.2",
            "item_id": 6
        },
        {
            "title": "LEVEL 3.1",
            "item_id": 3,
            "children": [{
                "title": "LEVEL 4.1",
                "item_id": 4
            },
            {
                "title": "LEVEL 4.2",
                "item_id": 5
            }]
        }]
    }]
}
Run Code Online (Sandbox Code Playgroud)

这是生成器:

CREATE TABLE items (
  item_id     serial PRIMARY KEY,
  title text
);
CREATE TABLE joins (
  id          serial PRIMARY KEY,
  item_id     int,
  child_id    int
);
INSERT INTO items (item_id,title) VALUES
  (1,'PARENT'),
  (2,'LEVEL 2'),
  (3,'LEVEL 3.1'),
  (4,'LEVEL 4.1'),
  (5,'LEVEL 4.2'),
  (6,'LEVEL 3.2');
INSERT INTO joins (item_id, child_id) VALUES
  (1,2),
  (2,3),
  (3,4),
  (3,5),
  (2,6);

WITH RECURSIVE t(item_id, json, level) AS (
        SELECT item_id, to_jsonb(items), 1
        FROM items
        WHERE NOT EXISTS (
                SELECT 2
                FROM joins
                WHERE items.item_id = joins.item_id
        )
        UNION ALL
        SELECT parent.item_id, to_jsonb(parent) || jsonb_build_object( 'children', t.json ),
               level + 1
        FROM t
        JOIN joins AS j
                ON t.item_id = j.child_id
        JOIN items AS parent
                ON j.item_id = parent.item_id
        WHERE level < 7
)
SELECT item_id, jsonb_pretty(json)
FROM t
WHERE item_id = 1;
Run Code Online (Sandbox Code Playgroud)

我如何更改这样的生成器以在 Postgres 13+ 中生成一棵树?

Erw*_*ter 6

递归 CTE ( rCTE) 是跟踪路径并识别树的所有节点的正确工具。但它不允许递归项中的聚合。有一个优雅的替代方案:

递归函数

CREATE OR REPLACE FUNCTION f_item_tree(_item_id int, _level int = 7)
  RETURNS jsonb
  LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
SELECT CASE WHEN _level > 1
            THEN jsonb_agg(sub)
            ELSE CASE WHEN count(*) > 0 THEN to_jsonb(count(*) || ' - stopped recursion at max level') END
       END
FROM  (
   SELECT i.*, CASE WHEN _level > 1 THEN f_item_tree(i.item_id, _level - 1) END AS children
   FROM   joins j
   JOIN   items i ON i.item_id = j.child_id
   WHERE  j.item_id = _item_id
   ORDER  BY i.item_id
   ) sub
$func$;
Run Code Online (Sandbox Code Playgroud)

db<>在这里摆弄

使用默认最大级别 (7) 进行调用:

SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM  (
   SELECT *, f_item_tree(item_id) AS children
   FROM   items
   WHERE  item_id = 1  -- root item_id HERE
   ) sub;
Run Code Online (Sandbox Code Playgroud)

产生您想要的结果:

CREATE OR REPLACE FUNCTION f_item_tree(_item_id int, _level int = 7)
  RETURNS jsonb
  LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
SELECT CASE WHEN _level > 1
            THEN jsonb_agg(sub)
            ELSE CASE WHEN count(*) > 0 THEN to_jsonb(count(*) || ' - stopped recursion at max level') END
       END
FROM  (
   SELECT i.*, CASE WHEN _level > 1 THEN f_item_tree(i.item_id, _level - 1) END AS children
   FROM   joins j
   JOIN   items i ON i.item_id = j.child_id
   WHERE  j.item_id = _item_id
   ORDER  BY i.item_id
   ) sub
$func$;
Run Code Online (Sandbox Code Playgroud)

只需 3 个级别即可调用:

SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM  (
   SELECT *, f_item_tree(item_id, 3) AS children  -- max level HERE
   FROM   items
   WHERE  item_id = 1  -- root item_id HERE
   ) sub;
Run Code Online (Sandbox Code Playgroud)

请注意递归提前停止时的特殊输出:

SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM  (
   SELECT *, f_item_tree(item_id) AS children
   FROM   items
   WHERE  item_id = 1  -- root item_id HERE
   ) sub;
Run Code Online (Sandbox Code Playgroud)

jsonb_pretty()当然,外部是可选的。

您可以将外部调用包装到另一个函数中或将其塞入同一个函数中,可能使用 PL/pgSQL 并使用IF. 这是品味和风格的问题。我在这里保持简单。

这是实际的递归,而rCTE 实际上只是迭代。您需要理解这个概念才能理解该函数:它调用自身。_level每次调用时减少最大值最终会终止递归。

我为最后一个递归级别构建了特殊行为:如果最大级别切断更多子级,该函数会插入一个计数和一个提示。

(可选!)调用jsonb_strip_nulls()...

递归地从给定 JSON 值中删除具有 null 值的所有对象字段。

删除空子字段(带有NULL值)。如果您想保留其他带有 NULL 值的字段,请考虑jsonb_set_lax()在函数中进行替代,就像下面使用纯 SQL 解决方案一样。

理想情况下,您有一个索引joins(item_id, child_id)- 或者该索引隐式附带的UNIQUE约束。(item_id, child_id)

有关的:

普通 SQL

虽然只有一手的关卡,但您还可以通过聚合拼出一系列子查询或非递归 CTE。

查询变得相当冗长,但与原始查询相比,性能仍然不错。这在很大程度上是因为我恢复了最初的查询方法。由于您对给定的根 ID 感兴趣,因此从根开始而不是从叶子开始应该会更有效- 以作为隐喻。这样,我们只选择属于树的一部分的行。

按照您的方式,整个表的所有行都会被处理,只是为了最终过滤来自给定根的一棵树。那是浪费了很多精力。

另外,我在选择所有相关节点后加入items一次,而不是像您那样每次都加入递归术语。

选择树的所有节点后,聚合需要再次自上而下,从叶到根,因为每个较低级别都会集成来自下一个较高级别的聚合数组。

查询最多 4 个级别以覆盖您的测试数据:

WITH RECURSIVE path AS (
   SELECT ARRAY[item_id, child_id] AS path, child_id AS item_id, 2 AS lvl
   FROM   joins
   WHERE  item_id = 1  -- root item_id HERE
   
   UNION ALL
   SELECT p.path || j.child_id, j.child_id, p.lvl + 1
   FROM   path  p
   JOIN   joins j ON j.item_id = p.item_id
   WHERE  p.lvl < 7  -- HARD limit
   )
, tree AS (
   SELECT p.*, to_jsonb(i) AS item
   FROM   path p
   JOIN   items i USING (item_id)
   ORDER  BY lvl, item_id
   )
, lvl4 AS (  -- leaf!
   SELECT path[3] AS item_id, jsonb_agg(item) AS children
   FROM   tree
   WHERE  lvl = 4
   GROUP  BY 1
   )
,  lvl3 AS (
   SELECT path[2] AS item_id
        , jsonb_agg(jsonb_set_lax(item, '{children}', children, true, 'return_target')) AS children
   FROM   tree t
   LEFT   JOIN lvl4 USING (item_id)
   WHERE  lvl = 3
   GROUP  BY 1
   )
,  lvl2 AS ( -- stem!
   SELECT jsonb_agg(jsonb_set_lax(item, '{children}', children, true, 'return_target')) AS children
   FROM   tree t
   LEFT   JOIN lvl3 USING (item_id)
   WHERE  lvl = 2
   )
SELECT i.item_id
     , jsonb_pretty(jsonb_set_lax(to_jsonb(i), '{children}', children, true, 'return_target')) AS tree
FROM   items i
LEFT   JOIN lvl2 ON true
WHERE  item_id = 1;  -- root item_id HERE
Run Code Online (Sandbox Code Playgroud)

产生您想要的结果。也适用于较少的级别。或者(与您原来的不同)甚至是根本没有任何子项的裸根项目。

如何扩展查询以允许更多级别应该是显而易见的。我添加了7 个级别的版本,以适应您的WHERE level < 7需求,以防万一:

db<>在这里摆弄