Postgres recursive query with row_to_json

Ron*_*yek 6 sql postgresql recursion json postgresql-json

I have a table in Postgres 9.3.5 that looks like this:

CREATE TABLE customer_area_node
(
  id bigserial NOT NULL,
  customer_id integer NOT NULL,
  parent_id bigint,
  name text,
  description text,

  CONSTRAINT customer_area_node_pkey PRIMARY KEY (id)
)

I query it with:

WITH RECURSIVE c AS (
       SELECT *, 0 as level, name as path FROM customer_area_node WHERE customer_id = 2 and parent_id is null
       UNION ALL
       SELECT customer_area_node.*, 
       c.level + 1 as level, 
       c.path || '/' || customer_area_node.name as path
  FROM customer_area_node 
  join c ON customer_area_node.parent_id = c.id
)
SELECT * FROM c ORDER BY path;

This seems to work for building paths like building1/floor1/room1, building1/floor1/room2, and so on.

What I'd like to be able to do is easily turn that into JSON representing the tree structure, which I've been told I can do with row_to_json.

As a reasonable alternative, I'd accept any other way of formatting the data that lets me convert it into an actual tree structure easily, without a ton of string.splits on /.

Is there a reasonably easy way to do this with row_to_json?
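
For the client-side alternative mentioned above, one option is to skip the path strings entirely and nest the rows by parent_id. The following is only a minimal Python sketch, assuming the rows arrive as dicts with id and parent_id keys; build_tree and the sample rows are hypothetical names and data, not part of the original question.

```python
from collections import defaultdict


def build_tree(rows):
    """Nest flat rows (dicts with 'id' and 'parent_id') into a list of root nodes."""
    # Index every row under its parent's id; roots end up under the key None.
    children_of = defaultdict(list)
    for row in rows:
        children_of[row["parent_id"]].append(row)

    def attach(row):
        # Copy the row so the input stays untouched, then recurse into its children.
        node = dict(row)
        node["children"] = [attach(child) for child in children_of.get(row["id"], [])]
        return node

    return [attach(root) for root in children_of.get(None, [])]


# Hypothetical rows, shaped like SELECT id, parent_id, name FROM customer_area_node
rows = [
    {"id": 1, "parent_id": None, "name": "building1"},
    {"id": 2, "parent_id": 1, "name": "floor1"},
    {"id": 3, "parent_id": 2, "name": "room1"},
    {"id": 4, "parent_id": 2, "name": "room2"},
]
tree = build_tree(rows)
```

Because the nesting is driven by parent_id alone, the depth of each node and the number of roots do not matter here, and no path has to be split.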

poz*_*ozs 7

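You cannot do this with an ordinary top-down recursive CTE, because it is almost impossible to set a json value deep inside its hierarchy. But you can do it the other way round: build the tree starting from its leaves, up to its root: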

-- calculate node levels
WITH RECURSIVE c AS (
    SELECT *, 0 as lvl
    FROM customer_area_node
    -- use parameters here, to select the root first
    WHERE customer_id = 2 AND parent_id IS NULL
  UNION ALL
    SELECT customer_area_node.*, c.lvl + 1 as lvl
    FROM customer_area_node 
    JOIN c ON customer_area_node.parent_id = c.id
),
-- select max level
maxlvl AS (
  SELECT max(lvl) maxlvl FROM c
),
-- accumulate children
j AS (
    SELECT c.*, json '[]' children -- at max level, there are only leaves
    FROM c, maxlvl
    WHERE lvl = maxlvl
  UNION ALL
    -- a little hack, because PostgreSQL doesn't like aggregated recursive terms
    SELECT (c).*, array_to_json(array_agg(j)) children
    FROM (
      SELECT c, j
      FROM j
      JOIN c ON j.parent_id = c.id
    ) v
    GROUP BY v.c
)
-- select only root
SELECT row_to_json(j) json_tree
FROM j
WHERE lvl = 0;

This works even on PostgreSQL 9.2+.

SQLFiddle
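
The query above seeds the accumulation only with nodes at the maximum level, so a leaf that sits at a shallower level never enters it. The Python sketch below is a simplified model of that iteration, not the SQL itself; the function name and the sample rows are hypothetical, and it assumes a single root with every row reachable from it.

```python
def bottom_up_from_max_level(rows):
    """Model of the level-seeded build: start at the deepest level, aggregate upward."""
    by_id = {r["id"]: r for r in rows}

    def level(r):
        # Depth of a node: 0 for the root, parent's depth + 1 otherwise.
        return 0 if r["parent_id"] is None else level(by_id[r["parent_id"]]) + 1

    maxlvl = max(level(r) for r in rows)
    # Non-recursive term: only the nodes at the deepest level, with no children.
    working = [dict(r, lvl=maxlvl, children=[]) for r in rows if level(r) == maxlvl]
    result = list(working)
    while working:
        # Recursive term: group the rows of the previous iteration by their parent.
        grouped = {}
        for node in working:
            if node["parent_id"] is not None:
                grouped.setdefault(node["parent_id"], []).append(node)
        working = [
            dict(by_id[pid], lvl=level(by_id[pid]), children=kids)
            for pid, kids in grouped.items()
        ]
        result.extend(working)
    # Final SELECT: keep only the root.
    return [n for n in result if n["lvl"] == 0]


# Hypothetical data: "lobby" is a leaf at level 1, while the deepest level is 2.
rows = [
    {"id": 1, "parent_id": None, "name": "building1"},
    {"id": 2, "parent_id": 1, "name": "floor1"},
    {"id": 3, "parent_id": 1, "name": "lobby"},
    {"id": 4, "parent_id": 2, "name": "room1"},
]
tree = bottom_up_from_max_level(rows)
child_names = [c["name"] for c in tree[0]["children"]]
```

In this model child_names contains floor1 but not lobby: the shallow leaf was never part of any iteration, so its parent could not aggregate it.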

Update: a variant that should also handle rogue leaf nodes (leaves located at a level between 1 and the maximum level):

WITH RECURSIVE c AS (
    SELECT *, 0 as lvl
    FROM   customer_area_node
    WHERE  customer_id = 1 AND parent_id IS NULL
  UNION ALL
    SELECT customer_area_node.*, c.lvl + 1
    FROM   customer_area_node 
    JOIN   c ON customer_area_node.parent_id = c.id
),
maxlvl AS (
  SELECT max(lvl) maxlvl FROM c
),
j AS (
    SELECT c.*, json '[]' children
    FROM   c, maxlvl
    WHERE  lvl = maxlvl
  UNION ALL
    SELECT   (c).*, array_to_json(array_agg(j) || array(SELECT r
                                                        FROM   (SELECT l.*, json '[]' children
                                                                FROM   c l, maxlvl
                                                                WHERE  l.parent_id = (c).id
                                                                AND    l.lvl < maxlvl
                                                                AND    NOT EXISTS (SELECT 1
                                                                                   FROM   c lp
                                                                                   WHERE  lp.parent_id = l.id)) r)) children
    FROM     (SELECT c, j
              FROM   c
              JOIN   j ON j.parent_id = c.id) v
    GROUP BY v.c
)
SELECT row_to_json(j) json_tree
FROM   j
WHERE  lvl = 0;

This should also work on PostgreSQL 9.2+, but I cannot test that (I can only test on 9.5+ right now).

These solutions can handle any columns in any hierarchical table, but they will always append an int-typed lvl JSON property to their output.
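
If the extra lvl property is unwanted, it can be dropped after the JSON has been deserialized. A minimal Python sketch, assuming each node is an object with a children array as produced by the queries above; strip_key and the sample document are hypothetical, and the sample leaves out customer_id and description for brevity.

```python
import json


def strip_key(node, key="lvl"):
    """Return a copy of a tree node without `key`, applied recursively to children."""
    cleaned = {k: v for k, v in node.items() if k not in (key, "children")}
    cleaned["children"] = [strip_key(child, key) for child in node.get("children", [])]
    return cleaned


# Hypothetical value of the json_tree column for a two-node tree
json_tree = (
    '{"id": 1, "parent_id": null, "name": "building1", "lvl": 0, "children": ['
    '{"id": 2, "parent_id": 1, "name": "floor1", "lvl": 1, "children": []}]}'
)
tree = strip_key(json.loads(json_tree))
```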

http://rextester.com/YNU7932


Dav*_*lot 5

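Sorry for the very late answer, but I think I have found an elegant solution that could become the accepted answer for this question.

Building on the awesome "little hack" found by @pozs, I came up with a solution that:

  • handles the "rogue leaves" case with very little code (by using the NOT EXISTS predicate)
  • avoids the whole level computation and its conditions
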
WITH RECURSIVE customer_area_tree("id", "customer_id", "parent_id", "name", "description", "children") AS (
  -- tree leaves (no matching children)
  SELECT c.*, json '[]'
  FROM customer_area_node c
  WHERE NOT EXISTS(SELECT * FROM customer_area_node AS hypothetic_child WHERE hypothetic_child.parent_id = c.id)

  UNION ALL

  -- pozs's awesome "little hack"
  SELECT (parent).*, json_agg(child) AS "children"
  FROM (
    SELECT parent, child
    FROM customer_area_tree AS child
    JOIN customer_area_node parent ON parent.id = child.parent_id
  ) branch
  GROUP BY branch.parent
)
SELECT json_agg(t)
FROM customer_area_tree t
LEFT JOIN customer_area_node AS hypothetic_parent ON(hypothetic_parent.id = t.parent_id)
WHERE hypothetic_parent.id IS NULL

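Update:

Tested with very simple data, it does work, but as pozs pointed out in a comment, with his sample data some rogue leaf nodes are forgotten. However, I found that with more complex data the previous answer does not work either, because it only catches rogue leaf nodes that share a common ancestor with a "max level" leaf node (when "1.2.5.8" is not there, "1.2.4" and "1.2.5" are missing, because they have no common ancestor with any "max level" leaf node).

So here is a new proposal that combines pozs's work with mine: the NOT EXISTS subquery is pulled out and turned into an inner UNION, relying on UNION's duplicate elimination. That elimination needs an equality operator, which jsonb has and json does not, hence the jsonb children column: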

WITH RECURSIVE
c_with_level AS (

    SELECT *, 0 as lvl
    FROM   customer_area_node
    WHERE  parent_id IS NULL

    UNION ALL

    SELECT child.*, parent.lvl + 1
    FROM   customer_area_node child
    JOIN   c_with_level parent ON parent.id = child.parent_id
),
maxlvl AS (
  SELECT max(lvl) maxlvl FROM c_with_level
),
c_tree AS (
    SELECT c_with_level.*, jsonb '[]' children
    FROM   c_with_level, maxlvl
    WHERE  lvl = maxlvl

    UNION 
    (
        SELECT (branch_parent).*, jsonb_agg(branch_child)
        FROM (
            SELECT branch_parent, branch_child
            FROM c_with_level branch_parent
            JOIN c_tree branch_child ON branch_child.parent_id = branch_parent.id
        ) branch
        GROUP BY branch.branch_parent

        UNION

        SELECT c.*, jsonb '[]' children
        FROM   c_with_level c
        WHERE  NOT EXISTS (SELECT 1 FROM c_with_level hypothetical_child WHERE hypothetical_child.parent_id = c.id)
    )
)
SELECT jsonb_pretty(row_to_json(c_tree)::jsonb)
FROM c_tree
WHERE lvl = 0;

Tested on http://rextester.com/SMM38494 ;)

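  • For anyone interested, I modified David's second attempt to instead [return a nested JSON object](https://rextester.com/HXKQ64090). In Python, I want to deserialize the JSON into nested dictionaries - why do in code what can be done in SQL ;) I would also like similar access to the data structure in JavaScript. Thanks @DavidGuillot! (2 upvotes)
  • This approach is nice, but it does not cover the case where there are multiple root items and one tree is deeper than the other. It produces duplicates at the root level, each with different children: http://sqlfiddle.com/#!17/022f80/8 (2 upvotes)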
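
Regarding the first comment: once a query returns nested JSON, json.loads in Python yields nested dictionaries directly, and the slash-separated paths from the question can be regenerated by walking them. A minimal sketch, assuming every node carries name and children keys as in the answers above; iter_paths and the sample document are hypothetical.

```python
import json


def iter_paths(node, prefix=""):
    """Yield a slash-separated path for every node in a nested tree, parents first."""
    path = f"{prefix}/{node['name']}" if prefix else node["name"]
    yield path
    # Treat a missing or null children value as "no children".
    for child in node.get("children") or []:
        yield from iter_paths(child, path)


# Hypothetical nested JSON, shaped like the output of the queries above
document = """
{"name": "building1", "children": [
  {"name": "floor1", "children": [
    {"name": "room1", "children": []},
    {"name": "room2", "children": []}]}]}
"""
paths = list(iter_paths(json.loads(document)))
```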