Postgres ltree查询，计算每个树级别上的已连接项

Question

Postgres ltree查询，计算每个树级别上的已连接项

mal*_*lix 5 sql postgresql ltree postgresql-9.3

我有3张桌子：

位置，位置描述，其中包含每个位置等的语言以及1用于商店的语言。

位置描述表也将层次结构保存在ltree路径字段中，如下所示：

city.district1  
city.district1.area1  
city.district1.area2  
...  
city.district2.area1    
city.district2.area2  
...  
city(n).district(n).area(n)

Run Code Online (Sandbox Code Playgroud)

STORE表包含一个外键location_id，以引用其所属的位置。

因此，我想做的是获取带有每个节点的存储数的树，并按此树排序。

例如：

city.district3 (10)   
city.district3.area2 (6)  
city.district3.area1 (4)  
...  
city.district2 (9)    
city.district2.area1 (5)    
city.district2.area3 (3)  
city.district2.area2 (1)    
...
city.districtN (5)  
city.districtN.area2 (3)  
city.districtN.area1 (2)

Run Code Online (Sandbox Code Playgroud)

到目前为止，我所做的只是获取树和计数（存储），但仅针对区域而非区域，并且没有想要的顺序。

SELECT locdesc.title, COUNT(store.store_id) as totalStores, locdesc.path, nlevel(locdesc.path) as lvl
FROM st_location loc
    JOIN st_location_desc locdesc ON locdesc.location_id = loc.location_id
    LEFT JOIN st_store store ON store.location_id = loc.location_id
WHERE path ~ 'london.*{1,2}'
GROUP BY locdesc.path, locdesc.title
ORDER BY path

Run Code Online (Sandbox Code Playgroud)

================================================== =============================== EDIT1：

更新了我的查询，我得到了父母和孩子的总计记录（我相信有一种更有效的方法）。我仍然缺少订单：

SELECT locdesc.title, COUNT(s.store_id) as totalParent, COUNT(store.store_id) as totalChild, locdesc.path, nlevel(locdesc.path) as lvl
FROM st_location loc
JOIN st_location_desc locdesc ON locdesc.location_id = loc.location_id
    LEFT JOIN 
    (
        select store.store_id, loc.parent
        from st_store store
            join st_location loc on loc.location_id = store.location_id
    ) s
    ON s.parent = loc.location_id
LEFT JOIN st_store store on store.location_id = loc.location_id
WHERE path ~ 'london.*{1,2}'
GROUP BY loc.location_id, locdesc.title, locdesc.path
ORDER BY path asc, totalParent desc, totalChild desc

Run Code Online (Sandbox Code Playgroud)

Answer 1

Pie*_*ens 0

如果您正在寻找大量商店，那么我强烈建议您阅读 Jeff Moden 的《类固醇层次结构》第 1 部分和第 2 部分。他比较了将层次结构存储为邻接列表（每个子级都有一个父级外键）或嵌套集的相对优势。邻接列表的优点是插入速度更快，对用户来说更直观；对于可能在层次结构上运行的许多报表来说，嵌套集速度更快。

\n\n

第 1 部分提供了一种高效的面向集合的算法，用于从邻接列表创建嵌套集合层次结构表，我在此处复制了该算法。它在某些方面是 SQL Server 特定的，但应该可以帮助您在 Postgres SQL 中创建等效代码。

\n\n

CREATE PROCEDURE dbo.RebuildNestedSets AS\n/****************************************************************************\n Purpose:\n Rebuilds a "Hierarchy" table that contains the original Adjacency List,\n the Nested Sets version of the same hierarchy, and several other useful \n columns of data some of which need not be included in the final table.\n\n Usage:\n EXEC dbo.RebuildNestedSets\n\n Progammer\'s Notes:\n 1. As currently written, the code reads from a table called dbo.Employee.\n 2. The Employee table must contain well indexed EmployeeID (child) and\n    ManagerID (parent) columns.\n 3. The Employee table must be a "well formed" Adjacency List. That is, the\n    EmployeeID column must be unique and there must be a foreign key on the\n    ManagerID column that points to the EmployeeID column. The table must not\n    contain any "cycles" (an EmployeeID in its own upline). The Root Node\n    must have a NULL for ManagerID.\n 4. The final table, named dbo.Hierarchy, will be created in the same \n    database as where this stored procedure is present.  IT DOES DROP THE \n    TABLE CALLED DBO.HIERARCHY SO BE CAREFUL THAT IT DOESN\'T DROP A TABLE \n    NEAR AND DEAR TO YOUR HEART.\n 5. This code currently has no ROLLBACK capabilities so make sure that you\n    have met all of the requirements (and, perhaps, more) cited in #3 above.\n\n Dependencies:\n 1. This stored procedure requires that the following special purpose HTally\n    table be present in the same database from which it runs.\n\n--===== Create the HTally table to be used for splitting SortPath\n SELECT TOP 1000 --(4 * 1000 = VARBINARY(4000) in length)\n        N = ISNULL(CAST(\n                (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1)*4+1\n            AS INT),0)\n   INTO dbo.HTally\n   FROM master.sys.all_columns ac1\n  CROSS JOIN master.sys.all_columns ac2\n;\n--===== Add the quintessential PK for performance.\n  ALTER TABLE dbo.HTally\n    ADD CONSTRAINT PK_HTally \n        PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100\n;\n\n Revision History:\n Rev 00 - Circa 2009  - Jeff Moden \n        - Initial concept and creation.\n Rev 01 - PASS 2010   - Jeff Moden \n        - Rewritten for presentation at PASS 2010.\n Rev 02 - 06 Oct 2012 - Jeff Moden\n        - Code redacted to include a more efficient, higher performmance\n          method of splitting the SortPath using a custom HTally Table.\n****************************************************************************/\n--===========================================================================\n--      Presets\n--===========================================================================\n--===== Suppress the auto-display of rowcounts to prevent from returning\n     -- false errors if called from a GUI or other application.\n    SET NOCOUNT ON;\n\n--===== Start a duration timer\nDECLARE @StartTime DATETIME,\n        @Duration  CHAR(12);\n SELECT @StartTime = GETDATE();\n\n--===========================================================================\n--      1.  Read ALL the nodes in a given level as indicated by the parent/\n--          child relationship in the Adjacency List.\n--      2.  As we read the nodes in a given level, mark each node with the \n--          current level number.\n--      3.  As we read the nodes in a given level, convert the EmployeeID to\n--          a Binary(4) and concatenate it with the parents in the previous\n--          level\'s binary string of EmployeeID\'s.  This will build the \n--          SortPath.\n--      4.  Number the rows according to the Sort Path.  This will number the\n--          rows in the same order that the push-stack method would number \n--          them.\n--===========================================================================\n--===== Conditionally drop the final table to make reruns easier in SSMS.\n     IF OBJECT_ID(\'FK_Hierarchy_Hierarchy\') IS NOT NULL\n        ALTER TABLE dbo.Hierarchy\n         DROP CONSTRAINT FK_Hierarchy_Hierarchy;\n\n     IF OBJECT_ID(\'dbo.Hierarchy\',\'U\') IS NOT NULL\n         DROP TABLE dbo.Hierarchy;\n\nRAISERROR(\'Building the initial table and SortPath...\',0,1) WITH NOWAIT;\n--===== Build the new table on-the-fly including some place holders\n   WITH cteBuildPath AS \n( --=== This is the "anchor" part of the recursive CTE.\n     -- The only thing it does is load the Root Node.\n SELECT anchor.EmployeeID, \n        anchor.ManagerID, \n        HLevel   = 1,\n        SortPath =  CAST(\n                        CAST(anchor.EmployeeID AS BINARY(4)) \n                    AS VARBINARY(4000)) --Up to 1000 levels deep.\n   FROM dbo.Employee AS anchor\n  WHERE ManagerID IS NULL --Only the Root Node has a NULL ManagerID\n  UNION ALL \n --==== This is the "recursive" part of the CTE that adds 1 for each level\n     -- and concatenates each level of EmployeeID\'s to the SortPath column.  \n SELECT recur.EmployeeID, \n        recur.ManagerID, \n        HLevel   =  cte.HLevel + 1,\n        SortPath =  CAST( --This does the concatenation to build SortPath\n                        cte.SortPath + CAST(Recur.EmployeeID AS BINARY(4))\n                    AS VARBINARY(4000))\n   FROM dbo.Employee      AS recur WITH (TABLOCK)\n  INNER JOIN cteBuildPath AS cte \n          ON cte.EmployeeID = recur.ManagerID\n) --=== This final INSERT/SELECT creates the Node # in the same order as a\n     -- push-stack would. It also creates the final table with some\n     -- "reserved" columns on the fly. We\'ll leave the SortPath column in\n     -- place because we\'re still going to need it later.\n     -- The ISNULLs make NOT NULL columns\n SELECT EmployeeID = ISNULL(sorted.EmployeeID,0),\n        sorted.ManagerID,\n        HLevel     = ISNULL(sorted.HLevel,0),\n        LeftBower  = ISNULL(CAST(0 AS INT),0), --Place holder\n        RightBower = ISNULL(CAST(0 AS INT),0), --Place holder\n        NodeNumber = ROW_NUMBER() OVER (ORDER BY sorted.SortPath),\n        NodeCount  = ISNULL(CAST(0 AS INT),0), --Place holder\n        SortPath   = ISNULL(sorted.SortPath,sorted.SortPath)\n   INTO dbo.Hierarchy\n   FROM cteBuildPath AS sorted\n OPTION (MAXRECURSION 100) --Change this IF necessary\n;\nRAISERROR(\'There are %u rows in dbo.Hierarchy\',0,1,@@ROWCOUNT) WITH NOWAIT;\n\n--===== Display the cumulative duration\n SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);\nRAISERROR(\'Cumulative Duration = %s\',0,1,@Duration) WITH NOWAIT;\n\n--===========================================================================\n--      Using the information created in the table above, create the\n--      NodeCount column and the LeftBower and RightBower columns to create\n--      the Nested Sets hierarchical structure.\n--===========================================================================\nRAISERROR(\'Building the Nested Sets...\',0,1) WITH NOWAIT;\n\n--===== Declare a working variable to hold the result of the calculation\n     -- of the LeftBower so that it may be easily used to create the\n     -- RightBower in a single scan of the final table.\nDECLARE @LeftBower INT\n;\n--===== Create the Nested Sets from the information available in the table\n     -- and in the following CTE. This uses the proprietary form of UPDATE\n     -- available in SQL Serrver for extra performance.\n   WITH cteCountDownlines AS\n( --=== Count each occurance of EmployeeID in the sort path\n SELECT EmployeeID = CAST(SUBSTRING(h.SortPath,t.N,4) AS INT), \n        NodeCount  = COUNT(*) --Includes current node\n   FROM dbo.Hierarchy h, \n        dbo.HTally t\n  WHERE t.N BETWEEN 1 AND DATALENGTH(SortPath)\n  GROUP BY SUBSTRING(h.SortPath,t.N,4)\n) --=== Update the NodeCount and calculate both Bowers\n UPDATE h\n    SET @LeftBower   = LeftBower = 2 * NodeNumber - HLevel,\n        h.NodeCount  = downline.NodeCount,\n        h.RightBower = (downline.NodeCount - 1) * 2 + @LeftBower + 1\n   FROM dbo.Hierarchy h\n   JOIN cteCountDownlines downline\n     ON h.EmployeeID = downline.EmployeeID\n;\nRAISERROR(\'%u rows have been updated to Nested Sets\',0,1,@@ROWCOUNT)\nWITH NOWAIT;\n\nRAISERROR(\'If the rowcounts don\'\'t match, there may be orphans.\'\n,0,1,@@ROWCOUNT)WITH NOWAIT;\n\n--===== Display the cumulative duration\n SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);\nRAISERROR(\'Cumulative Duration = %s\',0,1,@Duration) WITH NOWAIT;\n\n--===========================================================================\n--      Prepare the table for high performance reads by adding indexes.\n--===========================================================================\nRAISERROR(\'Building the indexes...\',0,1) WITH NOWAIT;\n\n--===== Direct support for the Nested Sets\n  ALTER TABLE dbo.Hierarchy \n    ADD CONSTRAINT PK_Hierarchy\n        PRIMARY KEY CLUSTERED (LeftBower, RightBower) WITH FILLFACTOR = 100\n;\n CREATE UNIQUE INDEX AK_Hierarchy \n     ON dbo.Hierarchy (EmployeeID) WITH FILLFACTOR = 100\n;\n  ALTER TABLE dbo.Hierarchy\n    ADD CONSTRAINT FK_Hierarchy_Hierarchy FOREIGN KEY\n        (ManagerID) REFERENCES dbo.Hierarchy (EmployeeID) \n     ON UPDATE NO ACTION \n     ON DELETE NO ACTION\n;\n--===== Display the cumulative duration\n SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);\nRAISERROR(\'Cumulative Duration = %s\',0,1,@Duration) WITH NOWAIT;\n\n--===========================================================================\n--      Exit\n--===========================================================================\nRAISERROR(\'===============================================\',0,1) WITH NOWAIT;\nRAISERROR(\'RUN COMPLETE\',0,1) WITH NOWAIT;\nRAISERROR(\'===============================================\',0,1) WITH NOWAIT;\n

Run Code Online (Sandbox Code Playgroud)\n\n

第 2 部分涵盖了一些预先报告需求，例如不同深度级别的子层次结构的总计：

\n\n

--===== Start a "Timer" to see how long this all takes.\nDECLARE @StartTime DATETIME;\n SELECT @StartTime = GETDATE();\n\n--===========================================================================\n--      1.  Read ALL the nodes in a given level as indicated by the parent/\n--          child relationship in the Adjacency List.\n--      2.  As we read the nodes in a given level, mark each node with the \n--          current level number.\n--      3.  As we read the nodes in a given level, convert the EmployeeID to\n--          a Binary(4) and concatenate it with the parents in the previous\n--          level\xe2\x80\x99s binary string of EmployeeID\xe2\x80\x99s.  This will build the \n--          SortPath.\n--===========================================================================\n--===== Conditionally drop the work table to make reruns easier in SSMS.\n     IF OBJECT_ID(\'dbo.Hierarchy\',\'U\') IS NOT NULL\n         DROP TABLE dbo.Hierarchy;\n\n--===== Build the new table on-the-fly including some place holders\n   WITH cteBuildPath AS \n( --=== This is the "anchor" part of the recursive CTE.\n     -- The only thing it does is load the Root Node.\n SELECT anchor.EmployeeID, \n        anchor.ManagerID, \n        HLevel   = 1,\n        SortPath =  CAST(\n                        CAST(anchor.EmployeeID AS BINARY(4)) \n                    AS VARBINARY(4000)) --Up to 1000 levels deep.\n   FROM dbo.Employee AS anchor\n  WHERE ManagerID IS NULL --Only the Root Node has a NULL ManagerID\n  UNION ALL \n --==== This is the "recursive" part of the CTE that adds 1 for each level\n     -- and concatenates each level of EmployeeID\'s to the SortPath column.  \n SELECT recur.EmployeeID, \n        recur.ManagerID, \n        HLevel   =  cte.HLevel + 1,\n        SortPath =  CAST( --This does the concatenation to build SortPath\n                        cte.SortPath + CAST(Recur.EmployeeID AS BINARY(4))\n                    AS VARBINARY(4000))\n   FROM dbo.Employee      AS recur WITH (TABLOCK)\n  INNER JOIN cteBuildPath AS cte \n          ON cte.EmployeeID = recur.ManagerID\n) --=== This final INSERT/SELECT creates an iterim working table to hold the\n     -- original Adjacency List, the hierarchal level of each node, and the\n     -- SortPath which is the binary representation of each node\'s upline.\n     -- The ISNULLs make NOT NULL columns\n SELECT EmployeeID = ISNULL(sorted.EmployeeID,0),\n        sorted.ManagerID,\n        Sales      = ISNULL(CAST(0 AS BIGINT),0), --Place Holder\n        HLevel     = ISNULL(sorted.HLevel,0),\n        SortPath   = ISNULL(sorted.SortPath,sorted.SortPath)\n   INTO dbo.Hierarchy\n   FROM cteBuildPath AS sorted\n OPTION (MAXRECURSION 100) --Change this IF necessary\n;\n--===== You\'ll be tempted to add the following index because it seems so\n     -- logical a thing to do for performance, but DON\'T do it! It will\n     -- actually slow the rest of the code down by a factor of 2!!!!\n --ALTER TABLE dbo.Hierarchy\n --  ADD CONSTRAINT PK_Hierarchy PRIMARY KEY CLUSTERED (EmployeeID)\n--;\n--===== Populate the Hierarchy table with current Sales data.\n UPDATE h \n    SET h.Sales = s.Sales\n   FROM dbo.Hierarchy h\n  INNER JOIN dbo.CurrentMonthlySales s\n     ON h.EmployeeID = s.EmployeeID\n;\n--===== Conditionally drop the final table to make reruns easier in SSMS.\n     IF OBJECT_ID(\'dbo.PreAggregatedHierarchy,\'U\') IS NOT NULL\n        DROP TABLE dbo.PreAggregatedHierarchy\n;\n--===== Now, build "Everything" into the PreAggregatedHierarchy table.\nWITH\ncteSplit AS\n(--==== Splits the path into elements (including Sales and HLevel) \n     -- so that we can aggregate them by EmployeeID and HLevel.\n     -- Can\'t aggregate here without including the SortPath so we don\'t.\n SELECT EmployeeID = CAST(SUBSTRING(h.SortPath,t.N,4) AS INT),\n        h.HLevel, h.Sales\n   FROM dbo.HTally         AS t\n  CROSS JOIN dbo.Hierarchy AS h\n  WHERE t.N BETWEEN 1 AND DATALENGTH(SortPath)\n),\ncteAggregate AS\n(--==== Creates the aggregates and introduces the "Relative Level" column.\n     -- NodeCount = Count of nodes in downline for each EmployeeID by Level\n     -- Sales = Total Sales in downline for each EmployeeID by Level.\n SELECT EmployeeID,\n        HLevel,\n        RLevel    = ROW_NUMBER() OVER (PARTITION BY EmployeeID \n                                           ORDER BY EmployeeID, HLevel),\n        NodeCount = COUNT(*),\n        Sales     = SUM(CAST(Sales AS MONEY))\n   FROM cteSplit\n  GROUP BY EmployeeID, HLevel\n)\n--===== Adds a "Rollup" to create all the subtotals that we need.\n     -- We couldn\'t do this in the previous step because we didn\'t know what\n     -- the "Relative Level" was for each row, yet.\n     -- The HAVING eliminates unnecessary subtotals that are created.\n SELECT EmployeeID = ISNULL(a.EmployeeID,0), --Convert NULL total lines to 0\n        HLevel     = MIN(a.HLevel), --Just so we don\'t have to GROUP BY\n        RLevel     = ISNULL(CAST(a.RLevel AS TINYINT),0),\n        NodeCount  = SUM(a.NodeCount), --Just so we don\'t have to GROUP BY\n        Sales      = SUM(a.Sales) --Just so we don\'t have to GROUP BY\n   INTO dbo.PreAggregatedHierarchy\n   FROM cteAggregate a\n  GROUP BY EmployeeID, RLevel WITH ROLLUP\n HAVING EmployeeID > 0 --Eliminates the NULL total lines for cleaner output\n;\n--===== Add the Clustered Index as a Primary Key\n  ALTER TABLE dbo.PreAggregatedHierarchy\n    ADD CONSTRAINT PK_PreAggregatedHierarchy \n        PRIMARY KEY CLUSTERED (EmployeeID, RLevel) WITH FILLFACTOR = 100\n;\n--===== Display how long it all took\n  PRINT \'Duration: \' + CONVERT(CHAR(12),GETDATE()-@StartTime,114) + \' (hh:mi:ss:mmm)\';\n

Run Code Online (Sandbox Code Playgroud)\n\n

您计算简单下线计数的特定问题简化为简单的算术计算，Jeff 将嵌套集表中每个节点的Left Bower和Right Bower称为如下：

\n\n

Children = (RightBower - 1 - LeftBower) / 2\n

Run Code Online (Sandbox Code Playgroud)\n

归档时间：	11 年，4 月前
查看次数：	1239 次
最近记录：	8 年，2 月前