注意:此问题已更新以反映我们目前正在使用 MySQL,这样做后,我想看看如果我们切换到支持 CTE 的数据库会更容易。
我有一个带有主键id和外键的自引用表parent_id。
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| parent_id | int(11) | YES | | NULL | |
| name | varchar(255) | YES | | NULL | |
| notes | text | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
Run Code Online (Sandbox Code Playgroud)
给定 a name,如何查询顶级父级?
给定 a name,如何查询id与 的记录相关联的所有name = 'foo'?
上下文:我不是 dba,但打算请 dba 实现这种类型的层次结构,并想测试一些查询。Kattge 等人在 2011 年描述了这样做的动机。
以下是表中 id 之间关系的示例:

-- -----------------------------------------------------
-- Create a new database called 'testdb'
-- -----------------------------------------------------
SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='TRADITIONAL';
CREATE SCHEMA IF NOT EXISTS `testdb` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci ;
USE `testdb` ;
-- -----------------------------------------------------
-- Table `testdb`.`observations`
-- -----------------------------------------------------
CREATE TABLE IF NOT EXISTS `testdb`.`observations` (
`id` INT NOT NULL ,
`parent_id` INT NULL ,
`name` VARCHAR(45) NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB;
SET SQL_MODE=@OLD_SQL_MODE;
SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
-- -----------------------------------------------------
-- Add Example Data Set
-- -----------------------------------------------------
INSERT INTO observations VALUES (1,3), (2,5), (3,NULL), (4,10),
(5,NULL), (6,1), (7,5), (8,10), (9,10), (10,3);
Run Code Online (Sandbox Code Playgroud)
Rol*_*DBA 70
您绝对必须通过 MySQL 存储过程语言编写此脚本
这是一个存储函数,GetParentIDByID用于检索给定要搜索的 ID 的 ParentID
DELIMITER $$
DROP FUNCTION IF EXISTS `junk`.`GetParentIDByID` $$
CREATE FUNCTION `junk`.`GetParentIDByID` (GivenID INT) RETURNS INT
DETERMINISTIC
BEGIN
DECLARE rv INT;
SELECT IFNULL(parent_id,-1) INTO rv FROM
(SELECT parent_id FROM pctable WHERE id = GivenID) A;
RETURN rv;
END $$
DELIMITER ;
Run Code Online (Sandbox Code Playgroud)
这是一个调用的存储函数,GetAncestry用于检索从第一代开始的 ParentID 列表,在给定一个 ID 的层次结构中开始:
DELIMITER $$
DROP FUNCTION IF EXISTS `junk`.`GetAncestry` $$
CREATE FUNCTION `junk`.`GetAncestry` (GivenID INT) RETURNS VARCHAR(1024)
DETERMINISTIC
BEGIN
DECLARE rv VARCHAR(1024);
DECLARE cm CHAR(1);
DECLARE ch INT;
SET rv = '';
SET cm = '';
SET ch = GivenID;
WHILE ch > 0 DO
SELECT IFNULL(parent_id,-1) INTO ch FROM
(SELECT parent_id FROM pctable WHERE id = ch) A;
IF ch > 0 THEN
SET rv = CONCAT(rv,cm,ch);
SET cm = ',';
END IF;
END WHILE;
RETURN rv;
END $$
DELIMITER ;
Run Code Online (Sandbox Code Playgroud)
这是生成示例数据的内容:
USE junk
DROP TABLE IF EXISTS pctable;
CREATE TABLE pctable
(
id INT NOT NULL AUTO_INCREMENT,
parent_id INT,
PRIMARY KEY (id)
) ENGINE=MyISAM;
INSERT INTO pctable (parent_id) VALUES (0);
INSERT INTO pctable (parent_id) SELECT parent_id+1 FROM pctable;
INSERT INTO pctable (parent_id) SELECT parent_id+2 FROM pctable;
INSERT INTO pctable (parent_id) SELECT parent_id+3 FROM pctable;
INSERT INTO pctable (parent_id) SELECT parent_id+4 FROM pctable;
INSERT INTO pctable (parent_id) SELECT parent_id+5 FROM pctable;
SELECT * FROM pctable;
Run Code Online (Sandbox Code Playgroud)
这是它生成的内容:
mysql> USE junk
Database changed
mysql> DROP TABLE IF EXISTS pctable;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE pctable
-> (
-> id INT NOT NULL AUTO_INCREMENT,
-> parent_id INT,
-> PRIMARY KEY (id)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.05 sec)
mysql> INSERT INTO pctable (parent_id) VALUES (0);
Query OK, 1 row affected (0.00 sec)
mysql> INSERT INTO pctable (parent_id) SELECT parent_id+1 FROM pctable;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> INSERT INTO pctable (parent_id) SELECT parent_id+2 FROM pctable;
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> INSERT INTO pctable (parent_id) SELECT parent_id+3 FROM pctable;
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> INSERT INTO pctable (parent_id) SELECT parent_id+4 FROM pctable;
Query OK, 8 rows affected (0.01 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> INSERT INTO pctable (parent_id) SELECT parent_id+5 FROM pctable;
Query OK, 16 rows affected (0.00 sec)
Records: 16 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM pctable;
+----+-----------+
| id | parent_id |
+----+-----------+
| 1 | 0 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
| 5 | 3 |
| 6 | 4 |
| 7 | 5 |
| 8 | 6 |
| 9 | 4 |
| 10 | 5 |
| 11 | 6 |
| 12 | 7 |
| 13 | 7 |
| 14 | 8 |
| 15 | 9 |
| 16 | 10 |
| 17 | 5 |
| 18 | 6 |
| 19 | 7 |
| 20 | 8 |
| 21 | 8 |
| 22 | 9 |
| 23 | 10 |
| 24 | 11 |
| 25 | 9 |
| 26 | 10 |
| 27 | 11 |
| 28 | 12 |
| 29 | 12 |
| 30 | 13 |
| 31 | 14 |
| 32 | 15 |
+----+-----------+
32 rows in set (0.00 sec)
Run Code Online (Sandbox Code Playgroud)
以下是函数为每个值生成的内容:
mysql> SELECT id,GetParentIDByID(id),GetAncestry(id) FROM pctable;
+----+---------------------+-----------------+
| id | GetParentIDByID(id) | GetAncestry(id) |
+----+---------------------+-----------------+
| 1 | 0 | |
| 2 | 1 | 1 |
| 3 | 2 | 2,1 |
| 4 | 3 | 3,2,1 |
| 5 | 3 | 3,2,1 |
| 6 | 4 | 4,3,2,1 |
| 7 | 5 | 5,3,2,1 |
| 8 | 6 | 6,4,3,2,1 |
| 9 | 4 | 4,3,2,1 |
| 10 | 5 | 5,3,2,1 |
| 11 | 6 | 6,4,3,2,1 |
| 12 | 7 | 7,5,3,2,1 |
| 13 | 7 | 7,5,3,2,1 |
| 14 | 8 | 8,6,4,3,2,1 |
| 15 | 9 | 9,4,3,2,1 |
| 16 | 10 | 10,5,3,2,1 |
| 17 | 5 | 5,3,2,1 |
| 18 | 6 | 6,4,3,2,1 |
| 19 | 7 | 7,5,3,2,1 |
| 20 | 8 | 8,6,4,3,2,1 |
| 21 | 8 | 8,6,4,3,2,1 |
| 22 | 9 | 9,4,3,2,1 |
| 23 | 10 | 10,5,3,2,1 |
| 24 | 11 | 11,6,4,3,2,1 |
| 25 | 9 | 9,4,3,2,1 |
| 26 | 10 | 10,5,3,2,1 |
| 27 | 11 | 11,6,4,3,2,1 |
| 28 | 12 | 12,7,5,3,2,1 |
| 29 | 12 | 12,7,5,3,2,1 |
| 30 | 13 | 13,7,5,3,2,1 |
| 31 | 14 | 14,8,6,4,3,2,1 |
| 32 | 15 | 15,9,4,3,2,1 |
+----+---------------------+-----------------+
32 rows in set (0.02 sec)
Run Code Online (Sandbox Code Playgroud)
故事寓意:递归数据检索必须在 MySQL 中编写脚本
这是 GetAncestry 的反面。我称之为 GetFamilyTree。
这是算法:
我相信从我在大学的数据结构和算法课程中,这被称为预序/前缀树遍历之类的东西。
这是代码:
DELIMITER $$
DROP FUNCTION IF EXISTS `junk`.`GetFamilyTree` $$
CREATE FUNCTION `junk`.`GetFamilyTree` (GivenID INT) RETURNS varchar(1024) CHARSET latin1
DETERMINISTIC
BEGIN
DECLARE rv,q,queue,queue_children VARCHAR(1024);
DECLARE queue_length,front_id,pos INT;
SET rv = '';
SET queue = GivenID;
SET queue_length = 1;
WHILE queue_length > 0 DO
SET front_id = FORMAT(queue,0);
IF queue_length = 1 THEN
SET queue = '';
ELSE
SET pos = LOCATE(',',queue) + 1;
SET q = SUBSTR(queue,pos);
SET queue = q;
END IF;
SET queue_length = queue_length - 1;
SELECT IFNULL(qc,'') INTO queue_children
FROM (SELECT GROUP_CONCAT(id) qc
FROM pctable WHERE parent_id = front_id) A;
IF LENGTH(queue_children) = 0 THEN
IF LENGTH(queue) = 0 THEN
SET queue_length = 0;
END IF;
ELSE
IF LENGTH(rv) = 0 THEN
SET rv = queue_children;
ELSE
SET rv = CONCAT(rv,',',queue_children);
END IF;
IF LENGTH(queue) = 0 THEN
SET queue = queue_children;
ELSE
SET queue = CONCAT(queue,',',queue_children);
END IF;
SET queue_length = LENGTH(queue) - LENGTH(REPLACE(queue,',','')) + 1;
END IF;
END WHILE;
RETURN rv;
END $$
Run Code Online (Sandbox Code Playgroud)
这是每行产生的内容
mysql> SELECT id,parent_id,GetParentIDByID(id),GetAncestry(id),GetFamilyTree(id) FROM pctable;
+----+-----------+---------------------+-----------------+--------------------------------------------------------------------------------------+
| id | parent_id | GetParentIDByID(id) | GetAncestry(id) | GetFamilyTree(id) |
+----+-----------+---------------------+-----------------+--------------------------------------------------------------------------------------+
| 1 | 0 | 0 | | 2,3,4,5,6,9,7,10,17,8,11,18,15,22,25,12,13,19,16,23,26,14,20,21,24,27,32,28,29,30,31 |
| 2 | 1 | 1 | 1 | 3,4,5,6,9,7,10,17,8,11,18,15,22,25,12,13,19,16,23,26,14,20,21,24,27,32,28,29,30,31 |
| 3 | 2 | 2 | 2,1 | 4,5,6,9,7,10,17,8,11,18,15,22,25,12,13,19,16,23,26,14,20,21,24,27,32,28,29,30,31 |
| 4 | 3 | 3 | 3,2,1 | 6,9,8,11,18,15,22,25,14,20,21,24,27,32,31 |
| 5 | 3 | 3 | 3,2,1 | 7,10,17,12,13,19,16,23,26,28,29,30 |
| 6 | 4 | 4 | 4,3,2,1 | 8,11,18,14,20,21,24,27,31 |
| 7 | 5 | 5 | 5,3,2,1 | 12,13,19,28,29,30 |
| 8 | 6 | 6 | 6,4,3,2,1 | 14,20,21,31 |
| 9 | 4 | 4 | 4,3,2,1 | 15,22,25,32 |
| 10 | 5 | 5 | 5,3,2,1 | 16,23,26 |
| 11 | 6 | 6 | 6,4,3,2,1 | 24,27 |
| 12 | 7 | 7 | 7,5,3,2,1 | 28,29 |
| 13 | 7 | 7 | 7,5,3,2,1 | 30 |
| 14 | 8 | 8 | 8,6,4,3,2,1 | 31 |
| 15 | 9 | 9 | 9,4,3,2,1 | 32 |
| 16 | 10 | 10 | 10,5,3,2,1 | |
| 17 | 5 | 5 | 5,3,2,1 | |
| 18 | 6 | 6 | 6,4,3,2,1 | |
| 19 | 7 | 7 | 7,5,3,2,1 | |
| 20 | 8 | 8 | 8,6,4,3,2,1 | |
| 21 | 8 | 8 | 8,6,4,3,2,1 | |
| 22 | 9 | 9 | 9,4,3,2,1 | |
| 23 | 10 | 10 | 10,5,3,2,1 | |
| 24 | 11 | 11 | 11,6,4,3,2,1 | |
| 25 | 9 | 9 | 9,4,3,2,1 | |
| 26 | 10 | 10 | 10,5,3,2,1 | |
| 27 | 11 | 11 | 11,6,4,3,2,1 | |
| 28 | 12 | 12 | 12,7,5,3,2,1 | |
| 29 | 12 | 12 | 12,7,5,3,2,1 | |
| 30 | 13 | 13 | 13,7,5,3,2,1 | |
| 31 | 14 | 14 | 14,8,6,4,3,2,1 | |
| 32 | 15 | 15 | 15,9,4,3,2,1 | |
+----+-----------+---------------------+-----------------+--------------------------------------------------------------------------------------+
32 rows in set (0.04 sec)
Run Code Online (Sandbox Code Playgroud)
如果没有循环路径,该算法可以正常工作。如果存在任何循环路径,则必须向表中添加“已访问”列。
添加访问列后,这里是阻止循环关系的算法:
我创建了一个名为观察的新表并填充了您的示例数据。我将存储过程更改为使用观察而不是 pctable。这是您的输出:
mysql> CREATE TABLE observations LIKE pctable;
Query OK, 0 rows affected (0.04 sec)
mysql> INSERT INTO observations VALUES (1,3), (2,5), (3,0), (4,10),(5,0),(6,1),(7,5),(8,10),(9,10),(10,3);
Query OK, 10 rows affected (0.00 sec)
Records: 10 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM observations;
+----+-----------+
| id | parent_id |
+----+-----------+
| 1 | 3 |
| 2 | 5 |
| 3 | 0 |
| 4 | 10 |
| 5 | 0 |
| 6 | 1 |
| 7 | 5 |
| 8 | 10 |
| 9 | 10 |
| 10 | 3 |
+----+-----------+
10 rows in set (0.00 sec)
mysql> SELECT id,parent_id,GetParentIDByID(id),GetAncestry(id),GetFamilyTree(id) FROM observations;
+----+-----------+---------------------+-----------------+-------------------+
| id | parent_id | GetParentIDByID(id) | GetAncestry(id) | GetFamilyTree(id) |
+----+-----------+---------------------+-----------------+-------------------+
| 1 | 3 | 3 | | 6 |
| 2 | 5 | 5 | 5 | |
| 3 | 0 | 0 | | 1,10,6,4,8,9 |
| 4 | 10 | 10 | 10,3 | |
| 5 | 0 | 0 | | 2,7 |
| 6 | 1 | 1 | 1 | |
| 7 | 5 | 5 | 5 | |
| 8 | 10 | 10 | 10,3 | |
| 9 | 10 | 10 | 10,3 | |
| 10 | 3 | 3 | 3 | 4,8,9 |
+----+-----------+---------------------+-----------------+-------------------+
10 rows in set (0.01 sec)
Run Code Online (Sandbox Code Playgroud)
我更改了 GetAncestry 的代码。有WHILE ch > 1它应该是WHILE ch > 0
mysql> SELECT id,parent_id,GetParentIDByID(id),GetAncestry(id),GetFamilyTree(id) FROM observations;
+----+-----------+---------------------+-----------------+-------------------+
| id | parent_id | GetParentIDByID(id) | GetAncestry(id) | GetFamilyTree(id) |
+----+-----------+---------------------+-----------------+-------------------+
| 1 | 3 | 3 | 3 | 6 |
| 2 | 5 | 5 | 5 | |
| 3 | 0 | 0 | | 1,10,6,4,8,9 |
| 4 | 10 | 10 | 10,3 | |
| 5 | 0 | 0 | | 2,7 |
| 6 | 1 | 1 | 1,3 | |
| 7 | 5 | 5 | 5 | |
| 8 | 10 | 10 | 10,3 | |
| 9 | 10 | 10 | 10,3 | |
| 10 | 3 | 3 | 3 | 4,8,9 |
+----+-----------+---------------------+-----------------+-------------------+
10 rows in set (0.01 sec)
Run Code Online (Sandbox Code Playgroud)
现在就试试 !!!
a_h*_*ame 30
获取指定节点的所有父节点:
WITH RECURSIVE tree AS (
SELECT id,
name,
parent_id,
1 as level
FROM the_table
WHERE name = 'foo'
UNION ALL
SELECT p.id,
p.name,
p.parent_id,
t.level + 1
FROM the_table p
JOIN tree t ON t.parent_id = p.id
)
SELECT *
FROM tree
Run Code Online (Sandbox Code Playgroud)
要获取根节点,您可以例如ORDER BY level取第一行
获取指定节点的所有子节点:
WITH RECURSIVE tree AS (
SELECT id,
name,
parent_id,
1 as level
FROM the_table
WHERE name = 'foo'
UNION ALL
SELECT p.id,
p.name,
p.parent_id,
t.level + 1
FROM your_table p
JOIN tree t ON t.id = p.parent_id
)
SELECT *
FROM tree
Run Code Online (Sandbox Code Playgroud)
(注意语句递归部分中连接的交换条件)
据我所知,以下 DBMS 支持递归 CTE:
编辑
根据您的示例数据,以下内容将从表中检索所有子树,包括每个节点的完整路径作为附加列:
with recursive obs_tree as (
select id, parent_id, '/'||cast(id as varchar) as tree
from observations
where parent_id is null
union all
select t.id, t.parent_id, p.tree||'/'||cast(t.id as varchar)
from observations t
join obs_tree p on t.parent_id = p.id
)
select id, parent_id, tree
from obs_tree
order by tree
Run Code Online (Sandbox Code Playgroud)
输出将是这样的:
身份证 | parent_id | 树 ----+-----------+--------- 3 | | /3 1 | 3 | /3/1 6 | 1 | /3/1/6 10 | 3 | /3/10 4 | 10 | /3/10/4 8 | 10 | /3/10/8 9 | 10 | /3/10/9 5 | | /5 2 | 5 | /5/2 7 | 5 | /5/7
小智 8
当给定的 id 大于 4 个整数时,Rolando 的答案中的 GetFamilyTree 函数不起作用,因为 FORMAT MySQL 函数为千位分隔符添加了逗号。我修改了存储函数 GetFamilyTree 以使用大整数 ID,如下所示:
WHILE queue_length > 0 DO
IF queue_length = 1 THEN
SET front_id = queue;
SET queue = '';
ELSE
SET front_id = SUBSTR(queue,1,LOCATE(',',queue)-1);
SET pos = LOCATE(',',queue) + 1;
SET q = SUBSTR(queue,pos);
SET queue = q;
END IF;
Run Code Online (Sandbox Code Playgroud)
front_id 在 if else 循环中定义。