递归CTE如何逐行运行?

jus*_*n w 38 sql recursive-query

我认为我已经将Recursive CTE的格式写得很好,但是仍然发现自己很沮丧,我无法手动处理一个(假装自己是SQL引擎并用笔和纸到达结果集) .我发现了这个,这接近我正在寻找的,但不够详细.我没有问题跟踪C++递归函数并了解它是如何运行的 - 但对于SQL我不明白为什么或如何引擎知道停止.每次调用锚点和递归块,还是在以后的迭代中跳过锚点?(我对此表示怀疑,但我试图表达我对它似乎跳转的方式的困惑.)如果每次调用锚点,锚点在最终结果中不会多次出现?我希望有人能够分解第1行第2行等等.结果集累积时会发生什么以及"在内存中".

我冒昧地从这个页面窃取我的例子,因为它似乎是最容易理解的.

DECLARE @tbl TABLE ( 
      Id INT 
    , [Name] VARCHAR(20) 
    , ParentId INT 
) 

INSERT INTO @tbl( Id, Name, ParentId ) 
VALUES 
     (1, 'Europe', NULL) 
    ,(2, 'Asia',   NULL) 
    ,(3, 'Germany',   1) 
    ,(4, 'UK',        1) 
    ,(5, 'China',     2) 
    ,(6, 'India',     2) 
    ,(7, 'Scotland',  4) 
    ,(8, 'Edinburgh', 7) 
    ,(9, 'Leith',     8)  
; 

WITH abcd 
    AS ( 
        -- anchor 
        SELECT  id, Name, ParentID, 
                CAST(Name AS VARCHAR(1000)) AS Path 
        FROM    @tbl 
        WHERE   ParentId IS NULL 
        UNION ALL 
          --recursive member 
        SELECT  t.id, t.Name, t.ParentID, 
                CAST((a.path + '/' + t.Name) AS VARCHAR(1000)) AS "Path"
        FROM    @tbl AS t 
                JOIN abcd AS a 
                  ON t.ParentId = a.id 
       )
SELECT * FROM abcd 
Run Code Online (Sandbox Code Playgroud)

Qua*_*noi 35

想想CTE一个无穷无尽的递归UNION ALL:

WITH    rows AS
        (
        SELECT  *
        FROM    mytable
        WHERE   anchor_condition
        ),
        rows2 AS
        (
        SELECT  *
        FROM    set_operation(mytable, rows)
        ),
        rows3 AS
        (
        SELECT  *
        FROM    set_operation(mytable, rows2)
        ),
        …
SELECT  *
FROM    rows
UNION ALL
SELECT  *
FROM    rows2
UNION ALL
SELECT  *
FROM    rows3
UNION ALL
…
Run Code Online (Sandbox Code Playgroud)

在你的情况下,那将是:

WITH    abcd1 AS
        ( 
        SELECT  *
        FROM    @tbl t
        WHERE   ParentId IS NULL 
        ),
        abcd2 AS
        ( 
        SELECT  t.*
        FROM    abcd1
        JOIN    @tbl t
        ON      t.ParentID = abcd1.id
        ),
        abcd3 AS
        ( 
        SELECT  t.*
        FROM    abcd2
        JOIN    @tbl t
        ON      t.ParentID = abcd2.id
        ),
        abcd4 AS
        ( 
        SELECT  t.*
        FROM    abcd3
        JOIN    @tbl t
        ON      t.ParentID = abcd3.id
        ),
        abcd5 AS
        ( 
        SELECT  t.*
        FROM    abcd4
        JOIN    @tbl t
        ON      t.ParentID = abcd4.id
        ),
        abcd6 AS
        ( 
        SELECT  t.*
        FROM    abcd5
        JOIN    @tbl t
        ON      t.ParentID = abcd5.id
        )
SELECT  *
FROM    abcd1
UNION ALL
SELECT  *
FROM    abcd2
UNION ALL
SELECT  *
FROM    abcd3
UNION ALL
SELECT  *
FROM    abcd4
UNION ALL
SELECT  *
FROM    abcd5
UNION ALL
SELECT  *
FROM    abcd6
Run Code Online (Sandbox Code Playgroud)

由于abcd6产量没有结果,这意味着停止条件.

从理论上讲,递归CTE可以是无限的,但实际上,它SQL Server试图禁止导致无限记录集的查询.

您可能想阅读这篇文章:


Jus*_*ner 34

CTE使用的算法是:

  1. 执行锚点部分,得到结果r0
  2. 使用r0作为输入执行递归部分,并获得结果r1(非null)
  3. 使用r1作为输入执行递归部分,并获得结果r2(非null)
  4. 使用r3作为输入执行递归部分,并获得结果r3(非null)...
  5. 使用r(n-1)作为输入执行递归部分,并输出rn(null).这个时间rn为null,所以我们使用UNION ALL来组合r0,r1,r2 ... r(n-1),这就是最终的结果

让我们举一个例子:

WITH    cte ( value )
          AS (
               SELECT   1
               UNION ALL
               SELECT   value + 1
               FROM     cte
               WHERE    value < 4
             )
    SELECT  *
    FROM    cte
Run Code Online (Sandbox Code Playgroud)

此查询的结果是:

value
-----------
1
2
3
4

(4 row(s) affected)
Run Code Online (Sandbox Code Playgroud)

让我们一步一步地检查它:

Execute anchor query (SELECT 1), we got:
  r0 = 1
  cte = r0 = 1

    |
    |
    V

Now we execute
SELECT value + 1 FROM cte WHERE value < 4
Since cte is r0 (only has 1), we got:
  r1 = 2
  cte = r1 = 2

    |
    |
    V

Now we execute
SELECT value + 1 FROM cte WHERE value < 4
Since cte is r1 (only has 2), we got:
  r2 = 3
  cte = r2 = 3

    |
    |
    V

Now we execute
SELECT value + 1 FROM cte WHERE value < 4
Since cte is r2 (only has 3), we got:
  r3 = 4
  cte = r3 = 4

    |
    |
    V

Now we execute
SELECT value + 1 FROM cte WHERE value < 4
Since cte is r3 (only has 4), we got:
  r4 = NULL (because r3 (4) is equal to 4, not less than 4)
Now we stop the recursion!

    |
    |
    V

Let's calculate the final result:
R = r0 union all
    r1 union all
    r2 union all
    r3 union all
  = 1 union all
    2 union all
    3 union all
    4 union all
  = 1
    2
    3
    4
Run Code Online (Sandbox Code Playgroud)

  • 这说明它不是保持运行集,而是在每个递归级别获得单独的结果集.所以它不是R1 = 1,那么R2 =(1联合所有R1 + 1)=(1,2),然后R3 =(1联合所有R2)=(1,1 + 1,2 + 1)=(1 ,2,3)等.相反,它只使用前一个结果集查看结果并过滤而不是整个联合.它只会在最后将它们统一起来.换句话说,在仅使用递归成员的输出作为输入,而不是与递归成员联合的锚的输出. (4认同)

wom*_*omp 7

我认为它像这样崩溃:

  1. 执行锚语句.这将为您提供一组结果,称为基本集或T0.

  2. 执行递归语句,使用T0作为执行查询的表.查询CTE时会自动发生这种情况.

  3. 如果递归成员返回一些结果,则会创建一个新集合T1.然后使用T1作为输入再次执行递归成员,如果有任何结果则创建T2.

  4. 继续执行步骤3,直到不再生成结果,或者满足最大递归次数,由MAX_RECURSION选项设置.

这个页面可能最好地解释了它.它有一个CTE执行路径的逐步演练.