Why is the CTE calculation repeated in the query plan, and how can it be optimized without duplicating code?

alp*_*pav 3 sql sql-server performance common-table-expression sql-execution-plan

In the query plan for the query below, the computation of grp_set is repeated 4 times (each distinct sort takes 23%, so together they take 23 * 4 = 92% of all resources):

with
     -- one row per distinct group key
     grp_set as (select distinct old_num,old_tbl,old_db,old_val_num from err_calc)
     -- surrogate id for each group
    ,grp as (select id = row_number() over (order by old_num),* from grp_set)

     -- detail rows tagged with their group id and a sort key
    ,leaf as (select grp.id ,c.* ,sort = convert(varchar(max),old_col) + ' - ' + severity + ' - ' + err
        from grp
        join err_calc c on
                            c.old_num   = grp.old_num
                        and c.old_tbl       = grp.old_tbl
                        and c.old_db        = grp.old_db
                        and c.old_val_num   = grp.old_val_num
    )

    select old_num,old_tbl,old_db,old_val_num,conc.*
        from (select sep=',') sep
        cross join grp
        cross apply (select
             old_col    = stuff((select sep + old_col   from leaf where leaf.id = grp.id order by leaf.sort FOR XML PATH('')),1,len(sep),'')
            ,old_val    = stuff((select sep + old_val   from leaf where leaf.id = grp.id order by leaf.sort FOR XML PATH('')),1,len(sep),'')
            ,severity   = stuff((select sep + severity  from leaf where leaf.id = grp.id order by leaf.sort FOR XML PATH('')),1,len(sep),'')
            ,err        = stuff((select sep + err       from leaf where leaf.id = grp.id order by leaf.sort FOR XML PATH('')),1,len(sep),'')
        ) conc

The table err_calc contains about 350K rows and has only one index, on old_db, old_tbl, new_tbl, severity, err, old_col, new_col, old_val_num, old_val, old_num, new_num.

Because SQL lacks a concatenation aggregate, the purpose of this query is to concatenate 4 string fields for each group.

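If a concatenation aggregate existed (or were implemented with CLR), if order by could be applied to the aggregate's source, and if all grouping fields could be referenced as grouping.*, the equivalent, desired query would be: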

select grouping.*
    ,severity   =conc(sep+severity)
    ,err        =conc(sep+err)
    ,old_col    =conc(sep+old_col)
    ,old_val    =conc(sep+old_val)
    from err_calc
    cross join (select sep=',') sep
    group by old_num,old_tbl,old_db,old_val_num
    order by old_col,severity,err
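
For readers on SQL Server 2017 or later, the built-in STRING_AGG comes close to the hypothetical conc aggregate above. The following is only a sketch under that version assumption; it also assumes the four concatenated columns are character data, and it lists the grouping columns explicitly because grouping.* does not exist. The convert to varchar(max) is there to avoid the 8000-byte result limit that applies to non-max inputs.

```sql
-- Sketch only: requires SQL Server 2017+ (STRING_AGG is not available earlier).
-- Column names come from the question; their data types are assumed to be strings.
select old_num, old_tbl, old_db, old_val_num
    ,severity = string_agg(convert(varchar(max), severity), ',') within group (order by old_col, severity, err)
    ,err      = string_agg(convert(varchar(max), err),      ',') within group (order by old_col, severity, err)
    ,old_col  = string_agg(convert(varchar(max), old_col),  ',') within group (order by old_col, severity, err)
    ,old_val  = string_agg(convert(varchar(max), old_val),  ',') within group (order by old_col, severity, err)
from err_calc
group by old_num, old_tbl, old_db, old_val_num;
```

Note that STRING_AGG skips NULL values, and that group by treats NULL keys as one group, so results can differ from the original query where the data contains NULLs.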

rud*_*hez 5

Because a CTE is used like a subquery, and here it is used multiple times. See "Calling a CTE multiple times in the same query".

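You should rewrite your query to JOIN to your CTE instead of using a CROSS APPLY, and put the string concatenation logic in the SELECT part of your query; the CTE will then be called once.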
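
The answer gives no code, so the following is only one possible way to get a single evaluation, not necessarily the rewrite meant above: store the grouped rows in a temp table once and run the concatenation against that. It is a sketch under assumptions: the names #leaf, ix_leaf_id, g and l are made up for illustration, dense_rank() stands in for the question's distinct plus row_number() plus join, and the column types are taken to be the same as in the question.

```sql
-- Sketch only: #leaf, ix_leaf_id, g and l are hypothetical names.
-- Step 1: scan err_calc once; dense_rank() gives every row of a group the same id.
select id = dense_rank() over (order by old_num, old_tbl, old_db, old_val_num)
      ,old_num, old_tbl, old_db, old_val_num
      ,old_col, old_val, severity, err
      ,sort = convert(varchar(max), old_col) + ' - ' + severity + ' - ' + err
into #leaf
from err_calc;

-- Step 2: let each per-group lookup seek on id instead of rescanning.
create clustered index ix_leaf_id on #leaf (id);

-- Step 3: same concatenation as in the question, now reading the stored rows.
select g.old_num, g.old_tbl, g.old_db, g.old_val_num, conc.*
from (select sep = ',') sep
cross join (select distinct id, old_num, old_tbl, old_db, old_val_num from #leaf) g
cross apply (select
     old_col  = stuff((select sep + old_col  from #leaf l where l.id = g.id order by l.sort for xml path('')), 1, len(sep), '')
    ,old_val  = stuff((select sep + old_val  from #leaf l where l.id = g.id order by l.sort for xml path('')), 1, len(sep), '')
    ,severity = stuff((select sep + severity from #leaf l where l.id = g.id order by l.sort for xml path('')), 1, len(sep), '')
    ,err      = stuff((select sep + err      from #leaf l where l.id = g.id order by l.sort for xml path('')), 1, len(sep), '')
) conc;

drop table #leaf;
```

The four subqueries are still there, but each one now reads a small slice of #leaf by id, while the expensive ranking over all of err_calc runs once when the table is filled. One behavioral difference to check against your data: dense_rank() puts NULL keys in the same group, whereas the original equality join drops the detail rows for groups whose keys contain NULL.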

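  • @alpav: It does optimize it. If materializing the CTE is cheaper in your case, Postgres will do so, and you will see the materialization step in the query plan. (2 upvotes)
  • It is hard to do without test data. Would you mind posting some data on http://sqlfiddle.com/? (2 upvotes)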