Help needed optimizing a query (joins)

imb*_*a22 2 performance sql-server optimization query-performance

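I am working on a fact table source query, and its performance is terrible. Just adding a function to the select clause that converts the date format takes it from 1:00 minute to 6:30 minutes. It only has 7 tables, joined on simple conditions (nothing crazy).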

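Going forward, I need to add more tables to the join list, which will only make performance worse. I need to tune the current query before I start adding them.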

Here is the query:

 SELECT [dbo].[dFK](oew.StartDate) AS StartDate, -- INTEGER DATE!
    [dbo].[dFK](oew.EndDate) AS EndDate,
    [dbo].[dFK](oew.EffectiveDate) AS EffectiveDate
FROM    OpenEnrollmentWindow oew
    INNER JOIN ProductYear py ON oew.OrganizationProductYearID = py.ID
    INNER JOIN Marketplace m ON py.MarketplaceID = m.ID
    INNER JOIN Organization o ON m.OrganizationID = o.ID
    INNER JOIN Consumer c ON c.OrganizationID = o.ID
    LEFT JOIN OpenEnrollmentWindowProduct oewp ON oew.ID = oewp.OrganizationOpenEnrollmentWindowID
    LEFT JOIN OpenEnrollmentWindowProductType oewpt ON oew.ID = oewpt.OrganizationOpenEnrollmentWindowID

Here is the definition of the function:

CREATE FUNCTION [dbo].[dFK]
(@dt as sql_variant)
RETURNS int
AS
BEGIN
    DECLARE @type varchar(128)  
    DECLARE @iDate int
    SET @type = CONVERT(varchar(128), SQL_VARIANT_PROPERTY(@dt, 'BaseType'))
    SET @iDate =
        CASE 
            WHEN @type = 'int' AND @dt >= 19000101 AND @dt <= 20451231 THEN CONVERT(int, @dt)
            WHEN @type = 'int' AND @dt < 19000101 OR @type = 'int' AND @dt > 20451231 THEN 1
            WHEN @dt IS NULL THEN 1
            WHEN (@dt < CAST('1900-01-01 00:00:00.000' AS DATETIME) OR @dt > CAST('2045-12-31 11:59:59.000' AS DATETIME)) AND @type = 'datetime' THEN 1
            WHEN (@dt < CAST('1900-01-01' AS DATE) OR @dt > CAST('2045-12-31' AS DATE)) AND @type = 'date' THEN 1
            ELSE FORMAT(CAST(@dt AS DATETIME2), 'yyyyMMdd')
        END
    RETURN @iDate
END

GO

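This is used as a fact table source. The dates are converted to avoid a reverse lookup against the date dimension. Assume the conversion has to happen on the server side. The query returns about 6 million rows. I realize that is a lot, which is why I am asking here for query optimization advice.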

Pau*_*ite 10

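The problem is the use of a scalar function. Scalar functions are executed once per row per reference, and the current internal implementation makes each call almost as expensive as running a separate query (18 million calls here: 6 million rows, with three function references per row).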
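One way to see the call count for yourself is to look at the function execution statistics after running the source query. This is only a sketch: it assumes SQL Server 2016 or later (where `sys.dm_exec_function_stats` is available) and that it is run in the database containing `dbo.dFK`. The counters are cumulative since the function's plan was cached, so they may include earlier runs.

```sql
-- Assumes SQL Server 2016 or later, where sys.dm_exec_function_stats exists.
-- Run in the database that contains dbo.dFK, after executing the source query.
SELECT
    OBJECT_NAME(fs.object_id, fs.database_id) AS function_name,
    fs.execution_count,      -- cumulative number of calls
    fs.total_worker_time,    -- CPU time, in microseconds
    fs.total_elapsed_time    -- elapsed time, in microseconds
FROM sys.dm_exec_function_stats AS fs
WHERE fs.database_id = DB_ID()
    AND fs.object_id = OBJECT_ID(N'dbo.dFK');
```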

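A quick solution is to convert the function to an inline table-valued function. These are inlined into the query text, much as views are expanded before query optimization. So the first step is to convert the function to: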

CREATE FUNCTION dbo.dFK_InLine
    (@dt as sql_variant)
RETURNS table
AS
RETURN
    SELECT
        ReturnValue =
        CASE 
            WHEN CA.datatype = 'int' 
                AND @dt >= 19000101 AND @dt <= 20451231
                THEN CONVERT(integer, @dt)
            WHEN (CA.datatype = 'int' AND @dt < 19000101) 
                OR (CA.datatype = 'int' AND @dt > 20451231)
                THEN 1
            WHEN @dt IS NULL 
            THEN 1
            WHEN (@dt < CAST('1900-01-01 00:00:00.000' AS DATETIME) 
                OR @dt > CAST('2045-12-31 11:59:59.000' AS DATETIME)) 
                AND CA.datatype = 'datetime' 
            THEN 1
            WHEN (@dt < CAST('1900-01-01' AS DATE) 
                OR @dt > CAST('2045-12-31' AS DATE)) 
                AND CA.datatype = 'date' 
            THEN 1
            ELSE FORMAT(CAST(@dt AS DATETIME2), 'yyyyMMdd')
        END
    FROM
    (
        VALUES
        (
            CONVERT(varchar(128), SQL_VARIANT_PROPERTY(@dt, 'BaseType'))
        )
    ) AS CA (datatype);

Then modify the source query to use it:

SELECT 
    SD.ReturnValue AS StartDate, -- INTEGER DATE!
    ED.ReturnValue AS EndDate,
    EFD.ReturnValue AS EffectiveDate
FROM OpenEnrollmentWindow oew
CROSS APPLY dbo.dFK_InLine(oew.StartDate) AS SD
CROSS APPLY dbo.dFK_InLine(oew.EndDate) AS ED
CROSS APPLY dbo.dFK_InLine(oew.EffectiveDate) AS EFD
INNER JOIN ProductYear py 
    ON oew.OrganizationProductYearID = py.ID
INNER JOIN Marketplace m 
    ON py.MarketplaceID = m.ID
INNER JOIN Organization o 
    ON m.OrganizationID = o.ID
INNER JOIN Consumer c 
    ON c.OrganizationID = o.ID
LEFT JOIN OpenEnrollmentWindowProduct oewp 
    ON oew.ID = oewp.OrganizationOpenEnrollmentWindowID
LEFT JOIN OpenEnrollmentWindowProductType oewpt 
    ON oew.ID = oewpt.OrganizationOpenEnrollmentWindowID;

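That said, this is still a very... unusual strategy, especially the use of sql_variant and the CASE logic. You would probably get better value from refactoring the design to use strong typing and a more conventional model.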
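As a rough illustration of what a strongly typed version might look like, here is a minimal sketch. The name `dbo.dFK_Date` is hypothetical, and it assumes the source columns are `date` or `datetime` (a `datetime` argument is implicitly converted to `date`, dropping the time portion). It is not a behavior-identical replacement for the original: integer inputs are not handled, and the upper bound is simply the end of 2045-12-31.

```sql
-- Hypothetical strongly typed variant; dbo.dFK_Date is an illustrative name.
-- Assumes the source columns are date or datetime (implicitly converted to date).
CREATE FUNCTION dbo.dFK_Date
    (@dt AS date)
RETURNS table
AS
RETURN
    SELECT
        ReturnValue =
        CASE
            -- NULL and out-of-range dates map to key 1, as in the original
            WHEN @dt IS NULL
                OR @dt < CONVERT(date, '19000101')
                OR @dt > CONVERT(date, '20451231')
                THEN 1
            -- Style 112 yields 'yyyymmdd', which converts cleanly to int
            ELSE CONVERT(integer, CONVERT(char(8), @dt, 112))
        END;
```

It would be used the same way as the inline function above, for example `CROSS APPLY dbo.dFK_Date(oew.StartDate) AS SD`. Using `CONVERT` with style 112 instead of `FORMAT` also avoids a function that tends to be comparatively slow over millions of rows.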