Zan*_*ane 3 sql-server-2008 sql-server
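I have a query that populates an aggregate table for reporting. The query was written by another developer at my company, but it is my job to make it run fast. So far all of my earlier attempts have failed. I've tried several things with this query, and this is where I currently stand. I've cut about half an hour off the load time, but I'm stuck, and I think I may just need to redo the whole thing. I'm hoping someone here can see what I'm missing and give me some pointers on how to fix this query.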
SELECT P.CompanyID,
P.CompanyName,
P.StoreID,
P.StoreName,
P.ReportDate,
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimBusinessDateID = P.DimBusinessDateID
AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash,
SR.CashDeposit AS StoreResp,
SN.StoreNet,
P.DimEmployeeID AS EmpID,
P.EmpName,
P.RegisterID,
P.PullNumber,
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimRegisterID = P.DimRegisterID
AND FT.TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
AND FT.ModStatusFlg <> 'D'), 0) AS PullCash,
P.PullResp + Isnull((SELECT Sum(SkimAmount)
FROM FactSkims
WHERE DimStoreID = P.DimStoreID
AND DimRegisterID = P.DimRegisterID
AND SkimDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime), 0) AS PullResp,
Isnull((SELECT Sum(NetSales) AS PullNet
FROM FactSalesTransaction AS FT
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE DimStoreID = P.DimStoreID
AND DimRegisterID = P.DimRegisterID
AND TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
AND ModStatusFlg <> 'D'), 0) AS PullNet
FROM (SELECT C.CompanyID,
C.CompanyName,
S.StoreID,
S.StoreName,
F.DimEmployeeID,
E.FirstName + ' ' + E.LastName AS EmpName,
CASE
WHEN F.PullDrawerStartTime <> '1900-01-01' THEN F.PullDrawerStartTime
ELSE Isnull(Cast((SELECT TOP 1 Dateadd(SECOND, 1, PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID
ORDER BY PullDrawerEndTime DESC) AS DATETIME), BD.CalendarDate + Isnull(Cast(Cast(ST.SiteSettingValue AS TIME) AS DATETIME), Cast('4:00:00 AM' AS DATETIME)))
END AS PullDrawerStartTime,
F.PullDrawerEndTime,
BD.CalendarDate AS ReportDate,
R.RegisterID,
R.DimRegisterID,
(SELECT Count(PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID) + 1 AS PullNumber,
Isnull(F.Amount, 0) AS PullResp,
F.DimStoreID,
F.DimBusinessDateID
FROM FactPullDrawer AS F
INNER JOIN DimCompany AS C
ON C.DimCompanyID = F.DimCompanyID
INNER JOIN DimStore AS S
ON S.DimStoreID = F.DimStoreID
INNER JOIN DimCalendar AS BD
ON BD.DimCalendarID = F.DimBusinessDateID
AND BD.CalendarDate >= '12/4/2012'
INNER JOIN DimEmployee AS E
ON F.DimEmployeeID = E.DimEmployeeID
INNER JOIN DimRegister AS R
ON R.DimRegisterID = F.DimRegisterID
LEFT JOIN DimSiteSettings AS ST
ON S.StoreID = ST.StoreID
AND C.CompanyID = ST.CompanyID
AND ST.SiteSettingFieldID = 1412) AS P
INNER JOIN (SELECT DimStoreID,
DimBusinessDateID,
Sum(NetSales) AS StoreNet
FROM FactSalesTransaction
WHERE ModStatusFlg <> 'D'
GROUP BY DimStoreID,
DimBusinessDateID) AS SN
ON SN.DimStoreID = P.DimStoreID
AND SN.DimBusinessDateID = P.DimBusinessDateID
INNER JOIN (SELECT CompanyID,
StoreID,
ReportDate,
Sum(ValTotal) AS CashDeposit
FROM AgtAccountingReport
WHERE ReportCatOrder = 7
AND ReportElementOrder < 100
AND ReportElementOrder NOT IN ( 7, 9, 10, 16,17, 18, 19, 20, 21 )
AND ReportDate >= '10/28/2012'
GROUP BY CompanyID,
StoreID,
ReportDate) AS SR
ON SR.CompanyID = P.CompanyID
AND SR.StoreID = P.StoreID
AND SR.ReportDate = P.ReportDate
I suspect all the nested SELECTs are the problem, which is why I think I should start over from scratch. Any help would be appreciated.
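OK, here is what I actually did to take this query from a run time of about an hour and a half down to under a minute. First, I did some more digging to understand exactly what it was doing. The outermost part of the query contains several major sub-selects. They look like this.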
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimBusinessDateID = P.DimBusinessDateID
AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash
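These select statements join my two largest tables, FactSalesTransaction (105 million records) and FactSalesPayment (102 million records), and they do so in a select within a select. That essentially means the sub-select is executed for every row returned. The query normally runs over about 7 days of data and returns roughly 19,000 records, so the 3 sub-selects against these massive tables each have to execute 19,000 times. Bingo: I think I've found where my performance is going. So I converted these sub-selects into left joins. Nothing complicated is going on in them, so each one only needs to be joined once. The replacement left join looks like this.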
LEFT JOIN (SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS StoreCash, FT.DimStoreID, FT.DimBusinessDateID
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
WHERE FT.ModStatusFlg <> 'D'
GROUP BY FT.DimStoreID, FT.DimBusinessDateID) AS FP
ON FP.DimStoreID = P.DimStoreID
AND FP.DimBusinessDateID = P.DimBusinessDateID
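To see the rewrite in isolation, here is a minimal sketch on two tiny table variables. The tables `@Pull` and `@Sales` and all of their values are made up for illustration; only the column names echo the real schema. It assumes SQL Server 2008 or later (multi-row `VALUES`). Both statements should return the same rows: 75.00 for pulls 1 and 2, and 0.00 for pull 3.

```sql
-- Hypothetical miniature tables; names and values are illustrative only.
DECLARE @Pull  TABLE (PullID INT, DimStoreID INT, DimBusinessDateID INT);
DECLARE @Sales TABLE (DimStoreID INT, DimBusinessDateID INT,
                      GrossSales MONEY, ModStatusFlg CHAR(1));

INSERT INTO @Pull VALUES (1, 10, 100), (2, 10, 100), (3, 20, 100);
INSERT INTO @Sales VALUES
    (10, 100, 50.00, 'A'),
    (10, 100, 25.00, 'A'),
    (10, 100, 99.00, 'D'),  -- flagged as deleted, excluded by the filter
    (30, 100, 10.00, 'A');  -- store with no pull rows

-- Form 1: correlated scalar sub-select, logically evaluated once per outer row.
SELECT P.PullID,
       ISNULL((SELECT SUM(S.GrossSales)
               FROM @Sales AS S
               WHERE S.DimStoreID = P.DimStoreID
                 AND S.DimBusinessDateID = P.DimBusinessDateID
                 AND S.ModStatusFlg <> 'D'), 0) AS StoreCash
FROM @Pull AS P
ORDER BY P.PullID;

-- Form 2: aggregate once per (store, business date), then LEFT JOIN the result.
SELECT P.PullID,
       ISNULL(A.StoreCash, 0) AS StoreCash
FROM @Pull AS P
LEFT JOIN (SELECT DimStoreID,
                  DimBusinessDateID,
                  SUM(GrossSales) AS StoreCash
           FROM @Sales
           WHERE ModStatusFlg <> 'D'
           GROUP BY DimStoreID, DimBusinessDateID) AS A
       ON A.DimStoreID = P.DimStoreID
      AND A.DimBusinessDateID = P.DimBusinessDateID
ORDER BY P.PullID;
```

Two things to keep in mind with this pattern. The optimizer can sometimes flatten a correlated sub-select on its own, so compare the actual execution plans rather than assuming the first form is always slower. And because an unmatched LEFT JOIN yields NULL, the `ISNULL(..., 0)` wrapper has to move to the outer select list if you still want zeros for rows with no sales.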
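As you can see, I didn't change the select itself much; I just changed it so that it doesn't run 19,000 times. The next thing I did was turn the query into a stored procedure that takes a date from the user, or in this case from the ETL process (usually 7 days back), and looks up that day's DimCalendarID in DimCalendar. That way the query filters on an integer key instead of a datetime and has fewer records to join overall. The final query looks like this.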
CREATE PROCEDURE [dbo].[NoneOfYourBusinessWhatINamedIt]
@ETLLoadDate DATETIME
AS
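-- Look up the integer calendar key once so the fact tables can be filtered on it directly.
-- (The original wrapped only this DECLARE in BEGIN ... END, which does not end the procedure
-- or limit the variable's scope in T-SQL, so the wrapper is dropped here.)
DECLARE @DimCal INT = (SELECT DimCalendarID FROM DimCalendar WHERE CalendarDate = @ETLLoadDate);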
SELECT P.CompanyID,
P.CompanyName,
P.StoreID,
P.StoreName,
P.ReportDate,
FP.StoreCash,
SR.CashDeposit AS StoreResp,
SN.StoreNet,
P.DimEmployeeID AS EmpID,
P.EmpName,
P.RegisterID,
P.PullNumber,
ISNULL(SUM(PC.PullCash), 0) AS PullCash,
P.PullResp AS PullResp,
ISNULL(SUM(PN.PullNet),0) AS PullNet
FROM (SELECT C.CompanyID,
C.CompanyName,
S.StoreID,
S.StoreName,
F.DimEmployeeID,
E.FirstName + ' ' + E.LastName AS EmpName,
CASE
WHEN F.PullDrawerStartTime <> '1900-01-01' THEN F.PullDrawerStartTime
ELSE Isnull(Cast((SELECT TOP 1 Dateadd(SECOND, 1, PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID
--AND DimBusinessDateID = @DimCal
ORDER BY PullDrawerEndTime DESC) AS DATETIME), BD.CalendarDate + Isnull(Cast(Cast(ST.SiteSettingValue AS TIME) AS DATETIME), Cast('4:00:00 AM' AS DATETIME)))
END AS PullDrawerStartTime,
F.PullDrawerEndTime,
BD.CalendarDate AS ReportDate,
R.RegisterID,
R.DimRegisterID,
(SELECT Count(PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime < F.PullDrawerEndTime
AND DimStoreID = F.DimStoreID
AND DimRegisterID = F.DimRegisterID
AND DimBusinessDateID = F.DimBusinessDateID) + 1 AS PullNumber,
Isnull(F.Amount, 0) AS PullResp,
F.DimStoreID,
F.DimBusinessDateID
FROM FactPullDrawer AS F
INNER JOIN DimCompany AS C
ON C.DimCompanyID = F.DimCompanyID
INNER JOIN DimStore AS S
ON S.DimStoreID = F.DimStoreID
INNER JOIN DimCalendar AS BD
ON BD.DimCalendarID = F.DimBusinessDateID
INNER JOIN DimEmployee AS E
ON F.DimEmployeeID = E.DimEmployeeID
INNER JOIN DimRegister AS R
ON R.DimRegisterID = F.DimRegisterID
LEFT JOIN DimSiteSettings AS ST
ON S.StoreID = ST.StoreID
AND C.CompanyID = ST.CompanyID
AND ST.SiteSettingFieldID = 1412
WHERE F.DimBusinessDateID = @DimCal) AS P
INNER JOIN (SELECT DimStoreID,
DimBusinessDateID,
Sum(NetSales) AS StoreNet
FROM FactSalesTransaction
WHERE ModStatusFlg <> 'D' AND DimBusinessDateID = @DimCal
GROUP BY DimStoreID,
DimBusinessDateID) AS SN
ON SN.DimStoreID = P.DimStoreID
AND SN.DimBusinessDateID = P.DimBusinessDateID
INNER JOIN (SELECT CompanyID,
StoreID,
ReportDate,
Sum(ValTotal) AS CashDeposit
FROM AgtAccountingReport
WHERE ReportCatOrder = 7
AND ReportElementOrder < 100
AND ReportElementOrder NOT IN ( 7, 9, 10, 16,
17, 18, 19, 20, 21 )
GROUP BY CompanyID,
StoreID,
ReportDate) AS SR
ON SR.CompanyID = P.CompanyID
AND SR.StoreID = P.StoreID
AND SR.ReportDate = P.ReportDate
LEFT JOIN (SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS StoreCash, FT.DimStoreID, FT.DimBusinessDateID
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
AND DimBusinessDateID = @DimCal
GROUP BY TransactionID,
DimStoreID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
WHERE FT.ModStatusFlg <> 'D' AND DimBusinessDateID = @DimCal
GROUP BY FT.DimStoreID, FT.DimBusinessDateID) AS FP
ON FP.DimStoreID = P.DimStoreID
AND FP.DimBusinessDateID = P.DimBusinessDateID
LEFT JOIN(SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullCash, FT.DimStoreID, FT.DimRegisterID, FT.TransactionDateTime
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
AND DimBusinessDateID = @DimCal
GROUP BY TransactionID,
DimStoreID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
WHERE FT.ModStatusFlg <> 'D' AND DimBusinessDateID = @DimCal
GROUP BY FT.DimStoreID, FT.DimRegisterID, FT.TransactionDateTime) AS PC
ON PC.DimStoreID = P.DimStoreID
AND PC.DimRegisterID = P.DimRegisterID
AND PC.TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
LEFT JOIN (SELECT Sum(SkimAmount) as SkimAmount, DimStoreID, DimRegisterID, SkimDateTime
FROM FactSkims
GROUP BY DimStoreID, DimRegisterID, SkimDateTime) AS FS
ON FS.DimStoreID = P.DimStoreID
AND FS.DimRegisterID = P.DimRegisterID
AND FS.SkimDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
LEFT JOIN (SELECT Sum(NetSales) AS PullNet, DimStoreID, DimRegisterID, TransactionDateTime
FROM FactSalesTransaction
WHERE ModStatusFlg <> 'D' AND DimBusinessDateID = @DimCal
GROUP BY DimStoreID, DimRegisterID, TransactionDateTime) AS PN
ON PN.DimStoreID = P.DimStoreID
AND PN.DimRegisterID = P.DimRegisterID
AND PN.TransactionDateTime BETWEEN P.PullDrawerStartTime AND P.PullDrawerEndTime
AND PC.TransactionDateTime = PN.TransactionDateTime
group by P.CompanyID, P.CompanyName, P.StoreID,P.StoreName,P.ReportDate,FP.StoreCash,SR.CashDeposit,SN.StoreNet,P.DimEmployeeID,P.EmpName,P.RegisterID,P.PullNumber,P.PullResp
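After that, I tested the speed increase to see whether the extra index space would be worth it. I added two nonclustered indexes on ModStatusFlg and DimBusinessDateID, one on FactSalesTransaction and one on FactSalesPayment, each including the other columns this query requests. I could probably do more to clean this up and make it run faster, but the remaining gains would be small, and there are bigger fish to fry right now. Long story short: be careful with your sub-select statements.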
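For reference, covering indexes of the kind described above might look roughly like the sketch below. The index names, the `dbo` schema, the key column order, and the `INCLUDE` lists are all assumptions inferred from the columns the final query reads; they are not the exact definitions that were created.

```sql
-- Sketch only: names, key order, and INCLUDE lists are assumptions, not the real definitions.
-- DimBusinessDateID leads because the query filters it with equality (= @DimCal),
-- while ModStatusFlg is filtered with an inequality (<> 'D').
CREATE NONCLUSTERED INDEX IX_FactSalesTransaction_BusinessDate_ModStatus
    ON dbo.FactSalesTransaction (DimBusinessDateID, ModStatusFlg)
    INCLUDE (DimStoreID, DimRegisterID, TransactionID,
             TransactionDateTime, GrossSales, NetSales);

CREATE NONCLUSTERED INDEX IX_FactSalesPayment_BusinessDate_ModStatus
    ON dbo.FactSalesPayment (DimBusinessDateID, ModStatusFlg)
    INCLUDE (TransactionID, DimStoreID, DimPaymentTypeID, PaymentAmount);
```

The included columns let the aggregates be answered from the index alone, at the cost of extra storage and slower writes on tables of this size, which is why it is worth measuring the gain before keeping them.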