Sco*_*ain 5 sql optimization sql-server-2005
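I'm trying to write a query that goes through a table and applies any credits on an account to its oldest balances. I can't see how to do this without a cursor, and I know cursors should be avoided whenever possible, so I'm here asking for help.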
select * into #balances from [IDAT_AR_BALANCES] where amount > 0
select * into #credits from [IDAT_AR_BALANCES] where amount < 0
create index ba_ID on #balances (CLIENT_ID)
create index cr_ID on #credits (CLIENT_ID)
declare credit_cursor cursor for
select [CLIENT_ID], amount, cvtGUID from #credits
open credit_cursor
declare @client_id varchar(11)
declare @credit money
declare @balance money
declare @cvtGuidBalance uniqueidentifier
declare @cvtGuidCredit uniqueidentifier
fetch next from credit_cursor into @client_id, @credit, @cvtGuidCredit
while @@fetch_status = 0
begin
    while(@credit < 0 and (select count(*) from #balances where @client_id = CLIENT_ID and amount <> 0) > 0)
    begin
        select top 1 @balance = amount, @cvtGuidBalance = cvtGuid from #balances where @client_id = CLIENT_ID and amount <> 0 order by AGING_DATE
        set @credit = @balance + @credit
        if(@credit > 0)
        begin
            update #balances set amount = @credit where cvtGuid = @cvtGuidBalance
            set @credit = 0
        end
        else
        begin
            update #balances set amount = 0 where cvtGuid = @cvtGuidBalance
        end
    end
    update #credits set amount = @credit where cvtGuid = @cvtGuidCredit
    fetch next from credit_cursor into @client_id, @credit, @cvtGuidCredit
end
close credit_cursor
deallocate credit_cursor
delete #balances where AMOUNT = 0
delete #credits where AMOUNT = 0
truncate table [IDAT_AR_BALANCES]
insert [IDAT_AR_BALANCES] select * from #balances
insert [IDAT_AR_BALANCES] select * from #credits
drop table #balances
drop table #credits
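On a test case of 10,000 records and 1,000 clients this takes 26 seconds to run; adding the two indexes on CLIENT_ID brought that down to 14 seconds. That is still too slow for what I need: the final data set could have as many as 10,000 clients and over 4,000,000 records, so the run time could easily reach double-digit minutes.
Any suggestions on how to restructure this to remove the cursor would be greatly appreciated.
Example (updated to show that more than one credit can remain after the run):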
before
cvtGuid  client_id  amount   AGING_DATE
xxxxxx   1           20.00   1/1/2011
xxxxxx   1           30.00   1/2/2011
xxxxxx   1          -10.00   1/3/2011
xxxxxx   1            5.00   1/4/2011
xxxxxx   2           20.00   1/1/2011
xxxxxx   2           15.00   1/2/2011
xxxxxx   2          -40.00   1/3/2011
xxxxxx   2            5.00   1/4/2011
xxxxxx   3           10.00   1/1/2011
xxxxxx   3          -20.00   1/2/2011
xxxxxx   3            5.00   1/3/2011
xxxxxx   3           -8.00   1/4/2011
after
cvtGuid  client_id  amount   AGING_DATE
xxxxxx   1           10.00   1/1/2011
xxxxxx   1           30.00   1/2/2011
xxxxxx   1            5.00   1/4/2011
xxxxxx   3           -5.00   1/2/2011
xxxxxx   3           -8.00   1/4/2011
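So it applies the negative credits to the oldest positive balances (client 1 in the example). If no positive balances remain when it is done, the leftover negatives stay (client 3). If they cancel out completely (which happens about 90% of the time with the real data), the records are removed entirely (client 2).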
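This can be solved with the help of a recursive CTE.
The basic idea is this:
1. Get the totals of positive and negative values separately for every account (client_id).
2. Iterate over every account's rows and "bite off" an amount from one of the two totals, depending on the sign and absolute value of amount (i.e. never "bite off" more than the current value of the corresponding total). The same value is then added to or subtracted from amount.
3. After the update, delete the rows whose amount has become 0.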
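To make step 2 concrete, here is a minimal single-row sketch of the capped "bite". The variable names (@amount, @running_pos, @running_neg, @bite) are illustrative assumptions, not part of the solution below. The values correspond to client 3's second row in the question's sample: a credit of -20 arriving when 15 of the positive total and -18 of the negative total are still left.

```sql
-- Single-row illustration of the capped "bite" (hypothetical variable names).
DECLARE @amount float, @running_pos float, @running_neg float, @bite float;
SELECT @amount = -20.00, @running_pos = 15.00, @running_neg = -18.00;

SET @bite =
    CASE
        -- a credit takes at most what is left of the positive total
        WHEN @amount < 0
            THEN CASE WHEN -@amount < @running_pos THEN -@amount ELSE @running_pos END
        -- a balance takes at most what is left of the negative total
        ELSE CASE WHEN -@amount > @running_neg THEN -@amount ELSE @running_neg END
    END;

SELECT
    new_amount      = @amount + @bite,                                              -- -5
    new_running_pos = CASE WHEN @amount < 0 THEN @running_pos - @bite ELSE @running_pos END,  -- 0
    new_running_neg = CASE WHEN @amount > 0 THEN @running_neg - @bite ELSE @running_neg END;  -- -18
```

The credit can only absorb the 15 that remains of the positive total, so 5 of it is left over, which matches the -5.00 row in the expected result.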
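For my solution I borrowed Lieven's table variable definition (thanks!), adding one column (cvtGuid, declared as int for demonstration purposes) and one row (the last one from the original example, which was missing from Lieven's script).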
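/* preparing the demonstration data */
/* note: the DATE type requires SQL Server 2008 or later;
 * on SQL Server 2005 use DATETIME here and in the CAST further down */
DECLARE @IDAT_AR_BALANCES TABLE (
    cvtGuid int IDENTITY,
    client_id INTEGER,
    amount FLOAT,
    date DATE
);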
INSERT INTO @IDAT_AR_BALANCES
SELECT 1, 20.00, '1/1/2011'
UNION ALL SELECT 1, 30.00, '1/2/2011'
UNION ALL SELECT 1, -10.00, '1/3/2011'
UNION ALL SELECT 1, 5.00, '1/4/2011'
UNION ALL SELECT 2, 20.00, '1/1/2011'
UNION ALL SELECT 2, 15.00, '1/2/2011'
UNION ALL SELECT 2, -40.00, '1/3/2011'
UNION ALL SELECT 2, 5.00, '1/4/2011'
UNION ALL SELECT 3, 10.00, '1/1/2011'
UNION ALL SELECT 3, -20.00, '1/2/2011'
UNION ALL SELECT 3, 5.00, '1/3/2011'
UNION ALL SELECT 3, -8.00, '1/4/2011';
/* checking the original contents */
SELECT * FROM @IDAT_AR_BALANCES;
/* getting on with the job: */
WITH totals AS (
    /* 1) preparing the totals
     * (ISNULL keeps accounts that have only balances or only credits
     *  from getting their amounts set to NULL by the update)
     */
    SELECT
        client_id,
        total_pos = ISNULL(SUM(CASE WHEN amount > 0 THEN amount END), 0),
        total_neg = ISNULL(SUM(CASE WHEN amount < 0 THEN amount END), 0)
    FROM @IDAT_AR_BALANCES
    GROUP BY client_id
),
refined AS (
    /* 2) refining the original data with auxiliary columns:
     *    rownum - row numbers (unique within accounts);
     *    amount_to_discard_pos - the amount needed to cancel `amount` completely if it's negative;
     *    amount_to_discard_neg - the amount needed to cancel `amount` completely if it's positive
     */
    SELECT
        *,
        rownum = ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY date),
        amount_to_discard_pos = CAST(CASE WHEN amount < 0 THEN -amount ELSE 0 END AS float),
        amount_to_discard_neg = CAST(CASE WHEN amount > 0 THEN -amount ELSE 0 END AS float)
    FROM @IDAT_AR_BALANCES
),
prepared AS (
    /* 3) preparing the final table (using a recursive CTE) */
    SELECT
        cvtGuid = CAST(NULL AS int),
        client_id,
        amount = CAST(NULL AS float),
        date = CAST(NULL AS date),
        amount_update = CAST(NULL AS float),
        running_balance_pos = total_pos,
        running_balance_neg = total_neg,
        rownum = CAST(0 AS bigint)
    FROM totals
    UNION ALL
    SELECT
        n.cvtGuid,
        n.client_id,
        n.amount,
        n.date,
        amount_update = CAST(
            CASE
                WHEN n.amount_to_discard_pos < p.running_balance_pos
                THEN n.amount_to_discard_pos
                ELSE p.running_balance_pos
            END
            +
            CASE
                WHEN n.amount_to_discard_neg > p.running_balance_neg
                THEN n.amount_to_discard_neg
                ELSE p.running_balance_neg
            END
        AS float),
        running_balance_pos = CAST(p.running_balance_pos -
            CASE
                WHEN n.amount_to_discard_pos < p.running_balance_pos
                THEN n.amount_to_discard_pos
                ELSE p.running_balance_pos
            END
        AS float),
        running_balance_neg = CAST(p.running_balance_neg -
            CASE
                WHEN n.amount_to_discard_neg > p.running_balance_neg
                THEN n.amount_to_discard_neg
                ELSE p.running_balance_neg
            END
        AS float),
        n.rownum
    FROM refined n
        INNER JOIN prepared p ON n.client_id = p.client_id AND n.rownum = p.rownum + 1
)
/* -- some junk that I've forgotten to clean up,
SELECT * -- which you might actually want to use
FROM prepared -- to view the final prepared result set
WHERE rownum > 0 -- before actually running the update
ORDER BY client_id, rownum
*/
/* performing the update */
UPDATE t
SET amount = t.amount + u.amount_update
FROM @IDAT_AR_BALANCES t INNER JOIN prepared u ON t.cvtGuid = u.cvtGuid
OPTION (MAXRECURSION 0);
/* checking the contents after UPDATE */
SELECT * FROM @IDAT_AR_BALANCES;
/* deleting the eliminated amounts */
DELETE FROM @IDAT_AR_BALANCES WHERE amount = 0;
/* checking the contents after DELETE */
SELECT * FROM @IDAT_AR_BALANCES;
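UPDATE
As Lieven has correctly suggested (thanks again!), you can first delete all the rows of the accounts whose amount values add up to 0, and then update the remaining rows. That will improve overall performance since, as you say, most of the data has amounts that add up to 0.
Here is a variation on Lieven's solution for deleting the "zero accounts":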
DELETE FROM @IDAT_AR_BALANCES
WHERE client_id IN (
SELECT client_id
FROM @IDAT_AR_BALANCES
GROUP BY client_id
HAVING SUM(amount) = 0
)
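One caveat when adapting this: the demonstration table declares amount as FLOAT, while the variables in the question are money. With FLOAT, a sum of fractional values may not come out as exactly 0, so the equality test can miss accounts that should cancel. The sketch below is only an illustration under the assumption that amounts have at most four decimal places; the table variable @t and its rows are made up for the demonstration.

```sql
-- Hypothetical data: three FLOAT amounts that cancel on paper.
DECLARE @t TABLE (client_id int, amount float);
INSERT INTO @t
          SELECT 1,  0.1
UNION ALL SELECT 1,  0.2
UNION ALL SELECT 1, -0.3;

-- Comparing the FLOAT sum with 0 may fail to find this account,
-- because of binary rounding in the intermediate sums.
SELECT client_id
FROM @t
GROUP BY client_id
HAVING SUM(amount) = 0;

-- Summing over an exact type avoids the rounding problem.
SELECT client_id
FROM @t
GROUP BY client_id
HAVING SUM(CAST(amount AS money)) = 0;
```

If the real column is already money or decimal, the original HAVING SUM(amount) = 0 test is fine as written.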
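Keep in mind, though, that the DELETE after the update is still needed, because the update may reset some amount values to 0. If I were you, I might consider creating a FOR UPDATE trigger that automatically deletes the rows where amount = 0. Such a solution is not always acceptable, but sometimes it is fine. It depends on what else you might be doing with your data. It may also depend on whether it is solely your project or there are other maintainers as well (who may not like rows disappearing "magically" and unexpectedly).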
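A minimal sketch of such a trigger might look like the following. It assumes the real table is dbo.IDAT_AR_BALANCES (the dbo schema is a guess) and that cvtGUID uniquely identifies a row, as the question's cursor code implies; the trigger name is hypothetical. In T-SQL, FOR UPDATE and AFTER UPDATE mean the same thing.

```sql
-- Hypothetical trigger: removes rows whose amount has just been updated to 0.
-- CREATE TRIGGER must be the first statement in its batch.
CREATE TRIGGER trg_IDAT_AR_BALANCES_delete_zero
ON dbo.IDAT_AR_BALANCES
FOR UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- "inserted" holds the new versions of the rows touched by the UPDATE
    DELETE b
    FROM dbo.IDAT_AR_BALANCES AS b
        INNER JOIN inserted AS i ON i.cvtGUID = b.cvtGUID
    WHERE b.amount = 0;
END;
```

Note that this only fires for UPDATE statements run against the table itself. It would not help with the question's original flow, which works on temp tables and then truncates and re-inserts.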