SQL Server - 多个运行总计

Avn*_*rSo 8 performance sql-server query scalability

我有一个包含事务的基表,我需要创建一个带有运行总计的表。我需要它们是每个帐户,并且每个帐户都有一些运行总计(取决于交易类型),在其中,每个子帐户都有一些运行总计。

我的基表有这些字段(或多或少):

AccountID  |  SubAccountID   |  TransactionType  |  TransactionAmount
Run Code Online (Sandbox Code Playgroud)

考虑到我每个 Account/TransactionType 有大约 4 种类型的运行总计,每个 Account/SubAccount/TransactionType 有 2 个以上的运行总计,我有大约 200 万个账户,每个账户有大约 10 个子账户,我收到了大约 1 万笔交易每分钟(在最大负载下),你会怎么做?

这也是必须通过 SQL 作业异步运行,创建聚合而不是事务本身的一部分。

我在这里使用光标很困难 - 这需要太长时间。我真的很感激任何或多或少相同的建议/文章。

Pau*_*ite 7

异步意味着运行总计不需要始终完全准确,或者您的数据更改模式使得一次性运行总计构建在下一次加载之前将有效且准确。不管怎样,我相信你已经考虑过那部分了,所以我不会在这一点上工作。

高性能、受支持的方法的主要选项是 SQLCLR 函数/过程,或UPDATE基于 Hugo Kornelis 的基于集合的迭代方法。可以在此处找到 SQLCLR 方法(在过程中实现,但相当容易翻译)。

我没能在网上找到 Hugo 的方法,但在优秀的 MVP Deep Dives (Volume 1) 中有详细介绍。说明 Hugo 方法的示例代码(从我在另一个网站上的一篇帖子中复制,您可能没有登录)如下所示:

-- A work table to hold the reformatted data, and
-- ultimately, the results
CREATE  TABLE #Work
    (
    Acct_No         VARCHAR(20) NOT NULL,
    MonthDate       DATETIME NOT NULL,
    MonthRate       DECIMAL(19,12) NOT NULL,
    Amount          DECIMAL(19,12) NOT NULL,
    InterestAmount  DECIMAL(19,12) NOT NULL,
    RunningTotal    DECIMAL(19,12) NOT NULL,
    RowRank         BIGINT NOT NULL
    );

-- Prepare the set-based iteration method
WITH    Accounts
AS      (
        -- Get a list of the account numbers
        SELECT  DISTINCT Acct_No 
        FROM    #Refunds
        ),
        Rates
AS      (
        -- Apply all the accounts to all the rates
        SELECT  A.Acct_No,
                R.[Year],
                R.[Month],
                MonthRate = R.InterestRate / 12
        FROM    #InterestRates R
        CROSS 
        JOIN    Accounts A
        ),
        BaseData
AS      (
        -- The basic data we need to work with
        SELECT  Acct_No = ISNULL(R.Acct_No,''),
                MonthDate = ISNULL(DATEADD(MONTH, R.[Month], DATEADD(YEAR, R.[year] - 1900, 0)), 0),
                R.MonthRate,
                Amount = ISNULL(RF.Amount,0),
                InterestAmount = ISNULL(RF.Amount,0) * R.MonthRate,
                RunningTotal = ISNULL(RF.Amount,0)
        FROM    Rates R
        LEFT
        JOIN    #Refunds RF
                ON  RF.Acct_No = R.Acct_No
                AND RF.[Year] = R.[Year]
                AND RF.[Month] = R.[Month]
        )
-- Basic data plus a rank id, numbering the rows by MonthDate, and resetting to 1 for each new Account
INSERT  #Work
        (Acct_No, MonthDate, MonthRate, Amount, InterestAmount, RunningTotal, RowRank)
SELECT  BD.Acct_No, BD.MonthDate, BD.MonthRate, BD.Amount, BD.InterestAmount, BD.RunningTotal,
        RowRank = RANK() OVER (PARTITION BY BD.Acct_No ORDER BY MonthDate)
FROM    BaseData BD;

-- An index to speed the next stage (different from that used with the Quirky Update method)
CREATE UNIQUE CLUSTERED INDEX nc1 ON #Work (RowRank, Acct_No);

-- Iteration variables
DECLARE @Rank       BIGINT,
        @RowCount   INTEGER;

-- Initialize
SELECT  @Rank = 1,
        @RowCount = 1;

-- This is the iteration bit, processes a rank id per iteration
-- The number of rows processed with each iteration is equal to the number of groups in the data
-- More groups --> greater efficiency
WHILE   (1 = 1)
BEGIN
        SET @Rank = @Rank + 1;

        -- Set-based update with running totals for the current rank id
        UPDATE  This
        SET     InterestAmount = (Previous.RunningTotal + This.Amount) * This.MonthRate,
                RunningTotal = Previous.RunningTotal + This.Amount + (Previous.RunningTotal + This.Amount) * This.MonthRate
        FROM    #Work This
        JOIN    #Work Previous
                ON  Previous.Acct_No = This.Acct_No
                AND Previous.RowRank = @Rank - 1
        WHERE   This.RowRank = @Rank;

        IF  (@@ROWCOUNT = 0) BREAK;
END;

-- Show the results in natural order
SELECT  *
FROM    #Work
ORDER   BY
        Acct_No, RowRank;
Run Code Online (Sandbox Code Playgroud)

在 SQL Server 2012 中,您可以使用窗口函数扩展,例如SUM OVER (ORDER BY).


gbn*_*gbn 5

我不确定你为什么想要异步,但几个索引视图听起来就像这里的票。如果您想要每个组的简单 SUM,即:定义运行总计。

如果你真的想要异步,每秒 160 个新行你的运行总数将永远是过时的。异步意味着没有触发器或索引视图


A-K*_*A-K 5

无论您是使用游标还是三角形连接,计算运行总计都非常缓慢。非规范化非常诱人,将运行总计存储在列中,特别是如果您经常选择它。但是,像往常一样,在进行非规范化时,您需要保证非规范化数据的完整性。幸运的是,您可以通过约束保证运行总计的完整性——只要您的所有约束都是可信的,您的所有运行总计都是正确的。

此外,通过这种方式,您可以轻松确保当前余额(运行总计)永远不会为负 - 通过其他方法执行也可能非常缓慢。以下脚本演示了该技术。

    CREATE TABLE Data.Inventory(InventoryID INT NOT NULL IDENTITY,
      ItemID INT NOT NULL,
      ChangeDate DATETIME NOT NULL,
      ChangeQty INT NOT NULL,
      TotalQty INT NOT NULL,
      PreviousChangeDate DATETIME NULL,
      PreviousTotalQty INT NULL,
      CONSTRAINT PK_Inventory PRIMARY KEY(ItemID, ChangeDate),
      CONSTRAINT UNQ_Inventory UNIQUE(ItemID, ChangeDate, TotalQty),
      CONSTRAINT UNQ_Inventory_Previous_Columns UNIQUE(ItemID, PreviousChangeDate, PreviousTotalQty),
      CONSTRAINT FK_Inventory_Self FOREIGN KEY(ItemID, PreviousChangeDate, PreviousTotalQty)
        REFERENCES Data.Inventory(ItemID, ChangeDate, TotalQty),
      CONSTRAINT CHK_Inventory_Valid_TotalQty CHECK(TotalQty >= 0 AND (TotalQty = COALESCE(PreviousTotalQty, 0) + ChangeQty)),
      CONSTRAINT CHK_Inventory_Valid_Dates_Sequence CHECK(PreviousChangeDate < ChangeDate),
      CONSTRAINT CHK_Inventory_Valid_Previous_Columns CHECK((PreviousChangeDate IS NULL AND PreviousTotalQty IS NULL)
                OR (PreviousChangeDate IS NOT NULL AND PreviousTotalQty IS NOT NULL))
    );
    GO
    -- beginning of inventory for item 1
    INSERT INTO Data.Inventory(ItemID,
      ChangeDate,
      ChangeQty,
      TotalQty,
      PreviousChangeDate,
      PreviousTotalQty)
    VALUES(1, '20090101', 10, 10, NULL, NULL);
    -- cannot begin the inventory for the second time for the same item 1
    INSERT INTO Data.Inventory(ItemID,
      ChangeDate,
      ChangeQty,
      TotalQty,
      PreviousChangeDate,
      PreviousTotalQty)
    VALUES(1, '20090102', 10, 10, NULL, NULL);


Msg 2627, Level 14, State 1, Line 10
Violation of UNIQUE KEY constraint 'UNQ_Inventory_Previous_Columns'. Cannot insert duplicate key in object 'Data.Inventory'.
The statement has been terminated.

-- add more
DECLARE @ChangeQty INT;
SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090103', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

SET @ChangeQty = 3;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090104', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

SET @ChangeQty = -4;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20090105', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

-- try to violate chronological order

SET @ChangeQty = 5;
INSERT INTO Data.Inventory(ItemID,
  ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty)
SELECT TOP 1 ItemID, '20081231', @ChangeQty, TotalQty + @ChangeQty, ChangeDate, TotalQty
  FROM Data.Inventory
  WHERE ItemID = 1
  ORDER BY ChangeDate DESC;

Msg 547, Level 16, State 0, Line 4
The INSERT statement conflicted with the CHECK constraint "CHK_Inventory_Valid_Dates_Sequence". The conflict occurred in database "Test", table "Data.Inventory".
The statement has been terminated.


SELECT ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;

ChangeDate              ChangeQty   TotalQty    PreviousChangeDate      PreviousTotalQty
----------------------- ----------- ----------- ----------------------- -----
2009-01-01 00:00:00.000 10          10          NULL                    NULL
2009-01-03 00:00:00.000 5           15          2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3           18          2009-01-03 00:00:00.000 15
2009-01-05 00:00:00.000 -4          14          2009-01-04 00:00:00.000 18


-- try to change a single row, all updates must fail
UPDATE Data.Inventory SET ChangeQty = ChangeQty + 2 WHERE InventoryID = 3;
UPDATE Data.Inventory SET TotalQty = TotalQty + 2 WHERE InventoryID = 3;
-- try to delete not the last row, all deletes must fail
DELETE FROM Data.Inventory WHERE InventoryID = 1;
DELETE FROM Data.Inventory WHERE InventoryID = 3;

-- the right way to update

DECLARE @IncreaseQty INT;
SET @IncreaseQty = 2;
UPDATE Data.Inventory SET ChangeQty = ChangeQty + CASE WHEN ItemID = 1 AND ChangeDate = '20090103' THEN @IncreaseQty ELSE 0 END,
  TotalQty = TotalQty + @IncreaseQty,
  PreviousTotalQty = PreviousTotalQty + CASE WHEN ItemID = 1 AND ChangeDate = '20090103' THEN 0 ELSE @IncreaseQty END
WHERE ItemID = 1 AND ChangeDate >= '20090103';

SELECT ChangeDate,
  ChangeQty,
  TotalQty,
  PreviousChangeDate,
  PreviousTotalQty
FROM Data.Inventory ORDER BY ChangeDate;

ChangeDate              ChangeQty   TotalQty    PreviousChangeDate      PreviousTotalQty
----------------------- ----------- ----------- ----------------------- ----------------
2009-01-01 00:00:00.000 10          10          NULL                    NULL
2009-01-03 00:00:00.000 7           17          2009-01-01 00:00:00.000 10
2009-01-04 00:00:00.000 3           20          2009-01-03 00:00:00.000 17
2009-01-05 00:00:00.000 -4          16          2009-01-04 00:00:00.000 20
Run Code Online (Sandbox Code Playgroud)

复制自我的博客