Raj*_*Raj 7 sql-server sql-server-2008-r2 locking
I have been tasked with writing an update query for a table that holds more than 850 million rows. Here are the table structures:
Source tables:
CREATE TABLE [dbo].[SourceTable1](
[ProdClassID] [varchar](10) NOT NULL,
[PriceListDate] [varchar](8) NOT NULL,
[PriceListVersion] [smallint] NOT NULL,
[MarketID] [varchar](10) NOT NULL,
[ModelID] [varchar](20) NOT NULL,
[VariantId] [varchar](20) NOT NULL,
[VariantType] [tinyint] NULL,
[Visibility] [tinyint] NULL,
CONSTRAINT [PK_SourceTable1] PRIMARY KEY CLUSTERED
(
[VariantId] ASC,
[ModelID] ASC,
[MarketID] ASC,
[ProdClassID] ASC,
[PriceListDate] ASC,
[PriceListVersion] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90)
)
CREATE TABLE [dbo].[SourceTable2](
[Id] [uniqueidentifier] NOT NULL,
[ProdClassID] [varchar](10) NULL,
[PriceListDate] [varchar](8) NULL,
[PriceListVersion] [smallint] NULL,
[MarketID] [varchar](10) NULL,
[ModelID] [varchar](20) NULL,
CONSTRAINT [PK_SourceTable2] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 91) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
SourceTable1 contains 52 million rows and SourceTable2 contains 400,000 rows.
Here is the TargetTable structure:
CREATE TABLE [dbo].[TargetTable](
[ChassisSpecificationId] [uniqueidentifier] NOT NULL,
[VariantId] [varchar](20) NOT NULL,
[VariantType] [tinyint] NULL,
[Visibility] [tinyint] NULL,
CONSTRAINT [PK_TargetTable] PRIMARY KEY CLUSTERED
(
[ChassisSpecificationId] ASC,
[VariantId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 71) ON [PRIMARY]
) ON [PRIMARY]
The relationships between these tables are as follows:
SourceTable1.VariantID relates to TargetTable.VariantID
SourceTable2.ID relates to TargetTable.ChassisSpecificationId
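The update requirement is as follows:
1. Get the values of VariantType and Visibility from SourceTable1 for each VariantID, taking the row with the maximum value in the PriceListVersion column.
2. Get the value of the ID column from SourceTable2 where the values of ModelID, ProdClassID, PriceListDate and MarketID match those of SourceTable1.
3. Update TargetTable with the values of VariantType and Visibility where ChassisspecificationID matches SourceTable2.ID and VariantID matches SourceTable1.VariantID.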
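The update requirement above can be previewed as a single set-based query before anything is written. This is only a sketch under one assumption that the post does not confirm: "maximum PriceListVersion" is read as the latest version per VariantId within each ModelID / MarketID / ProdClassID / PriceListDate combination. The CTE name LatestVariant and the column alias rn are made up for illustration. It reads all of SourceTable1, so it is better tried on a test copy than on the live system.

```sql
-- Sketch only: lists the (ChassisSpecificationId, VariantId) pairs and the
-- values that would be written to TargetTable. Nothing is updated here.
WITH LatestVariant AS
(
    SELECT V.VariantId,
           V.ModelID,
           V.MarketID,
           V.ProdClassID,
           V.PriceListDate,
           V.VariantType,
           V.Visibility,
           -- rn = 1 marks the row with the highest PriceListVersion
           -- (assumed grouping, see the note above)
           ROW_NUMBER() OVER (
               PARTITION BY V.VariantId, V.ModelID, V.MarketID,
                            V.ProdClassID, V.PriceListDate
               ORDER BY V.PriceListVersion DESC) AS rn
    FROM dbo.SourceTable1 AS V
)
SELECT CS.Id AS ChassisSpecificationId,
       LV.VariantId,
       LV.VariantType,
       LV.Visibility
FROM dbo.SourceTable2 AS CS
INNER JOIN LatestVariant AS LV
    ON  LV.ModelID       = CS.ModelID
    AND LV.ProdClassID   = CS.ProdClassID
    AND LV.PriceListDate = CS.PriceListDate
    AND LV.MarketID      = CS.MarketID
WHERE LV.rn = 1;
```

Unlike a GROUP BY that includes VariantType and Visibility, ROW_NUMBER returns exactly one row per partition, so each variant gets a single, deterministic pair of values.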
The challenge is to run this update on live production with minimal locking. Here is the query I have put together.
-- Check if Temp table already exists and drop if it does
IF OBJECT_ID('tempdb..#CSpec') IS NOT NULL
BEGIN
DROP TABLE #CSpec;
END;
-- Create Temp table to assign sequence numbers
CREATE Table #CSpec
(
RowID int,
ID uniqueidentifier,
PriceListDate VarChar(8),
ProdClassID VarChar(10),
ModelID VarChar(20),
MarketID Varchar(10)
);
-- Populate temp table
INSERT INTO #CSpec
SELECT ROW_NUMBER() OVER (ORDER BY MarketID) RowID,
CS.id,
CS.pricelistdate,
CS.prodclassid,
CS.modelid,
CS.marketid
FROM dbo.SourceTable2 CS
WHERE CS.MarketID IS NOT NULL;
-- Declare variables to hold values used for updates
DECLARE @min int,
@max int,
@ID uniqueidentifier,
@PriceListDate varchar(8),
@ProdClassID varchar(10),
@ModelID varchar(20),
@MarketID varchar(10);
-- Set minimum and maximum values for looping
SET @min = 1;
SET @max = (SELECT MAX(RowID) From #CSpec);
-- Populate other variables in a loop
WHILE @min <= @max
BEGIN
SELECT
@ID = ID,
@PriceListDate = PriceListDate,
@ProdClassID = ProdClassID,
@ModelID = ModelID,
@MarketID = MarketID
FROM #CSpec
WHERE RowID = @min;
-- Use CTE to get relevant values from SourceTable1
;WITH Variant_CTE AS
(
SELECT V.variantid,
V.varianttype,
V.visibility,
MAX(V.PriceListVersion) LatestPriceVersion
FROM SourceTable1 V
WHERE V.ModelID = @ModelID
AND V.ProdClassID = @ProdClassID
AND V.PriceListDate = @PriceListDate
AND V.MarketID = @MarketID
GROUP BY
V.variantid,
V.varianttype,
V.visibility
)
-- Update the TargetTable with the values obtained in the CTE
UPDATE SV
SET SV.VariantType = VC.VariantType,
SV.Visibility = VC.Visibility
FROM dbo.TargetTable SV
INNER JOIN Variant_CTE VC
ON SV.VariantId = VC.VariantId
WHERE SV.ChassisSpecificationId = @ID
AND SV.VariantType IS NULL
AND SV.Visibility IS NULL;
-- Increment the value of loop variable
SET @min = @min+1;
END
-- Clean up
DROP TABLE #CSpec
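With the value of @max hard-coded so that the loop runs 10 iterations, the script takes about 30 seconds. When I raise the limit to 50 iterations, it takes almost 4 minutes to complete. I am worried that 400,000 iterations will run for days in production. That might still be acceptable, though, as long as TargetTable is not locked in a way that blocks users from accessing it.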
All input is welcome.
Thanks, Raj
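To speed things up, you can try
The query plans here should show a lot of scans, because the indexing is poor for the operations you are running.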
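As a rough sketch of what better indexing could look like for the loop in the question: the temp table is probed by RowID on every iteration, and SourceTable1 is filtered on ModelID, ProdClassID, PriceListDate and MarketID, none of which lead its clustered key. The index names below are made up, and the choice of key and included columns is an assumption that should be checked against the actual query plan.

```sql
-- Assumes #CSpec has already been created and populated in this session.
-- RowID comes from ROW_NUMBER(), so it is unique.
CREATE UNIQUE CLUSTERED INDEX IX_CSpec_RowID
    ON #CSpec (RowID);

-- Hypothetical index matching the WHERE clause of the CTE.
-- The clustered key columns (VariantId, PriceListVersion, ...) are carried
-- in every nonclustered index automatically, so only the two remaining
-- columns the CTE reads need to be included.
CREATE NONCLUSTERED INDEX IX_SourceTable1_Lookup
    ON dbo.SourceTable1 (ModelID, ProdClassID, PriceListDate, MarketID)
    INCLUDE (VariantType, Visibility);
```

Building an index on a 52-million-row table blocks writers while it runs. On Enterprise Edition, WITH (ONLINE = ON) can be added to the second statement to reduce that blocking.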
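The target table indexing appears OK.
Another observation: uniqueidentifier and varchar are poor choices for clustered indexes (your PKs here): they are too wide, not increasing, and at the very least carry collation comparison overhead.
Edit, another observation (thanks to @Marian):
Your clustered indexes are wide in general. Every nonclustered index points to the clustered index, which means a huge NC index too.
You can quite likely achieve the same result by reordering the clustered PK.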
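A minimal sketch of what reordering the clustered PK on SourceTable1 might look like, assuming the goal is to put the columns used in the WHERE clause first. The new column order is a guess, not something stated above, and it assumes no foreign keys reference PK_SourceTable1. Dropping and re-adding a clustered primary key rebuilds the whole table twice (once to a heap, once to the new index) and locks it while doing so, so this belongs in a maintenance window rather than in the live update itself.

```sql
-- Drop the existing clustered primary key (the table becomes a heap).
ALTER TABLE dbo.SourceTable1
    DROP CONSTRAINT PK_SourceTable1;

-- Re-create it with the lookup columns leading (assumed order).
ALTER TABLE dbo.SourceTable1
    ADD CONSTRAINT PK_SourceTable1 PRIMARY KEY CLUSTERED
    (
        ModelID ASC,
        ProdClassID ASC,
        PriceListDate ASC,
        MarketID ASC,
        VariantId ASC,
        PriceListVersion ASC
    ) WITH (FILLFACTOR = 90);
```

The same six columns still make up the key, so uniqueness is unchanged. Only the order differs, which lets the filter on ModelID, ProdClassID, PriceListDate and MarketID seek into the clustered index instead of scanning it.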