Why does an insert that groups by the primary key throw a primary key constraint violation error?

Tre*_*vor 9 sql database sql-server

I have an INSERT statement that throws a primary key error, but I don't see how it could be inserting duplicate key values.

First, I create a temp table with a primary key.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; -- Note: I've tried committed and uncommitted; neither materially affects the behavior. See the screenshots below for proof.

IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;

CREATE TABLE #P(idIsbn INT NOT NULL PRIMARY KEY, price SMALLMONEY, priceChangedDate DATETIME);

Then I pull prices from the Price table, grouping by idIsbn, which is the primary key of the temp table.

INSERT  INTO #P(idIsbn, price, priceChangedDate)
SELECT  idIsbn ,
        MIN(lowestPrice) ,
        MIN(priceChangedDate)
FROM Price p
WHERE p.idMarketplace = 3100
GROUP BY p.idIsbn

As I understand it, grouping by idIsbn makes it unique by definition. In the Price table, idIsbn is declared as: [idIsbn] [int] NOT NULL.

But when I run this query I keep getting this error:

Violation of PRIMARY KEY constraint 'PK__#P________AED35F8119E85FC5'. Cannot insert duplicate key in object 'dbo.#P'. The duplicate key value is (1447858).

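Note: I have a lot of questions about the timing. I'll select this statement, press F5, and no error occurs. Then I do it again and it fails. Then I run it again and again, and it succeeds a few times before failing again. I guess what I'm saying is that I can't find any pattern in when it succeeds and when it doesn't.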

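How can I be inserting duplicate rows if (A) I created the table brand new just before the insert and (B) I'm grouping by the column designed to be the primary key?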

For now, I'm working around the problem with IGNORE_DUP_KEY = ON, but I'd really like to know the root cause.
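For readers unfamiliar with that workaround, here is a minimal sketch of how it could be declared, assuming the same #P definition used earlier in this question. With the option on, a duplicate key raises a warning and only the duplicate row is discarded, so it hides the symptom rather than explaining it.

```sql
-- Sketch only: same temp table as above, with the workaround applied.
IF (OBJECT_ID('TEMPDB..#P')) IS NOT NULL DROP TABLE #P;

CREATE TABLE #P
  (
     -- IGNORE_DUP_KEY = ON turns a duplicate-key error into a warning
     -- and skips the offending row instead of failing the whole INSERT.
     idIsbn           INT NOT NULL PRIMARY KEY WITH (IGNORE_DUP_KEY = ON),
     price            SMALLMONEY,
     priceChangedDate DATETIME
  );
```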

Here is exactly what I see in my SSMS window. Nothing more, nothing less:

[screenshot]

@@VERSION is:

Microsoft SQL Server 2008 (SP3) - 10.0.5538.0 (X64) 
    Apr  3 2015 14:50:02 
    Copyright (c) 1988-2008 Microsoft Corporation
    Standard Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

Execution plan: [image]

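Here is what it looks like when it runs fine. Here I'm using READ COMMITTED, but it doesn't matter, because I get the error whether I use read committed or read uncommitted. [screenshot]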

Here is another example of it failing, this time with READ COMMITTED.

[screenshot]

Also:

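  • I get the same error whether I'm populating a temp table or a persistent table.
  • When I add option (maxdop 1) to the end of the insert, it seems to fail every time, although I can't be completely sure of that because I can't run it indefinitely. But it seems to be the case.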

Here is the definition of the Price table. The table has 25M rows. There were 108,529 updates in the past hour.

CREATE TABLE [dbo].[Price](
    [idPrice] [int] IDENTITY(1,1) NOT NULL,
    [idIsbn] [int] NOT NULL,
    [idMarketplace] [int] NOT NULL,
    [lowestPrice] [smallmoney] NULL,
    [offers] [smallint] NULL,
    [priceDate] [smalldatetime] NOT NULL,
    [priceChangedDate] [smalldatetime] NULL,
 CONSTRAINT [pk_Price] PRIMARY KEY CLUSTERED 
(
    [idPrice] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
 CONSTRAINT [uc_idIsbn_idMarketplace] UNIQUE NONCLUSTERED 
(
    [idIsbn] ASC,
    [idMarketplace] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

And two nonclustered indexes:

CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_INC_idIsbn_lowestPrice_priceDate] ON [dbo].[Price]
(
    [idMarketplace] ASC
)
INCLUDE (   [idIsbn],
    [lowestPrice],
    [priceDate]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

CREATE NONCLUSTERED INDEX [IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice] ON [dbo].[Price]
(
    [idMarketplace] ASC,
    [priceChangedDate] ASC
)
INCLUDE (   [idIsbn],
    [lowestPrice]) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

Mar*_*ith 7

You haven't supplied your table structure.

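Here is a repro with some assumed details that causes the problem at read committed. (NB: now that you have supplied the definition, I can see that in your case updates to the priceChangedDate column will move rows around in the IX_Price_idMarketplace_priceChangedDate_INC_idIsbn_lowestPrice index, if that is the one being seeked.)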

Connection 1 (set up the tables)

USE tempdb;

CREATE TABLE Price
  (
     SomeKey          INT PRIMARY KEY CLUSTERED,
     idIsbn           INT IDENTITY UNIQUE,
     idMarketplace    INT DEFAULT 3100,
     lowestPrice      SMALLMONEY DEFAULT $1.23,
     priceChangedDate DATETIME DEFAULT GETDATE()
  );

CREATE NONCLUSTERED INDEX ix
  ON Price(idMarketplace)
  INCLUDE (idIsbn, lowestPrice, priceChangedDate);

INSERT INTO Price
            (SomeKey)
SELECT number
FROM   master..spt_values
WHERE  number BETWEEN 1 AND 2000
       AND type = 'P'; 

Connection 2

Concurrent data modifications that move a row from the beginning of the seeked range (3100,1) to the end (3100,2001) and back again, repeatedly.

USE tempdb;

WHILE 1=1
BEGIN
UPDATE Price SET SomeKey = 2001 WHERE SomeKey = 1
UPDATE Price SET SomeKey = 1 WHERE SomeKey = 2001
END

Connection 3 (do the insert into a temp table with a unique constraint)

USE tempdb;

CREATE TABLE #P
  (
     idIsbn           INT NOT NULL PRIMARY KEY,
     price            SMALLMONEY,
     priceChangedDate DATETIME
  );

WHILE 1 = 1
  BEGIN
      TRUNCATE TABLE #P

      INSERT INTO #P
                  (idIsbn,
                   price,
                   priceChangedDate)
      SELECT idIsbn,
             MIN(lowestPrice),
             MIN(priceChangedDate)
      FROM   Price p
      WHERE  p.idMarketplace = 3100
      GROUP  BY p.idIsbn
  END 

[image]

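The plan has no aggregate because there is a unique constraint on idIsbn (a unique constraint on idIsbn, idMarketplace would also work), so the GROUP BY can be optimized out, as there are no duplicate values.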

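But at read committed isolation level, shared row locks are released as soon as the row is read. So a row can move position and be read a second time by the same seek or scan.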

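The index ix doesn't explicitly include SomeKey as a secondary key column, but because it isn't declared unique, SQL Server silently includes the clustering key behind the scenes, so updating that column value can move rows around within it.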
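One possible way to avoid the double read, sketched here as an assumption rather than something tested against the original system, is to keep the shared locks until the statement finishes so that a row already read cannot be moved ahead of the scan. The HOLDLOCK table hint (equivalent to SERIALIZABLE for that table reference) does this, at the cost of blocking concurrent updates to the rows in the range and possibly escalating to a table lock on a large table.

```sql
-- Sketch only: the same INSERT as in connection 3, with locks on Price
-- held until the end of the statement instead of being released per row.
TRUNCATE TABLE #P;

INSERT INTO #P
            (idIsbn,
             price,
             priceChangedDate)
SELECT idIsbn,
       MIN(lowestPrice),
       MIN(priceChangedDate)
FROM   Price p WITH (HOLDLOCK) -- overrides the session isolation level for this table
WHERE  p.idMarketplace = 3100
GROUP  BY p.idIsbn;
```

Whether the extra blocking is acceptable depends on the update volume against Price, so treat this as a starting point for experimentation, not a recommendation.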

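  • Great answer, this really helped me understand the underlying issue. (I do a lot of queries like this, but almost always against a separate reporting database/warehouse where the data is essentially read-only, so I've never run into it in real life.) (2 upvotes)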