Finding the one matching set among multiple sets

Tvd*_*vdH 7 t-sql sql-server

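I have a table (@t1) that contains many sets, each identified by id. I want to find the set in @t1 that is a perfect match for @t2.

In this example, the desired result is 1.

(Set 1 is a perfect match; set 2 contains three elements while @t2 contains only two; set 3 contains fewer elements than @t2; set 4 contains NULL elements, which are not allowed in @t2; set 5 contains the right number of elements, but one of them is not equal.)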

DECLARE @t1 TABLE (id INT, data INT);
DECLARE @t2 TABLE (data INT PRIMARY KEY);

INSERT INTO @t1 (id, data)
VALUES
(1, 1),
(1, 2),
(2, 1),
(2, 2),
(2, 3),
(3, 1),
(4, NULL),
(4, NULL),
(5, 1),
(5, 3);

INSERT @t2 (data)
VALUES
(1),
(2);

I have a query that seems to do the job, but it looks rather clumsy to me.

WITH t1 AS
(
    SELECT id, data
    FROM @t1
    WHERE data IS NOT NULL
),
t1_count AS
(
    SELECT id, RCount = COUNT(*)
    FROM @t1
    WHERE data IS NOT NULL
    GROUP BY id
)
SELECT t1.id
FROM t1
JOIN t1_count ON t1.id = t1_count.id
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1_count.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);

Edit (following GarethD's comment):

WITH t1 AS
(
    SELECT
        id,
        data,
        RCount = COUNT(*) OVER(PARTITION BY id)
    FROM @t1
    WHERE data IS NOT NULL
)
SELECT t1.id
FROM t1
FULL JOIN @t2 t2 ON t1.data = t2.data
WHERE t1.RCount = (SELECT RCount = COUNT(*) FROM @t2)
GROUP BY t1.id
HAVING COUNT(t1.data) = COUNT(t2.data);

Hei*_*nzi 4

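What you want is called exact relational division. Unfortunately, SQL Server has no native operator for it, but it is a well-documented problem. One possible solution (idea taken from an article by Joe Celko) is to compare counts, similar to what you are already doing: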

SELECT t1.id
  FROM @t1 AS t1 LEFT JOIN @t2 AS t2 ON t1.data = t2.data
 GROUP BY t1.id
HAVING COUNT(t1.data) = (SELECT COUNT(data) FROM @t2)
   AND COUNT(t2.data) = (SELECT COUNT(data) FROM @t2);

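Note that both HAVING comparisons are necessary:

  • the first ensures that t1 has exactly the required number of rows, and
  • the second ensures that those rows contain only values from t2 (otherwise, t2.data would be NULL because of the LEFT JOIN; recall that COUNT(x) counts only the non-NULL values of x).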
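To see why both comparisons matter, it may help to look at the two counts for each id before any filtering. The following is a minimal sketch using the sample data from the question; the column aliases (t1_values, matched, required) and the @required variable are illustrative names, not part of the original answer.

```sql
DECLARE @t1 TABLE (id INT, data INT);
DECLARE @t2 TABLE (data INT PRIMARY KEY);

INSERT INTO @t1 (id, data)
VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3),
       (3, 1), (4, NULL), (4, NULL), (5, 1), (5, 3);

INSERT INTO @t2 (data)
VALUES (1), (2);

-- Number of elements a set must have (hypothetical helper variable).
DECLARE @required INT = (SELECT COUNT(data) FROM @t2);

-- Same join and grouping as the answer, but showing the counts
-- instead of filtering on them.
SELECT t1.id,
       t1_values = COUNT(t1.data),  -- non-NULL values in the set
       matched   = COUNT(t2.data),  -- values that found a partner in @t2
       required  = @required
  FROM @t1 AS t1 LEFT JOIN @t2 AS t2 ON t1.data = t2.data
 GROUP BY t1.id
 ORDER BY t1.id;

-- id  t1_values  matched  required
-- 1   2          2        2         <- both counts equal required
-- 2   3          2        2         <- fails the first comparison
-- 3   1          1        2         <- fails both
-- 4   0          0        2         <- fails both
-- 5   2          1        2         <- fails only the second comparison
```

Set 2 is rejected only by the first comparison and set 5 only by the second, so dropping either one would let a wrong set through. Two assumptions are worth stating: each set in @t1 is assumed to contain no duplicate values, and because COUNT(t1.data) skips NULLs, a set such as {1, 2, NULL} would still pass both checks. If a stray NULL must disqualify a set, comparing COUNT(*) instead of COUNT(t1.data) in the first condition is one option.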