Comparing a SQL table with itself (self-join)

Kyl*_*yle 8 sql join self-join

I'm trying to find duplicate rows based on mixed columns. Here is an example of what I have:

CREATE TABLE Test
(
   id INT PRIMARY KEY,
   test1 varchar(124),
   test2 varchar(124)
)

INSERT INTO TEST ( id, test1, test2 ) VALUES ( 1, 'A', 'B' )
INSERT INTO TEST ( id, test1, test2 ) VALUES ( 2, 'B', 'C' )

Now, if I run this query:

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]

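I expected to get both ids back (1 and 2), but I only get one row back.

My thinking was that it should compare every row against every other row, but I guess that's not correct? To fix this, I changed my query to: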

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2] 
OR [LEFT].[TEST2] = [RIGHT].[TEST1]

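This gives me both rows, but performance degrades extremely quickly as the number of rows grows.

The final solution I came up with, for both performance and results, was to use a UNION: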

SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2] 
UNION
SELECT [LEFT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST2] = [RIGHT].[TEST1]

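But overall, I obviously don't understand why the first query doesn't work, which means I'm probably doing something wrong. Could someone point me in the right direction?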

Aar*_*ght 11

Don't join on an inequality; it looks like your JOIN and WHERE conditions are reversed.

SELECT t1.id
FROM Test t1
INNER JOIN Test t2
ON ((t1.test1 = t2.test2) OR (t1.test2 = t2.test1))
WHERE t1.id <> t2.id

This should work fine.
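As a quick check of the query above, here is a minimal sketch that runs it against the question's two sample rows. It assumes an in-memory SQLite database as a stand-in for the asker's engine (which appears to be SQL Server); the function name `matching_ids` and the `ORDER BY` clause are additions for illustration only.

```python
import sqlite3


def matching_ids():
    """Run the answer's self-join on the question's sample data (SQLite stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Test (
            id INT PRIMARY KEY,
            test1 VARCHAR(124),
            test2 VARCHAR(124)
        );
        INSERT INTO Test (id, test1, test2) VALUES (1, 'A', 'B');
        INSERT INTO Test (id, test1, test2) VALUES (2, 'B', 'C');
    """)
    # Equality conditions in the JOIN, the "not the same row" filter in WHERE.
    # ORDER BY is added here only to make the output order deterministic.
    rows = conn.execute("""
        SELECT t1.id
        FROM Test t1
        INNER JOIN Test t2
            ON ((t1.test1 = t2.test2) OR (t1.test2 = t2.test1))
        WHERE t1.id <> t2.id
        ORDER BY t1.id
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(matching_ids())
```

On larger tables the same id can appear more than once when a row matches several others, so adding `DISTINCT` may be worth considering.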

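  • Actually, now that I think about it, since your schema doesn't seem to have any useful indexes, the query I posted will perform the same as the inequality-join query; no matter what you do, you'll end up with two full clustered index scans, which is awful. You need covering indexes on (test1, test2) and (test2, test1) to get better performance. (2 upvotes)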
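The indexing suggestion in the comment could be sketched roughly as follows. This again assumes SQLite as a stand-in, so the plan it prints will not match SQL Server's, and whether the indexes are actually used depends on the engine and the data. The index names `ix_test_test1_test2` and `ix_test_test2_test1` and the helper names are hypothetical.

```python
import sqlite3

# The asker's UNION query, with short aliases in place of [LEFT] and [RIGHT].
UNION_QUERY = """
    SELECT l.id FROM Test AS l
    INNER JOIN Test AS r ON l.id != r.id
    WHERE l.test1 = r.test2
    UNION
    SELECT l.id FROM Test AS l
    INNER JOIN Test AS r ON l.id != r.id
    WHERE l.test2 = r.test1
"""


def build_indexed_db():
    """Create the sample table plus the two composite indexes (names are hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Test (
            id INT PRIMARY KEY,
            test1 VARCHAR(124),
            test2 VARCHAR(124)
        );
        INSERT INTO Test (id, test1, test2) VALUES (1, 'A', 'B');
        INSERT INTO Test (id, test1, test2) VALUES (2, 'B', 'C');
        CREATE INDEX ix_test_test1_test2 ON Test (test1, test2);
        CREATE INDEX ix_test_test2_test1 ON Test (test2, test1);
    """)
    return conn


def index_names(conn):
    """List the user-created indexes on Test, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'Test' AND name LIKE 'ix%' "
        "ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def union_ids(conn):
    """Run the UNION query and return the ids in ascending order."""
    return sorted(row[0] for row in conn.execute(UNION_QUERY))


conn = build_indexed_db()
print(index_names(conn))
print(union_ids(conn))
# Print whatever plan SQLite chooses; the exact text varies by SQLite version.
for plan_row in conn.execute("EXPLAIN QUERY PLAN " + UNION_QUERY):
    print(plan_row)
```

Each branch of the UNION filters on a single equality, which is the kind of predicate an index on the matching leading column can serve; the OR version gives the planner less to work with.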

Kla*_*sen 5

You only get both ids back if you select both of them:

SELECT [LEFT].[ID], [RIGHT].[ID] 
FROM [TEST] AS [LEFT] 
   INNER JOIN [TEST] AS [RIGHT] 
   ON [LEFT].[ID] != [RIGHT].[ID] 
WHERE [LEFT].[TEST1] = [RIGHT].[TEST2]

The reason you only get one row is that only one row (namely row 2) has a TEST1 value equal to another row's TEST2 value.
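To see this on the sample data, here is a small sketch that runs the two-column query. It assumes an in-memory SQLite database in place of the asker's engine, and it uses the short aliases `l` and `r` instead of `[LEFT]` and `[RIGHT]`; the function name `matching_pairs` is made up for the example.

```python
import sqlite3


def matching_pairs():
    """Return (left id, right id) pairs where left.test1 equals right.test2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Test (
            id INT PRIMARY KEY,
            test1 VARCHAR(124),
            test2 VARCHAR(124)
        );
        INSERT INTO Test (id, test1, test2) VALUES (1, 'A', 'B');
        INSERT INTO Test (id, test1, test2) VALUES (2, 'B', 'C');
    """)
    rows = conn.execute("""
        SELECT l.id, r.id
        FROM Test AS l
        INNER JOIN Test AS r
            ON l.id != r.id
        WHERE l.test1 = r.test2
    """).fetchall()
    conn.close()
    return rows


# Row 2 has test1 = 'B', which matches row 1's test2 = 'B'.
# Row 1 has test1 = 'A', and no row has test2 = 'A', so it never appears on the left.
print(matching_pairs())
```

The match is a single pair, so one result row is expected; both ids are present only because both sides of the join are selected.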