Removing duplicates from a SQL join

Ham*_*han 18 sql join

Here is a hypothetical scenario that is close to my actual problem. Table1:

recid   firstname    lastname   company
1       A             B          AAA
2       D             E          DEF
3       G             H          IJK
4       A             B          ABC

I also have a table2 that looks like this:

recid   firstname    lastname   company
10      A             B          ABC
20      D             E          DEF
30      M             D          DIM
40      A             B          CCC

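Now, if I join the tables on recid, I get 0 results; recid is unique, so there are no duplicates. But if I join on the firstname and lastname columns, which are not unique and contain duplicates, the inner join returns duplicate rows. The more columns I add to the join, the worse it gets (more duplicates are created).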
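To make the multiplication concrete, here is a minimal sketch that loads the sample rows from the setup script further down into an in-memory SQLite database and runs the name-based join. The Python `sqlite3` harness is an assumption made only so the example can be run anywhere, and the columns are named `firstname`/`lastname` as in the tables shown above rather than `first`/`last` as in the script.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                              (3,'M','N','MNO'), (4,'A','B','ABC');
    INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                              (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Joining on the non-unique name columns pairs every matching table1 row
# with every matching table2 row.
join_rows = conn.execute("""
    SELECT b.recid, b.firstname, b.lastname
    FROM table1 a
    INNER JOIN table2 b
        ON a.firstname = b.firstname AND a.lastname = b.lastname
    ORDER BY b.recid
""").fetchall()

for row in join_rows:
    print(row)
print(len(join_rows), "rows")
```

With this data, (A, B) appears twice in each table, so the join yields 2 x 2 = 4 rows for that name plus one row for (D, E), five rows in total.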

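In the simple case above, how do I remove the duplicates from the following query? I want to compare firstname and lastname and, if they match, return firstname, lastname and recid from table2.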

select distinct * from
(select recid, first, last from table1) a
inner join
(select recid, first,last from table2) b
on a.first = b.first

In case anyone wants to play with it in the future, the script is here:

create table table1 (recid int not null primary key, first varchar(20), last varchar(20), company varchar(20))
create table table2 (recid int not null primary key, first varchar(20), last varchar(20), company varchar(20))

insert into table1 values(1,'A','B','ABC')
insert into table1 values(2,'D','E','DEF')
insert into table1 values(3,'M','N','MNO')
insert into table1 values(4,'A','B','ABC')

insert into table2 values(10,'A','B','ABC')
insert into table2 values(20,'D','E','DEF')
insert into table2 values(30,'Q','R','QRS')
insert into table2 values(40,'A','B','ABC')

Cod*_*ian 21

You don't want to do a join per se; you're just testing for existence/set inclusion.

I don't know which flavor of SQL you're currently writing in, but this should work.

SELECT MAX(recid), firstname, lastname 
FROM table2 T2
WHERE EXISTS (SELECT * FROM table1 T1 WHERE T1.firstname = T2.firstname AND T1.lastname = T2.lastname)
GROUP BY lastname, firstname

If you do want to write it as a join, the code stays roughly the same:

SELECT max(t2.recid), t2.firstname, t2.lastname
FROM Table2 T2 
INNER JOIN Table1 T1 
    ON T2.firstname = t1.firstname and t2.lastname = t1.lastname
GROUP BY t2.firstname, t2.lastname 

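Depending on the DBMS, an inner join may be implemented differently from EXISTS (semi-join vs. join), but the optimizer can sometimes figure it out and pick the right operator regardless of how you write it.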
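As a rough check that the two forms agree, here is a minimal sketch that runs both against the question's sample rows in an in-memory SQLite database. The Python `sqlite3` harness and the `firstname`/`lastname` column names are assumptions for illustration. It only compares results and says nothing about which plan a particular DBMS would choose.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                              (3,'M','N','MNO'), (4,'A','B','ABC');
    INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                              (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Semi-join form: keep table2 rows that have a name match in table1,
# then collapse each name to a single row carrying its highest recid.
exists_rows = conn.execute("""
    SELECT MAX(t2.recid), t2.firstname, t2.lastname
    FROM table2 t2
    WHERE EXISTS (SELECT 1 FROM table1 t1
                  WHERE t1.firstname = t2.firstname
                    AND t1.lastname = t2.lastname)
    GROUP BY t2.lastname, t2.firstname
    ORDER BY t2.firstname, t2.lastname
""").fetchall()

# Join form: the join multiplies rows, but GROUP BY collapses them again.
join_rows = conn.execute("""
    SELECT MAX(t2.recid), t2.firstname, t2.lastname
    FROM table2 t2
    INNER JOIN table1 t1
        ON t2.firstname = t1.firstname AND t2.lastname = t1.lastname
    GROUP BY t2.firstname, t2.lastname
    ORDER BY t2.firstname, t2.lastname
""").fetchall()

print(exists_rows)
print(join_rows)
```

Both queries return one row per matching name, (40, 'A', 'B') and (20, 'D', 'E'), because MAX(recid) picks the highest table2 recid within each name group.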


sll*_*sll 5

SELECT t2.recid, t2.first, t2.last 
FROM  table1 t1
INNER JOIN table2 t2 ON t1.first = t2.first AND t1.last = t2.last
GROUP BY t2.recid, t2.first, t2.last
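As a sketch of what this query returns, the snippet below runs it against the question's sample rows in an in-memory SQLite database. The Python `sqlite3` harness is an assumption for illustration, and the columns are named `firstname`/`lastname` here instead of `first`/`last`.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for the real DBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    CREATE TABLE table2 (recid INT NOT NULL PRIMARY KEY, firstname VARCHAR(20),
                         lastname VARCHAR(20), company VARCHAR(20));
    INSERT INTO table1 VALUES (1,'A','B','ABC'), (2,'D','E','DEF'),
                              (3,'M','N','MNO'), (4,'A','B','ABC');
    INSERT INTO table2 VALUES (10,'A','B','ABC'), (20,'D','E','DEF'),
                              (30,'Q','R','QRS'), (40,'A','B','ABC');
""")

# Grouping by every selected column, including the unique t2.recid,
# removes the copies created by the join but keeps each table2 row.
grouped_rows = conn.execute("""
    SELECT t2.recid, t2.firstname, t2.lastname
    FROM table1 t1
    INNER JOIN table2 t2
        ON t1.firstname = t2.firstname AND t1.lastname = t2.lastname
    GROUP BY t2.recid, t2.firstname, t2.lastname
    ORDER BY t2.recid
""").fetchall()

print(grouped_rows)
```

Because recid is part of the grouping key, every matching table2 row comes back exactly once: both recid 10 and recid 40 are returned for (A, B), whereas a query that takes MAX(recid) per name would return only 40.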

Edit: added an image
