How to delete duplicates from a SQL table based on multiple fields

cfr*_*ich 24 mysql sql duplicate-removal

I have a games table, described as follows:

+---------------+-------------+------+-----+---------+----------------+
| Field         | Type        | Null | Key | Default | Extra          |
+---------------+-------------+------+-----+---------+----------------+
| id            | int(11)     | NO   | PRI | NULL    | auto_increment |
| date          | date        | NO   |     | NULL    |                |
| time          | time        | NO   |     | NULL    |                |
| hometeam_id   | int(11)     | NO   | MUL | NULL    |                |
| awayteam_id   | int(11)     | NO   | MUL | NULL    |                |
| locationcity  | varchar(30) | NO   |     | NULL    |                |
| locationstate | varchar(20) | NO   |     | NULL    |                |
+---------------+-------------+------+-----+---------+----------------+

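But every game has a duplicate entry somewhere in the table, because each game appeared in the schedules of both teams. Is there a SQL statement I can use to look through and delete all the duplicates based on identical date, time, hometeam_id, awayteam_id, locationcity, and locationstate fields?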

N W*_*est 45

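You should be able to do a correlated subquery to delete the data. Find all rows that are duplicates and delete all but the one with the smallest id. For MySQL, an inner join (functionally equivalent to EXISTS) needs to be used, like so: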

delete games from games inner join 
    (select  min(id) minid, date, time,
             hometeam_id, awayteam_id, locationcity, locationstate
     from games 
     group by date, time, hometeam_id, 
              awayteam_id, locationcity, locationstate
     having count(1) > 1) as duplicates
   on (duplicates.date = games.date
   and duplicates.time = games.time
   and duplicates.hometeam_id = games.hometeam_id
   and duplicates.awayteam_id = games.awayteam_id
   and duplicates.locationcity = games.locationcity
   and duplicates.locationstate = games.locationstate
   and duplicates.minid <> games.id)

To test, replace "delete games from games" with "select * from games". Don't just run a delete on your DB :-)
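As a rough illustration of that dry run, the sketch below runs the same join as a SELECT against a throwaway in-memory SQLite database from Python. The SQLite stand-in, the simplified column types, and the sample rows are all assumptions made for illustration; only the join itself comes from the answer, and MySQL behavior may differ in details.

```python
import sqlite3

# Throwaway in-memory database (an assumption for illustration, not the real MySQL table).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE games (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        date TEXT NOT NULL,
        time TEXT NOT NULL,
        hometeam_id INTEGER NOT NULL,
        awayteam_id INTEGER NOT NULL,
        locationcity TEXT NOT NULL,
        locationstate TEXT NOT NULL
    )
""")

# Made-up sample rows: ids are assigned 1, 2, 3 in insertion order.
sample_rows = [
    ("2010-09-04", "19:00:00", 1, 2, "Austin", "TX"),  # id 1
    ("2010-09-04", "19:00:00", 1, 2, "Austin", "TX"),  # id 2, duplicate of id 1
    ("2010-09-11", "15:30:00", 3, 1, "Boise", "ID"),   # id 3, no duplicate
]
conn.executemany(
    "INSERT INTO games (date, time, hometeam_id, awayteam_id, locationcity, locationstate)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# Same join as the DELETE, but selecting the rows instead of deleting them.
preview_sql = """
    SELECT games.id
    FROM games INNER JOIN
        (SELECT MIN(id) AS minid, date, time,
                hometeam_id, awayteam_id, locationcity, locationstate
         FROM games
         GROUP BY date, time, hometeam_id,
                  awayteam_id, locationcity, locationstate
         HAVING COUNT(1) > 1) AS duplicates
      ON (duplicates.date = games.date
      AND duplicates.time = games.time
      AND duplicates.hometeam_id = games.hometeam_id
      AND duplicates.awayteam_id = games.awayteam_id
      AND duplicates.locationcity = games.locationcity
      AND duplicates.locationstate = games.locationstate
      AND duplicates.minid <> games.id)
    ORDER BY games.id
"""
would_delete = [row[0] for row in conn.execute(preview_sql)]
print(would_delete)
```

With these sample rows only id 2 is reported, and nothing is removed from the table, which is the point of previewing before deleting.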


Gri*_*yan 13

You can try a query like this:

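-- Note: MySQL may reject a subquery that reads from the table being
-- deleted from (error 1093); there, use a multi-table DELETE ... JOIN instead.
DELETE FROM table_name AS t1
WHERE EXISTS (
 SELECT 1 FROM table_name AS t2
 WHERE t2.date = t1.date
 AND t2.time = t1.time
 AND t2.hometeam_id = t1.hometeam_id
 AND t2.awayteam_id = t1.awayteam_id
 AND t2.locationcity = t1.locationcity
 AND t2.locationstate = t1.locationstate
 -- a matching row with a smaller id exists, so t1 is the duplicate
 AND t2.id < t1.id )

This leaves only one row for each game in the database: the one with the smallest id.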
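To see which rows an EXISTS-style delete keeps, here is a minimal sketch that runs the same idea on an in-memory SQLite database from Python. The SQLite stand-in, the games table name, and the sample rows are assumptions for illustration. The subquery deletes a row only when an identical game with a smaller id exists, so the smallest id in each group survives.

```python
import sqlite3

# Throwaway in-memory database with made-up rows (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE games (id INTEGER PRIMARY KEY, date TEXT, time TEXT,"
    " hometeam_id INTEGER, awayteam_id INTEGER,"
    " locationcity TEXT, locationstate TEXT)"
)
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "2010-09-04", "19:00:00", 1, 2, "Austin", "TX"),
        (2, "2010-09-04", "19:00:00", 1, 2, "Austin", "TX"),  # duplicate of id 1
        (3, "2010-09-11", "15:30:00", 3, 1, "Boise", "ID"),
        (4, "2010-09-11", "15:30:00", 3, 1, "Boise", "ID"),   # duplicate of id 3
        (5, "2010-09-18", "13:00:00", 2, 3, "Tulsa", "OK"),   # no duplicate
    ],
)

# Delete a row when an identical game with a smaller id exists.
# Inside the subquery the table is aliased as t2, so "games." refers
# to the row being considered by the outer DELETE.
conn.execute("""
    DELETE FROM games
    WHERE EXISTS (
        SELECT 1 FROM games AS t2
        WHERE t2.date = games.date
          AND t2.time = games.time
          AND t2.hometeam_id = games.hometeam_id
          AND t2.awayteam_id = games.awayteam_id
          AND t2.locationcity = games.locationcity
          AND t2.locationstate = games.locationstate
          AND t2.id < games.id
    )
""")

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM games ORDER BY id")]
print(remaining_ids)
```

Flipping the last comparison to t2.id > games.id would keep the largest id in each group instead.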


Ali*_*emi 7

The best thing that worked for me was to recreate the table.

CREATE TABLE newtable SELECT * FROM oldtable GROUP BY field1,field2;

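Then you can rename the tables. Note that in MySQL, CREATE TABLE ... SELECT does not copy indexes or the auto_increment attribute, and SELECT * with GROUP BY on non-key columns is generally rejected when ONLY_FULL_GROUP_BY is enabled.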
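The recreate-and-rename approach can be sketched end to end on an in-memory SQLite database from Python. The SQLite stand-in, the sample rows, and the oldtable_backup name are assumptions for illustration. This variant also selects MIN(id) explicitly rather than SELECT *, so the surviving row in each group is well defined.

```python
import sqlite3

# Throwaway in-memory database with made-up rows (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oldtable (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO oldtable VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "b", "y"), (4, "a", "z")],
)

# Build the deduplicated copy: one row per (field1, field2), keeping the smallest id.
conn.execute("""
    CREATE TABLE newtable AS
    SELECT MIN(id) AS id, field1, field2
    FROM oldtable
    GROUP BY field1, field2
""")

# Swap the names, keeping the original around under a hypothetical backup name.
conn.execute("ALTER TABLE oldtable RENAME TO oldtable_backup")
conn.execute("ALTER TABLE newtable RENAME TO oldtable")

deduped = conn.execute("SELECT id, field1, field2 FROM oldtable ORDER BY id").fetchall()
print(deduped)
```

Keeping the original under a backup name until the new table has been checked makes the swap easy to undo.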

  • This is by far the best and most straightforward solution. You can't go wrong with it. (2 upvotes)

小智 5

List the duplicates that match on two fields:

select t.ID, t.field1, t.field2
from (
  select field1, field2
  from table_name
  group by field1, field2
  having count(*) > 1) x, table_name t
where x.field1 = t.field1 and x.field2 = t.field2
order by t.field1, t.field2

And delete all the duplicates, keeping the row with the smallest id in each group:

DELETE x 
FROM table_name x
JOIN table_name y
ON y.field1= x.field1
AND y.field2 = x.field2
AND y.id < x.id;