I have a table, votes, with the following columns:
voter,election_year,election_type,party
I need to delete all duplicate rows for the combination of voter and election_year, and I'm having trouble figuring out how to do this.
I ran the following:
WITH CTE AS (
    SELECT voter,
           election_year,
           ROW_NUMBER() OVER (PARTITION BY voter, election_year ORDER BY voter) AS RN
    FROM votes
)
DELETE
FROM CTE WHERE RN > 1
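This is based on another StackOverflow answer, but it appears to be specific to SQL Server. I have seen ways to do this using a unique ID, but this particular table doesn't have that luxury. How can I adapt the script above to delete the duplicates? Thanks!
Edit: As requested, here is the table definition with some sample data: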
CREATE TABLE public.votes
(
voter varchar(10),
election_year smallint,
election_type varchar(2),
party varchar(3)
);
INSERT INTO votes
(voter, election_year, election_type, party)
VALUES
('2435871347', 2018, 'PO', 'EV'),
('2435871347', 2018, 'RU', 'EV'),
('2435871347', 2018, 'GE', 'EV'),
('2435871347', 2016, 'PO', 'EV'),
('2435871347', 2016, 'GE', 'EV'),
('10215121/8', 2016, 'GE', 'ED')
;
Here is one option:
DELETE FROM votes T1
USING votes T2
WHERE T1.ctid < T2.ctid
AND T1.voter = T2.voter
AND T1.election_year = T2.election_year;
See http://sqlfiddle.com/#!15/4d45d/5
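To confirm that the delete did what was intended, a quick check like the one below could be run afterwards. This is only a sketch against the sample votes table from the question; the column alias copies is just an illustrative name. It should return no rows once every (voter, election_year) pair is unique.

```sql
-- List any (voter, election_year) pairs that still occur more than once.
-- An empty result means no duplicates remain.
SELECT voter, election_year, count(*) AS copies
FROM votes
GROUP BY voter, election_year
HAVING count(*) > 1;
```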
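Deleting from or updating a CTE does not work in Postgres; see the accepted answer to "PostgreSQL with-delete "relation does not exist"".
Since you don't have a primary key, you can (ab)use the ctid pseudo column to identify the rows to delete.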
WITH cte AS (
    SELECT ctid,
           row_number() OVER (PARTITION BY voter, election_year
                              ORDER BY voter) rn
    FROM votes
)
DELETE FROM votes
    USING cte
WHERE cte.rn > 1
  AND cte.ctid = votes.ctid;
You may also want to consider introducing a primary key.
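As a rough sketch of that suggestion, and assuming the table is really meant to hold one row per voter and election year (an assumption, the question doesn't confirm it), a composite key could be added once the duplicates are gone:

```sql
-- Assumes duplicates have already been removed and that neither column
-- contains NULLs; otherwise this statement fails.
ALTER TABLE votes
    ADD PRIMARY KEY (voter, election_year);
```

With such a key in place, Postgres would reject any later insert that repeats an existing (voter, election_year) pair, so the cleanup would not need to be repeated.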