Huz*_*mir 5 sql t-sql sql-server sql-server-2008
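I am trying to de-duplicate records in my database using a PARTITION BY clause. This is the query I run to de-duplicate. It ranks the records by how many fields are populated and keeps the highest-ranked record.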
WITH cteDupes AS
(
--
-- Partition based on contact.owner and email
SELECT ROW_NUMBER() OVER(PARTITION BY contactowner, email
ORDER BY
-- rank by number of populated fields; DESC so the most populated row gets RND = 1
case when otherstreet is not null then 1 else 0 end +
case when othercity is not null then 1 else 0 end DESC
) AS RND, *
FROM scontact
where (contact_owner_name__c is not null and contact_owner_name__c<>'') and (email is not null and email<>'')
)
--Rank data and place it into a new table created
select * into contact_case1
from cteDupes
WHERE RND=1;
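I would like to know whether it is possible to partition by a CASE expression. For example, I currently partition by contact owner and email. When contactowner is null, I want to partition by contactofficer instead. Can I write a CASE expression like this, or is it not possible because the ranking would change in some way?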
You can use case, but I think coalesce() is simpler in this situation:
SELECT ROW_NUMBER() OVER (PARTITION BY COALESCE(contactowner, contactofficer), email
. . .
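As a rough illustration of how the COALESCE() partition behaves, here is a minimal self-contained sketch. The demo CTE, the id column, and all of the row values are made up for the example; only the column names contactowner, contactofficer, email, otherstreet, and othercity come from the question. The DESC and the id tie-breaker in the ORDER BY are assumptions about the intended ranking, not part of the original query.

```sql
-- Hypothetical sample data; values are invented for illustration only.
WITH demo AS
(
    SELECT *
    FROM (VALUES
        (1, 'alice', NULL,    'a@example.com', '1 Main St', 'Springfield'),
        (2, 'alice', NULL,    'a@example.com', NULL,        NULL),
        (3, NULL,    'bob',   'b@example.com', '2 Oak Ave', NULL),
        (4, NULL,    'bob',   'b@example.com', NULL,        NULL),
        (5, NULL,    'alice', 'a@example.com', NULL,        NULL)
    ) AS v(id, contactowner, contactofficer, email, otherstreet, othercity)
)
SELECT id, contactowner, contactofficer, email,
       ROW_NUMBER() OVER (
           -- fall back to contactofficer whenever contactowner is NULL
           PARTITION BY COALESCE(contactowner, contactofficer), email
           ORDER BY
               -- assumed ranking: more populated fields first, then lowest id
               CASE WHEN otherstreet IS NOT NULL THEN 1 ELSE 0 END +
               CASE WHEN othercity  IS NOT NULL THEN 1 ELSE 0 END DESC,
               id
       ) AS RND
FROM demo
ORDER BY id;
```

With this sample data, rows 1, 2, and 5 end up in the same partition, because the officer named 'alice' in row 5 is treated the same as the owner named 'alice'. Whether that merging is wanted depends on your data.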
If you want owners and officers that share the same name to be counted separately, then you can do this:
SELECT ROW_NUMBER() OVER (PARTITION BY (CASE WHEN contactowner is NULL then 1 else 2 end),
contactowner,
(CASE WHEN contactowner is null THEN contactofficer END),
email
. . .
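To see how the CASE-based partition keeps owners and officers apart, here is a small self-contained sketch. The demo CTE, the id column, and the row values are hypothetical; the column names follow the question. The DESC and the id tie-breaker in the ORDER BY are assumptions about the intended ranking.

```sql
-- Hypothetical sample data; values are invented for illustration only.
WITH demo AS
(
    SELECT *
    FROM (VALUES
        (1, 'alice', NULL,    'a@example.com', '1 Main St', 'Springfield'),
        (2, 'alice', NULL,    'a@example.com', NULL,        NULL),
        (3, NULL,    'bob',   'b@example.com', '2 Oak Ave', NULL),
        (4, NULL,    'bob',   'b@example.com', NULL,        NULL),
        (5, NULL,    'alice', 'a@example.com', NULL,        NULL)
    ) AS v(id, contactowner, contactofficer, email, otherstreet, othercity)
)
SELECT id, contactowner, contactofficer, email,
       ROW_NUMBER() OVER (
           PARTITION BY
               -- flag that separates "no owner" rows from "has owner" rows
               (CASE WHEN contactowner IS NULL THEN 1 ELSE 2 END),
               contactowner,
               -- the officer only matters when there is no owner
               (CASE WHEN contactowner IS NULL THEN contactofficer END),
               email
           ORDER BY
               -- assumed ranking: more populated fields first, then lowest id
               CASE WHEN otherstreet IS NOT NULL THEN 1 ELSE 0 END +
               CASE WHEN othercity  IS NOT NULL THEN 1 ELSE 0 END DESC,
               id
       ) AS RND
FROM demo
ORDER BY id;
```

With this sample data, row 5 (officer 'alice', no owner) gets its own partition instead of being grouped with rows 1 and 2 (owner 'alice'), so it receives RND = 1 and would survive a WHERE RND = 1 filter.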