tex*_*uce 4 sql oracle window-functions
I need to find duplicate records (with the master record ID and the duplicate record IDs):
select ciid, name from (
    select ciid, name, row_number() over (
        partition by related_id, name order by updatedate desc) rn
    from yourtable  -- placeholder table name
) where rn = 1;
This gives me the master record IDs, but it also includes records that have no duplicates.
If I use
select ciid, name from (
    select ciid, name, row_number() over (
        partition by related_id, name order by updatedate desc) rn
    from yourtable
) where rn > 1;
This returns all the duplicate records, but not the master records.
I was hoping to do something like the following:
select ciid, name from (
    select ciid, name, row_number() over (
        partition by related_id, name order by updatedate desc
    ) rn, count(*) over (
        partition by related_id, name order by updatedate desc
    ) cnt
    from yourtable
) where rn = 1 and cnt > 1;
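But I am worried about performance, and even about whether it actually does what I want.
How can I get only the master records that have duplicates? Note that name is not a unique column. Only ciid is unique.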
I ended up using a query similar to the one in my question:
select ciid, name from (
    select ciid, name, row_number() over (
        partition by related_id, name order by updatedate desc
    ) rn, count(*) over (
        partition by related_id, name
    ) cnt
    from yourtable  -- placeholder table name
) where rn = 1 and cnt > 1;
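This works very well. The master record is the row with rn = 1, and the duplicates are the rows with rn > 1. Make sure that count(*) over (partition ..) has no order by clause: with an order by, the count becomes a running count within the partition, so the row with rn = 1 would typically get cnt = 1 and be filtered out.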
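To see the difference between the two forms of count(*) over (...), here is a small sketch that runs the same window functions in SQLite through Python's built-in sqlite3 module, as a stand-in for Oracle. The table name records, the sample rows, and the column aliases cnt_total and cnt_running are made up for the illustration, and it assumes the bundled SQLite is 3.25 or newer (the first version with window functions).

```python
import sqlite3

# Hypothetical sample data: three rows share (related_id, name) = (10, 'alpha'),
# one row (20, 'beta') has no duplicate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table records (ciid integer primary key, related_id integer,"
    " name text, updatedate text)"
)
conn.executemany(
    "insert into records values (?, ?, ?, ?)",
    [
        (1, 10, "alpha", "2020-01-01"),
        (2, 10, "alpha", "2020-02-01"),
        (3, 10, "alpha", "2020-03-01"),
        (4, 20, "beta", "2020-01-15"),
    ],
)

query = """
select ciid, rn, cnt_total, cnt_running from (
    select ciid,
           row_number() over (
               partition by related_id, name order by updatedate desc
           ) rn,
           -- no order by: counts every row in the partition
           count(*) over (partition by related_id, name) cnt_total,
           -- with order by: running count up to the current row
           count(*) over (
               partition by related_id, name order by updatedate desc
           ) cnt_running
    from records
)
order by ciid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (ciid, rn, cnt_total, cnt_running)

# Master records that have duplicates: rn = 1 and the full partition count > 1.
masters = [r[0] for r in rows if r[1] == 1 and r[2] > 1]
print(masters)
```

With this data the row with rn = 1 in the duplicated group has cnt_total = 3 but cnt_running = 1, so filtering on the running count would discard exactly the master record the query is after.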
I haven't tested this (I don't have real data and am too lazy to create some), but something along these lines might work:
-- (related_id, name) groups that occur more than once
with has_duplicates as (
    select related_id, name
    from yourtable
    group by related_id, name
    having count(*) > 1
),
-- rows belonging to those groups, newest first within each group
with_dupes as (
    select
        y.ciid, y.name,
        row_number() over (partition by y.related_id, y.name order by y.updatedate desc) rn
    from
        yourtable y,
        has_duplicates d
    where
        y.related_id = d.related_id and
        y.name = d.name
)
select
    ciid, name
from with_dupes
where rn = 1
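Since the query above is untested, here is a rough check of the same idea against made-up data, run in SQLite through Python's built-in sqlite3 module rather than Oracle. The sample rows are invented for the illustration, the join is written with explicit join ... on syntax, and it assumes SQLite 3.25 or newer for window-function support.

```python
import sqlite3

# Hypothetical sample data: (10, 'alpha') and (20, 'alpha') each occur twice,
# (10, 'beta') occurs once.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable (ciid integer primary key, related_id integer,"
    " name text, updatedate text)"
)
conn.executemany(
    "insert into yourtable values (?, ?, ?, ?)",
    [
        (1, 10, "alpha", "2020-01-01"),
        (2, 10, "alpha", "2020-03-01"),
        (3, 10, "beta", "2020-02-01"),
        (4, 20, "alpha", "2020-01-10"),
        (5, 20, "alpha", "2020-01-05"),
    ],
)

query = """
with has_duplicates as (
    select related_id, name
    from yourtable
    group by related_id, name
    having count(*) > 1
),
with_dupes as (
    select y.ciid, y.name,
           row_number() over (
               partition by y.related_id, y.name order by y.updatedate desc
           ) rn
    from yourtable y
    join has_duplicates d
      on y.related_id = d.related_id and y.name = d.name
)
select ciid, name from with_dupes where rn = 1
"""
# Sort in Python because the query itself does not guarantee an output order.
masters = sorted(conn.execute(query).fetchall())
print(masters)
```

One difference from the pure window-function version: the equality join drops rows where related_id or name is NULL, while partition by puts NULLs into a single group, so the two approaches can disagree if those columns are nullable.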