Creating a unique string for names in Oracle

jbr*_*k10 2 oracle plsql

I can do this programmatically, but I'm looking for a cleaner solution.

Say I have the following table:

Last Name       First Name
Smith           Albert       
Smith           Alphonse    
Smith           Jason         
Johnson         Charles
Roberts         Chris
Roberts         Christian

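I want to create a unique string for each person using the following rules:

  • If the last name is already unique, return just the last name
  • If the last name is shared, return the first initial (or more letters if needed), then a period, then the last name

For Albert Smith, I would get back Alb.Smith
For Charles Johnson, I would get back Johnson
For Christian Roberts, I would get back Christ.Roberts

Does anyone have any ideas on how to do this directly in an Oracle SQL statement, or should I stick to doing it in a program?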

Ale*_*ole 6

A version using recursive subquery factoring (a recursive CTE), which requires 11gR2:

with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  select last_name, first_name,
    row_number() over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
),
u as (
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
),
v as (
  select last_name, first_name, orig_rn, part, last_name_count,
  row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select case when last_name_count = 1 then null
  when part = first_name then first_name || ' '
  else part || '. '
  end || last_name as condensed_name
from v
where rn = 1
order by orig_rn;

This gives:

CONDENSED_NAME                               
----------------------------------------------
Johnson                                        
Chris Roberts                                  
Christ. Roberts                                
Alb. Smith                                     
Alp. Smith                                     
J. Smith                                       

SQL Fiddle.
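To try the query locally, the sample data from the question can be loaded into a table. This is a minimal setup sketch: the table name t42 and the column names come from the query above, but the varchar2(20) column width is an assumption (taken from the cast in the query), not something the answer states.

```sql
-- Assumed schema: only the table name (t42) and the column names appear in the query;
-- the varchar2(20) width is a guess based on the cast in the recursive CTE.
create table t42 (
  last_name  varchar2(20),
  first_name varchar2(20)
);

-- Sample rows from the question
insert into t42 (last_name, first_name) values ('Smith', 'Albert');
insert into t42 (last_name, first_name) values ('Smith', 'Alphonse');
insert into t42 (last_name, first_name) values ('Smith', 'Jason');
insert into t42 (last_name, first_name) values ('Johnson', 'Charles');
insert into t42 (last_name, first_name) values ('Roberts', 'Chris');
insert into t42 (last_name, first_name) values ('Roberts', 'Christian');

commit;
```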

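The t CTE is recursive. It starts with the original table rows and generates an additional row for each possible contraction of the first name: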

with t (last_name, first_name, orig_rn, part, part_length, remaining) as (
  select last_name, first_name,
    row_number () over (order by last_name, first_name),
    cast (null as varchar2(20)), 0, length(first_name)
  from t42
  union all
  select last_name, first_name, orig_rn,
    part || substr(first_name, part_length + 1, 1),
    part_length + 1,
    remaining - 1
  from t
  where remaining > 0
)
select last_name, first_name, part
from t
where last_name = 'Johnson'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART                   
-------------------- -------------------- ------------------------
Johnson              Charles                                       
Johnson              Charles              C                        
Johnson              Charles              Ch                       
Johnson              Charles              Cha                      
Johnson              Charles              Char                     
Johnson              Charles              Charl                    
Johnson              Charles              Charle                   
Johnson              Charles              Charles                  

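The next CTE, u (yes, sorry about the names, I was uninspired), compares each value against the other rows with the same last name and counts how many rows share it. Anything with a count of 1 is unique.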

...
u as (
  select last_name, first_name, orig_rn, part, part_length,
    count(distinct orig_rn) over (partition by last_name) as last_name_count,
    count(distinct orig_rn) over (partition by last_name, part) as part_count
  from t
)
select last_name, first_name, part, last_name_count, part_count
from u
where last_name = 'Roberts'
order by orig_rn, part_length;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT PART_COUNT
-------------------- -------------------- ------------------------ --------------- ----------
Roberts              Chris                                                       2          2 
Roberts              Chris                C                                      2          2 
Roberts              Chris                Ch                                     2          2 
Roberts              Chris                Chr                                    2          2 
Roberts              Chris                Chri                                   2          2 
Roberts              Chris                Chris                                  2          2 
Roberts              Christian                                                   2          2 
Roberts              Christian            C                                      2          2 
Roberts              Christian            Ch                                     2          2 
Roberts              Christian            Chr                                    2          2 
Roberts              Christian            Chri                                   2          2 
Roberts              Christian            Chris                                  2          2 
Roberts              Christian            Christ                                 2          1 
Roberts              Christian            Christi                                2          1 
Roberts              Christian            Christia                               2          1 
Roberts              Christian            Christian                              2          1 

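The third CTE, v, looks only at the unique values (or, as a fallback, the full first name) and ranks them by length, so the shortest contraction of the first name that is unique among the records with that last name is ranked 1.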

...
v as (
  select last_name, first_name, orig_rn, part, last_name_count,
  row_number() over (partition by orig_rn order by part_length) as rn
  from u
  where (part_count = 1 or part = first_name)
)
select last_name, first_name, part, last_name_count
from v
where rn = 1
order by orig_rn;

LAST_NAME            FIRST_NAME           PART                     LAST_NAME_COUNT
-------------------- -------------------- ------------------------ ---------------
Johnson              Charles                                                     1 
Roberts              Chris                Chris                                  2 
Roberts              Christian            Christ                                 2 
Smith                Albert               Alb                                    3 
Smith                Alphonse             Alp                                    3 
Smith                Jason                J                                      3 

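The final query then pulls out only the rows ranked 1, which hold the shortest unique value for each person, and formats them the way you want.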

If two people have exactly the same name, both are spelled out in full (demo), which seems to be what you want, going by the comments.
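The duplicate case can be tried against the same data. A minimal sketch, assuming the t42 table used by the query and an extra, hypothetical row that repeats an existing name:

```sql
-- Hypothetical extra row: a second Jason Smith (not part of the original sample data)
insert into t42 (last_name, first_name) values ('Smith', 'Jason');
```

Each Jason row gets its own orig_rn, so every prefix of 'Jason' has a part_count of 2 within Smith and none passes the part_count = 1 test. The only rows that survive the filter in v are those where part = first_name, so re-running the full query should return Jason Smith for both, with no period.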

Not sure this really qualifies as 'cleaner', except that it only hits the original table once.