Remove duplicate text within a column

ich*_*han 1 sql t-sql sql-server

In my temp table, I have a column that holds a list of email addresses, and the list can contain duplicates. For example:

Row#1: test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com

Desired result:

Row#1: test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com

Is there a way to do this in SQL Server?

Lam*_*mak 7

Well, assuming SQL Server 2017 and that you have a key column (or a combination of columns), you can use STRING_SPLIT and STRING_AGG:

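WITH CTE AS
(
    -- One row per address; TRIM drops the space after each ';'
    -- so that DISTINCT can match the duplicates
    SELECT  DISTINCT
            T.KeyColumn,
            TRIM(E.value) AS Email
    FROM dbo.YourTable T
    OUTER APPLY STRING_SPLIT(T.Email, ';') E
)
-- Rebuild one list per key (the order of the addresses is not guaranteed)
SELECT  KeyColumn,
        STRING_AGG(Email, '; ') AS Email
FROM CTE
GROUP BY KeyColumn
;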
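To try this against the rows from the question, here is a minimal sketch under a few assumptions: the temp table `#Emails` and its columns are hypothetical stand-ins for the real table, and the addresses themselves contain no spaces. Because neither STRING_SPLIT nor STRING_AGG promises an order by default, the sketch also records where each address first appears (via CHARINDEX) and sorts on that position, which is one possible way to keep the original order, not the only one.

```sql
-- Hypothetical sample table holding the two rows from the question
DROP TABLE IF EXISTS #Emails;
CREATE TABLE #Emails (KeyColumn int PRIMARY KEY, Email varchar(500));
INSERT INTO #Emails (KeyColumn, Email) VALUES
    (1, 'test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com'),
    (2, 'abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com');

WITH CTE AS
(
    SELECT  DISTINCT
            T.KeyColumn,
            TRIM(E.value) AS Email,
            -- Position of the first occurrence in the space-free list;
            -- the surrounding ';' keep 'abc@...' from matching inside 'xabc@...'
            CHARINDEX(';' + TRIM(E.value) + ';',
                      ';' + REPLACE(T.Email, ' ', '') + ';') AS Pos
    FROM #Emails T
    CROSS APPLY STRING_SPLIT(T.Email, ';') E
)
SELECT  KeyColumn,
        STRING_AGG(Email, '; ') WITHIN GROUP (ORDER BY Pos) AS Email
FROM CTE
GROUP BY KeyColumn
ORDER BY KeyColumn;
```

On the two sample rows this should return the lists shown under the desired result. STRING_SPLIT needs database compatibility level 130 or higher.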

Update for SQL Server 2016:

Without STRING_AGG you have to fall back on the older approach; for example:

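WITH CTE AS
(
    -- TRIM() is not available in SQL Server 2016, hence LTRIM(RTRIM())
    SELECT  DISTINCT
            T.KeyColumn,
            LTRIM(RTRIM(E.value)) AS Email
    FROM dbo.YourTable T
    OUTER APPLY STRING_SPLIT(T.Email, ';') E
)
SELECT  t.KeyColumn,
        -- Concatenate with FOR XML PATH; STUFF then strips the leading '; '
        Email = STUFF(( SELECT '; ' + CONVERT(varchar(255), c.Email)
                        FROM CTE c
                        WHERE c.KeyColumn = t.KeyColumn
                        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, '')
FROM CTE t
GROUP BY t.KeyColumn
;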
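The FOR XML PATH form does not guarantee the order of the addresses either. As a rough sketch, assuming the same hypothetical `#Emails` sample table and addresses without embedded spaces, an ORDER BY inside the subquery can sort each list by the position where an address first appears:

```sql
-- Hypothetical sample table holding the two rows from the question
DROP TABLE IF EXISTS #Emails;
CREATE TABLE #Emails (KeyColumn int PRIMARY KEY, Email varchar(500));
INSERT INTO #Emails (KeyColumn, Email) VALUES
    (1, 'test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com'),
    (2, 'abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com');

WITH CTE AS
(
    SELECT  DISTINCT
            T.KeyColumn,
            LTRIM(RTRIM(E.value)) AS Email,
            -- Position of the first occurrence in the space-free list
            CHARINDEX(';' + LTRIM(RTRIM(E.value)) + ';',
                      ';' + REPLACE(T.Email, ' ', '') + ';') AS Pos
    FROM #Emails T
    CROSS APPLY STRING_SPLIT(T.Email, ';') E
)
SELECT  t.KeyColumn,
        Email = STUFF(( SELECT '; ' + c.Email
                        FROM CTE c
                        WHERE c.KeyColumn = t.KeyColumn
                        ORDER BY c.Pos          -- keep first-seen order
                        FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, '')
FROM CTE t
GROUP BY t.KeyColumn
ORDER BY t.KeyColumn;
```

The third and fourth arguments of STUFF (1, 2) remove the two-character '; ' prefix that the concatenation leaves at the start of each list.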