Remove a character from records enclosed in double quotes

Dee*_*rma 0 sql-server regular-expression t-sql string-splitting

I have a table in which every row contains data like this:

0150566115,"HEALTH 401K","IC,ON","ICON HEALTH 401K",,,1,08/21/2014

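What I want is to remove every comma (,) that is enclosed between double quotes (" "), and then split the rest of the string on the commas (,).

I don't want to do this by checking every character and setting flags for where the double quotes start and end.

Can I implement some sort of regular expression?

Is there a simple way?

All I have tried so far is splitting the string on the comma (,), but that also splits the values inside the quotes.

The code below serves the purpose, but only when the string has a single double-quoted block. How can I apply this to the complete table?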

Declare @Query nvarchar(max) 

Set @Query = 'Item1,Item2,"Item,Demo,3",New'

Declare @start int, @len int
SELECT @start = PATINDEX('%"%"%', @Query) + 1

print @start

select @len = CHARINDEX('"', SUBSTRING(@Query, @start, LEN(@Query))) - 1

select 
        SUBSTRING(@Query, 1, @start - 2) +
        REPLACE((SUBSTRING(@Query, @start, @len)), ',', '') +
        SUBSTRING(@Query, @start + @len + 1, LEN(@Query))

This is the function I use to split:

ALTER FUNCTION [dbo].[fnSplit](
    @sInputList VARCHAR(8000) -- List of delimited items
  , @sDelimiter VARCHAR(8000) = ',' -- delimiter that separates items
) RETURNS @List TABLE (id int, item VARCHAR(8000))

BEGIN
DECLARE @sItem VARCHAR(8000)
Declare @ID as int
Set @ID=0
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
 BEGIN
 SELECT
  @sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))),
  @sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))
 Set @ID=@ID+1
 IF LEN(@sItem) > 0
  INSERT INTO @List SELECT @ID,@sItem
 END

IF LEN(@sInputList) > 0
 BEGIN
  Set @ID=@ID+1 -- Give the last item its own id instead of reusing the previous one
  INSERT INTO @List SELECT @ID,@sInputList -- Put the last item in
 END
RETURN
END

The problem is that this is an application in MS Access, which narrows down my options. All I can do is pass the name of the file that will be imported by SQL Server.

Aar*_*and 7

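Ok, this is really nasty, but fine, if you refuse to even consider better alternatives... First, create a set-based split function that keeps track of the order of the strings (your looping function is really not optimal):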

CREATE FUNCTION [dbo].[SplitStrings_Ordered]
(
    @List       VARCHAR(8000),
    @Delimiter  VARCHAR(255)
)
RETURNS TABLE
AS
    RETURN (SELECT [Index] = ROW_NUMBER() OVER (ORDER BY Number), Item 
    FROM (SELECT Number, Item = SUBSTRING(@List, Number, 
      CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
     FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
      FROM sys.all_columns) AS n(Number)
      WHERE Number <= CONVERT(INT, LEN(@List))
      AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
    ) AS y);

(If you have a numbers table in this database, use it instead of sys.all_columns, and add WITH SCHEMABINDING to the function definition.)
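To illustrate that note, here is a rough sketch of what the numbers-table variant might look like. The table name dbo.Numbers, its size of 8000 rows (chosen to match the VARCHAR(8000) input), and the function name dbo.SplitStrings_Numbers are assumptions made for this example, not part of the original answer. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical numbers table; the name and row count are assumptions.
CREATE TABLE dbo.Numbers (Number INT NOT NULL PRIMARY KEY);

INSERT dbo.Numbers (Number)
SELECT TOP (8000) n
FROM (SELECT n = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
      FROM sys.all_columns AS s1
      CROSS JOIN sys.all_columns AS s2) AS x
ORDER BY n;
GO

-- Same logic as SplitStrings_Ordered, reading from dbo.Numbers.
-- SCHEMABINDING requires the two-part name dbo.Numbers.
CREATE FUNCTION dbo.SplitStrings_Numbers
(
    @List       VARCHAR(8000),
    @Delimiter  VARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN (SELECT [Index] = ROW_NUMBER() OVER (ORDER BY Number), Item
    FROM (SELECT Number, Item = SUBSTRING(@List, Number,
      CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
     FROM dbo.Numbers
     WHERE Number <= CONVERT(INT, LEN(@List))
     AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
    ) AS y);
GO
```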

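Now, let's take a few example strings with commas embedded inside double quotes, and remove those commas by splitting on the double quote and re-concatenating: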

DECLARE @x TABLE(n VARCHAR(8000));

INSERT @x VALUES
('0150566115,"HEALTH 401K","IC,ON","ICON HEALTH 401K",,,1,08/21/2014'),
('0150566115,HEALTH 401K,"IC,ON","ICON HEALTH 401K",,,1,"08/21/2014"'),
('"01505,66115,","HEALTH 401K","IC,ON","ICON HEALTH 401K",,,1,08/21/2014');

;WITH x AS
(
  SELECT x.n, s.[Index], s = REPLACE(s.Item, ',', 
    CASE s.[Index]%2 WHEN 0 THEN '' ELSE ',' END)
  FROM @x AS x 
  CROSS APPLY dbo.SplitStrings_Ordered(x.n, '"') AS s
)
SELECT x.n, fixed = (SELECT x2.s 
  FROM x AS x2 
  WHERE x2.n = x.n
  ORDER BY [Index]
  FOR XML PATH, TYPE).value(N'.[1]',N'varchar(max)')
FROM x
GROUP BY x.n;

Results in the fixed column for all three strings:

0150566115,HEALTH 401K,ICON,ICON HEALTH 401K,,,1,08/21/2014
0150566115,HEALTH 401K,ICON,ICON HEALTH 401K,,,1,08/21/2014
0150566115,HEALTH 401K,ICON,ICON HEALTH 401K,,,1,08/21/2014
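
To see why the [Index] % 2 test picks out the quoted parts, it may help to look at the intermediate pieces for a single string. This is only an illustrative query; it assumes dbo.SplitStrings_Ordered from above already exists, and the InsideQuotes column name is made up for the example. The sample value is the one from the question.

```sql
-- Split on the double quote: odd-numbered pieces sit outside quotes,
-- even-numbered pieces sit inside a pair of quotes.
SELECT s.[Index],
       s.Item,
       InsideQuotes = CASE s.[Index] % 2 WHEN 0 THEN 1 ELSE 0 END
FROM dbo.SplitStrings_Ordered('Item1,Item2,"Item,Demo,3",New', '"') AS s
ORDER BY s.[Index];
```

This should return three pieces: Item1,Item2, at index 1, Item,Demo,3 at index 2, and ,New at index 3. Only the even piece has its commas removed. When a string starts with a double quote, the first piece is simply empty, so the odd/even pattern still holds, as the third sample row shows.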

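Now you can feed these results back into the split function, this time using a comma as the delimiter, depending on what your ultimate goal is. The question seemed to revolve only around being able to ignore the double quotes and any commas that sit inside a pair of double quotes.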
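
As a minimal sketch of that second pass, assuming dbo.SplitStrings_Ordered from above exists and using one of the fixed values shown earlier (the @fixed variable is just for illustration):

```sql
-- Second pass: split the cleaned string on commas.
DECLARE @fixed VARCHAR(8000) =
  '0150566115,HEALTH 401K,ICON,ICON HEALTH 401K,,,1,08/21/2014';

SELECT s.[Index], s.Item
FROM dbo.SplitStrings_Ordered(@fixed, ',') AS s
ORDER BY s.[Index];
```

For this value the function returns eight rows, and the two empty fields come back as empty strings at positions 5 and 6. That differs from the looping fnSplit in the question, which skips empty items.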
