use*_*599 6 sql-server sql-server-2008-r2 string-manipulation
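I want to filter an nvarchar field so that it returns only the numeric characters.
I have some SQL that does this, but it seems far more complicated than it needs to be. Does anyone have a better way to strip every non-numeric character from a string?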
IF OBJECT_ID('tempdb..#MOB') IS NOT NULL
BEGIN
    DROP TABLE #MOB
END

SELECT [mob]
INTO #MOB
FROM (
    SELECT '(00) 1234 5678' AS [mob]
    UNION
    SELECT '1234 5678' AS [mob]
    UNION
    SELECT '+61 012 345 678' AS [mob]
) AS temp

;WITH [fill] ([Num], [Index], [MOBILEPHONE])
AS
(
    SELECT
        CASE
            WHEN [MOBILEPHONE] IS NOT NULL
                THEN SUBSTRING([MOBILEPHONE], 1, 1)
            ELSE NULL
        END AS [Num]
        , 1 AS [INDEX], [MOBILEPHONE]
    FROM (
        SELECT DISTINCT [mob] AS [MOBILEPHONE]
        FROM #MOB as t
    ) AS temp
    UNION ALL
    SELECT
        SUBSTRING([F].[MOBILEPHONE], [F].[Index] + 1, 1) AS [Num]
        ,[F].[Index] + 1 AS [Index]
        , [MOBILEPHONE]
    FROM [fill] AS [F]
    WHERE ([F].[Index] + 1) < LEN([F].[MOBILEPHONE]) + 1
)
SELECT [E].[MOBILEPHONE] AS [old_MOBILEPHONE],
    STUFF((SELECT N'' + [F].[Num]
           FROM [fill] AS [F]
           WHERE (PATINDEX('%[^0-9]%', [F].[Num]) = 0 OR PATINDEX('%[^0-9]%', [F].[Num]) IS NULL) AND
                 ([F].[MOBILEPHONE] = [E].[MOBILEPHONE])
           ORDER BY [F].[MOBILEPHONE], [F].[Index]
           FOR XML PATH('')), 1, 0, '')
    AS [MOBILEPHONE]
FROM (
    SELECT DISTINCT [t].[MOBILEPHONE]
    FROM (SELECT [mob] AS [MOBILEPHONE] FROM #MOB) as t
) AS [E]
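I have seen the Q & A "T-SQL select query to remove non-numeric characters" on Stack Overflow, but that answer is similar to the solution I already have: it uses a CTE and recursion. I am looking for something simpler. Is there perhaps a custom collation I could create, or a regular expression filter I could apply?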
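To use regular expressions, you need a SQLCLR function. Solomon Rutzky has created a useful library of CLR functions called SQLsharp. The free version includes several regular expression functions, including RegEx_Replace4k.
It is used like this: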
SELECT
    M.mob,
    numeric_only =
        SQL#.RegEx_Replace4k
        (
            M.mob,  -- Source
            N'\D',  -- Regular expression
            N'',    -- Replace matches with empty string
            -1,     -- Unlimited replacements
            1,      -- Start at character position
            NULL    -- Options (see documentation)
        )
FROM #MOB AS M;
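This returns each mob value together with a copy from which every non-digit character has been removed; for example, '(00) 1234 5678' becomes '0012345678'.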
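Regular expressions are somewhat overkill for such a simple requirement, so a CLR implementation of your own that only strips non-digits would probably be faster. Even so, I have found the library function above to be as fast as the best T-SQL implementations, if not faster (T-SQL string manipulation is fairly slow).
For a T-SQL implementation, see "Splitting Strings Based on Patterns" by Dwain Camps.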
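The "digits only" function below was created by Eirikur Eiriksson. Performance-wise, it pretty well blows the doors off most T-SQL-only solutions: 1 million conversions of random strings, varying in length from 36 to 72 characters, take just a little over 15 seconds. Additional information on the testing can be found in the following thread on SQLServerCentral.com: http://www.sqlservercentral.com/Forums/Topic1585850-391-2.aspx#bm1629360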
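CREATE FUNCTION dbo.DigitsOnlyEE
--Created by Eirikur Eiriksson (29 Oct 2014)
        (@pString VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS RETURN
-- E1: ten placeholder rows; cross-joining it four times yields up to 10,000 rows.
WITH E1(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
-- Tally: one row per character position, 1 through LEN(@pString).
,Tally(N) AS (SELECT TOP (LEN(@pString)) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) AS Num FROM E1 a,E1 b,E1 c,E1 d ORDER BY Num)
SELECT DigitsOnly =
(
    SELECT SUBSTRING(@pString,N,1)
    FROM Tally
    -- Keep only '0'-'9': subtracting 48 maps digits to 0-9, and the mask turns
    -- negative results (characters below '0') into large positive values.
    WHERE ((ASCII(SUBSTRING(@pString,N,1)) - 48) & 0x7FFF) < 10
    ORDER BY N
    FOR XML PATH('')  -- concatenate the surviving characters in position order
)
;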
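The WHERE clause in dbo.DigitsOnlyEE tests for a digit with a single comparison instead of a range check. The query below is only a sketch to show how that expression behaves for individual characters; the sample characters and the column aliases are made up for illustration and are not part of the original function.

```sql
-- Illustration only: evaluate the digit test from dbo.DigitsOnlyEE one character at a time.
-- The VALUES list and the column names are hypothetical.
SELECT  v.Ch,
        AsciiCode = ASCII(v.Ch),
        Shifted   = ASCII(v.Ch) - 48,             -- '0'..'9' become 0..9
        Masked    = (ASCII(v.Ch) - 48) & 0x7FFF,  -- negative values become large positives
        IsDigit   = CASE WHEN ((ASCII(v.Ch) - 48) & 0x7FFF) < 10 THEN 1 ELSE 0 END
FROM (VALUES ('0'), ('9'), (' '), ('+'), ('A')) AS v(Ch);
```

Characters that sort below '0', such as the space and '+', give a negative Shifted value. The mask clears the sign bit, so they end up far above 9 and fail the test. Characters above '9', such as 'A', are already 10 or more after the subtraction.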
DigitsOnlyEE is an iTVF (inline table-valued function), so you have to use it in the FROM clause rather than in the SELECT list, as follows.
SELECT ca.DigitsOnly
FROM dbo.SomeTable
CROSS APPLY dbo.DigitsOnlyEE(SomeString) ca
;
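Applied to the #MOB temp table from the question, the call might look like the sketch below. It assumes that dbo.DigitsOnlyEE has already been created and that #MOB still exists in the current session. The numeric_only alias and the COALESCE wrapper are illustrative additions, not part of the original answer.

```sql
-- Sketch: assumes dbo.DigitsOnlyEE exists and #MOB was populated as in the question.
SELECT  M.mob,
        numeric_only = COALESCE(ca.DigitsOnly, '')  -- the subquery yields NULL when no digit is found
FROM    #MOB AS M
CROSS APPLY dbo.DigitsOnlyEE(M.mob) AS ca;
```

Because the parameter is declared as VARCHAR(8000), an nvarchar column is implicitly converted when it is passed in. That should not matter for phone numbers, but it is worth keeping in mind for other data.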
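I won't even load SQLSharp to test it, because it appears to install a certificate and a user, so I would probably never use it in production. Even if I wanted to, I couldn't do performance testing, because the author has the following restriction in his EULA, which also keeps me from installing it. I don't agree with such restrictions, but I will respect them.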
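2.2. Restrictions. Except for the rights specifically enumerated in Section 2.1, neither Licensee nor its Affiliates and End Users shall have any other rights with respect to the Software. By way of illustration and not limitation, the foregoing license does not grant any right to:
2.2.8. use the Software or Documentation for competitive analysis of the Software, for the development of a competing software product or service, or for any other purpose that is to the Licensor's commercial disadvantage.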
See if this works for you. It has done the trick for us.
CREATE FUNCTION [dbo].[RemoveAlphaCharacters] (@Temp NVARCHAR(1000))
RETURNS NVARCHAR(1000)
AS
BEGIN
    -- Despite the name, the pattern matches any character that is not 0-9.
    DECLARE @KeepValues AS NVARCHAR(50)
    SET @KeepValues = '%[^0-9]%'
    -- Delete the first non-digit character until none are left.
    WHILE PatIndex(@KeepValues, @Temp) > 0
        SET @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
    RETURN @Temp
END
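Since RemoveAlphaCharacters is a scalar function, it can be called directly in the SELECT list. The sketch below assumes the function has been created in the dbo schema and that the #MOB temp table from the question still exists; the numeric_only alias is only illustrative.

```sql
-- Sketch: assumes dbo.RemoveAlphaCharacters exists and #MOB was populated as in the question.
SELECT  M.mob,
        numeric_only = dbo.RemoveAlphaCharacters(M.mob)
FROM    #MOB AS M;

-- Single literal, for a quick check:
SELECT dbo.RemoveAlphaCharacters(N'+61 012 345 678') AS numeric_only;  -- 61012345678
```

The parameter is declared as NVARCHAR(1000), so longer input would be truncated before the loop runs. The loop also makes one pass per removed character, which may be noticeable on large row counts.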