MrV*_*mes 3 sql-server-2005 sql-server t-sql pattern-matching substring
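I have been trying to write a function that checks whether a string contains a number that is not part of a larger number. In other words, if the number being searched for is '6' and the string is '7+16+2', it should return false, because the '6' in this string is part of the number '16'.
I wrote the function below (it is long, but I intended to test it before refactoring).
While testing, I found a bug: it only runs its logic against the first instance of the number it finds. So running this function with '6' against '16+7+9+6' returns false, because it determines that the first '6' is part of a larger number and stops processing.
I think that to fix this I would have to implement a loop that shortens the 'haystack' string (so that, using the example '16+7+9+6', the function continues checking '+7+9+6' after eliminating the first '6'). But before spending time making an already complex function even more complex, I wanted to check whether there is a simpler way to achieve the same goal.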
drop function dbo.runners_contain_runner
go
create function dbo.runners_contain_runner(@runner varchar(max), @runners varchar(max))
returns int
as
begin
/*
eliminate the plus sign from @runners so that the
'isnumeric' function doesn't return false positives (it returns 1 for '+')
*/
set @runners = replace(@runners,'+','_' )
declare @ret int;
set @ret = 0;
-- if the runner is the only runner return 1
if @runners = @runner
set @ret = 1
else
begin
declare @charindex int;
set @charindex = charindex(@runner,@runners)
if @charindex > 0
begin
-- if it is at the beginning then check the char after it
if @charindex = 1
begin
if isnumeric(substring(@runners,@charindex + len(@runner),1)) = 0
set @ret = @charindex
end
-- if it is at the end then check the char before it
else if @charindex = len(@runners) - (len(@runner) - 1)
begin
if isnumeric(substring(@runners,@charindex - 1,1)) = 0
set @ret = @charindex
end
-- if it is in the middle check the chars either side of it
else
begin
if isnumeric(substring(@runners,@charindex - 1,1)) +
isnumeric(substring(@runners,@charindex + len(@runner),1)) = 0
set @ret = @charindex
end
end
end
return @ret
end
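Perhaps you are over-complicating the problem by focusing too much on wanting a number. Take a step back. What you really want is a substring that has no digit on either side of it. The only way a number can be part of a larger number is for it to have at least one digit on at least one side of it, right? So as long as you only pass in digits, this definition should still yield numbers that have no digits on either side.
With that in mind, we only need three PATINDEX predicates to cover the passed-in value being at the far left, at the far right, or in the middle (plus a plain equality check for when it is the entire string). Try the following, as it appears to work: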
GO
CREATE PROCEDURE #TestFindRunner
(
@Runner VARCHAR(10)
)
AS
SET NOCOUNT ON;
DECLARE @Data TABLE
(
[ID] INT NOT NULL PRIMARY KEY,
[Runners] VARCHAR(50) NULL
);
INSERT INTO @Data ([ID], [Runners]) VALUES (1, '16+7+9+6');
INSERT INTO @Data ([ID], [Runners]) VALUES (2, '16+7+9+5');
INSERT INTO @Data ([ID], [Runners]) VALUES (3, '26+77+9+5');
INSERT INTO @Data ([ID], [Runners]) VALUES (4, '6+3+45');
INSERT INTO @Data ([ID], [Runners]) VALUES (5, '63,808,111,92');
INSERT INTO @Data ([ID], [Runners]) VALUES (6, '1-7-9,6');
INSERT INTO @Data ([ID], [Runners]) VALUES (7, '1-6-9,7');
INSERT INTO @Data ([ID], [Runners]) VALUES (8, '1-7-9,63');
INSERT INTO @Data ([ID], [Runners]) VALUES (9, '1-63-9,7');
INSERT INTO @Data ([ID], [Runners]) VALUES (10, NULL);
INSERT INTO @Data ([ID], [Runners]) VALUES (11, '6');
SELECT tmp.*
FROM @Data tmp
WHERE @Runner COLLATE Latin1_General_100_BIN2 = tmp.[Runners]
OR PATINDEX('%[^0123456789]' + @Runner COLLATE Latin1_General_100_BIN2,
tmp.[Runners]) > 0
OR PATINDEX(@Runner + '[^0123456789]%' COLLATE Latin1_General_100_BIN2,
tmp.[Runners]) > 0
OR PATINDEX('%[^0123456789]' + @Runner + '[^0123456789]%'
COLLATE Latin1_General_100_BIN2, tmp.[Runners]) > 0
GO
Then test:
EXEC #TestFindRunner 0;
EXEC #TestFindRunner 2;
EXEC #TestFindRunner 4;
EXEC #TestFindRunner 8;
EXEC #TestFindRunner 11;
-- 0 rows each

EXEC #TestFindRunner 3;   -- ID 4
EXEC #TestFindRunner 77;  -- ID 3
EXEC #TestFindRunner 111; -- ID 5
-- 1 row each

EXEC #TestFindRunner 5;   -- IDs 2 and 3
-- 2 rows

EXEC #TestFindRunner 1;   -- IDs 6, 7, 8, and 9
-- 4 rows

EXEC #TestFindRunner 6;   -- IDs 1, 4, 6, 7, and 11
-- 5 rows

EXEC #TestFindRunner 7;   -- IDs 1, 2, 6, 7, 8, and 9
-- 6 rows

EXEC #TestFindRunner 9;   -- IDs 1, 2, 3, 6, 7, 8, and 9
-- 7 rows
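The reason for the three PATINDEX variations is that PATINDEX search patterns are not regular expressions (RegEx), contrary to what many people say or believe (the same is true of LIKE patterns). PATINDEX and LIKE patterns have no quantifiers, so there is no way to specify that the [^0123456789] single-character wildcard should match "zero or more" characters; it matches exactly one character, no more and no less.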
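Because the wildcard always consumes exactly one character, one possible alternative (not part of the code above, just a sketch) is to guarantee that a non-digit character exists on both sides of every candidate by padding the searched string with a sentinel. Under that assumption, the "middle" pattern alone covers all positions. The sample values and the choice of '+' as the sentinel are illustrative; any non-digit character should work.

```sql
-- Sketch only: the variable values below are illustrative sample data.
DECLARE @Runner  VARCHAR(10);
DECLARE @Runners VARCHAR(50);
SET @Runner  = '6';
SET @Runners = '16+7+9+6';

-- Padding the haystack with a non-digit sentinel ('+') on both sides means
-- that a match at the very start or very end still has one character on each
-- side, so only the "middle" pattern is required.
SELECT CASE
         WHEN PATINDEX('%[^0123456789]' + @Runner + '[^0123456789]%'
                       COLLATE Latin1_General_100_BIN2,
                       '+' + @Runners + '+') > 0
         THEN 1
         ELSE 0
       END AS [RunnerFound];
```

The trade-off is that the searched value is now an expression rather than the bare column, and the padded string is two characters longer than the original, which matters if the input is already at the maximum length of its type.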
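Forcing a binary collation (i.e. the COLLATE Latin1_General_100_BIN2 clause in each predicate that references @Runner) ensures that we are dealing with only those 10 decimal digits and not any other characters that might be considered equivalent.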
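To see what the collation changes, a small comparison like the following can be run. This is a hedged sketch: the choice of superscript two (NCHAR(178)) as the test character is an assumption made for illustration, and the result of the first column depends on the database's default collation, so no expected value is given for it.

```sql
-- Sketch only: NCHAR(178) (superscript two) is an illustrative test character.
DECLARE @Char NCHAR(1);
SET @Char = NCHAR(178);

SELECT -- Uses the database default collation; the result varies by collation.
       CASE WHEN @Char LIKE N'[0123456789]'
            THEN 1 ELSE 0 END AS [MatchesUnderDefault],
       -- A binary collation compares code points, so only the ten characters
       -- listed in the set can match; superscript two is not one of them.
       CASE WHEN @Char LIKE N'[0123456789]' COLLATE Latin1_General_100_BIN2
            THEN 1 ELSE 0 END AS [MatchesUnderBinary];
```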
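To put the above logic into an inline table-valued function (TVF) so that it is easier to use (and more efficient than a similarly easy-to-use scalar UDF), try the following: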
USE [tempdb];
GO
CREATE FUNCTION dbo.IsRunnerPresent
(
@Runner VARCHAR(10),
@Runners VARCHAR(8000)
)
RETURNS TABLE
WITH SCHEMABINDING
AS RETURN
SELECT CONVERT(BIT,
CASE WHEN @Runner COLLATE Latin1_General_100_BIN2 = @Runners
OR PATINDEX('%[^0123456789]' + @Runner
COLLATE Latin1_General_100_BIN2, @Runners) > 0
OR PATINDEX(@Runner + '[^0123456789]%'
COLLATE Latin1_General_100_BIN2, @Runners) > 0
OR PATINDEX('%[^0123456789]' + @Runner + '[^0123456789]%'
COLLATE Latin1_General_100_BIN2, @Runners) > 0
THEN 1
ELSE 0
END) AS [RunnerFound];
GO
Then test:
DECLARE @Runner VARCHAR(10);
SET @Runner = '6';
DECLARE @Data TABLE
(
[ID] INT NOT NULL PRIMARY KEY,
[Runners] VARCHAR(50) NULL
);
INSERT INTO @Data ([ID], [Runners]) VALUES (1, '16+7+9+6');
INSERT INTO @Data ([ID], [Runners]) VALUES (2, '16+7+9+5');
INSERT INTO @Data ([ID], [Runners]) VALUES (3, '26+77+9+5');
INSERT INTO @Data ([ID], [Runners]) VALUES (4, '6+3+45');
INSERT INTO @Data ([ID], [Runners]) VALUES (5, '63,808,111,92');
INSERT INTO @Data ([ID], [Runners]) VALUES (6, '1-7-9,6');
INSERT INTO @Data ([ID], [Runners]) VALUES (7, '1-6-9,7');
INSERT INTO @Data ([ID], [Runners]) VALUES (8, '1-7-9,63');
INSERT INTO @Data ([ID], [Runners]) VALUES (9, '1-63-9,7');
INSERT INTO @Data ([ID], [Runners]) VALUES (10, NULL);
INSERT INTO @Data ([ID], [Runners]) VALUES (11, '6');
SELECT tmp.[ID],
tmp.[Runners],
fnd.[RunnerFound]
FROM @Data tmp
CROSS APPLY dbo.IsRunnerPresent(@Runner, tmp.[Runners]) fnd;
With @Runner = '6', this returns one row per ID, with RunnerFound = 1 for IDs 1, 4, 6, 7, and 11 and RunnerFound = 0 for the others (including the NULL row), consistent with the EXEC #TestFindRunner 6 test above.