Eve*_*ien 161 sql-server parsing alphanumeric user-defined-functions alphabetic
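How do you remove all non-alphabetic characters from a string?
What about non-alphanumeric characters?
Does this have to be a custom function, or are there more general solutions?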
G M*_*ros 349
Try this function:
Create Function [dbo].[RemoveNonAlphaCharacters](@Temp VarChar(1000))
Returns VarChar(1000)
AS
Begin
    Declare @KeepValues as varchar(50)
    Set @KeepValues = '%[^a-z]%'
    While PatIndex(@KeepValues, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')
    Return @Temp
End
Call it like this:
Select dbo.RemoveNonAlphaCharacters('abc1234def5678ghi90jkl')
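Once you understand the code, you should see that it is relatively simple to change it to remove other characters too. You could even make it dynamic enough to pass in your search pattern.
Hope this helps.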
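As a sketch of the kind of change described above, the variant below keeps digits as well as letters by widening the bracket expression. The function name RemoveNonAlphaNumericCharacters is made up for this illustration, and the behavior assumes a typical case-insensitive collation, under which [a-z] also matches uppercase letters.

```sql
-- Hypothetical variant of dbo.RemoveNonAlphaCharacters that also keeps digits.
CREATE FUNCTION dbo.RemoveNonAlphaNumericCharacters (@Temp varchar(1000))
RETURNS varchar(1000)
AS
BEGIN
    -- Match any single character that is NOT a letter or a digit.
    DECLARE @KeepValues varchar(50) = '%[^a-z0-9]%';

    -- Delete one offending character per iteration until none remain.
    WHILE PATINDEX(@KeepValues, @Temp) > 0
        SET @Temp = STUFF(@Temp, PATINDEX(@KeepValues, @Temp), 1, '');

    RETURN @Temp;
END
GO

-- The hyphen, the space and the exclamation mark are removed.
SELECT dbo.RemoveNonAlphaNumericCharacters('abc-123 def!') AS Cleaned;
```

Under a case-sensitive or binary collation the range a-z would cover lowercase letters only, so the pattern would need A-Z added explicitly.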
Eve*_*ien 160
CREATE FUNCTION [dbo].[fn_StripCharacters]
(
    @String NVARCHAR(MAX),
    @MatchExpression VARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    SET @MatchExpression = '%['+@MatchExpression+']%'
    WHILE PatIndex(@MatchExpression, @String) > 0
        SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
    RETURN @String
END
Alphabetic only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^a-z')
Numeric only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^0-9')
Alphanumeric only:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', '^a-z0-9')
Non-alphanumeric:
SELECT dbo.fn_StripCharacters('a1!s2@d3#f4$', 'a-z0-9')
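The calls above all use a literal; as a minimal sketch, the same function can be applied to every row of a column. The table variable @Samples and its contents are made up for illustration, and dbo.fn_StripCharacters is assumed to exist as defined above.

```sql
-- Made-up sample data; dbo.fn_StripCharacters is the function defined above.
DECLARE @Samples TABLE (Raw nvarchar(50));

INSERT INTO @Samples (Raw)
VALUES (N'a1!s2@d3#f4$'),
       (N'(555) 010-9999');

-- '^0-9' strips every character that is not a digit.
SELECT Raw,
       dbo.fn_StripCharacters(Raw, '^0-9') AS DigitsOnly
FROM @Samples;
```

Because this is a scalar function containing a loop, it runs once per row, so it may be slow on large tables.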
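I knew SQL was bad at string manipulation, but I didn't think it would be this difficult. Here is a simple function that strips all the digits out of a string. There are surely better ways to do this, but it is a start.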
CREATE FUNCTION dbo.AlphaOnly (
    @String varchar(100)
)
RETURNS varchar(100)
AS BEGIN
    RETURN (
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
        REPLACE(
            @String,
        '9', ''),
        '8', ''),
        '7', ''),
        '6', ''),
        '5', ''),
        '4', ''),
        '3', ''),
        '2', ''),
        '1', ''),
        '0', '')
    )
END
GO
-- ==================
DECLARE @t TABLE (
ColID int,
ColString varchar(50)
)
INSERT INTO @t VALUES (1, 'abc1234567890')
SELECT ColID, ColString, dbo.AlphaOnly(ColString)
FROM @t
Output
ColID ColString
----- ------------- ---
1     abc1234567890 abc
Round 2 - Data-driven blacklist
-- ============================================
-- Create a table of blacklist characters
-- ============================================
IF EXISTS (SELECT * FROM sys.tables WHERE [object_id] = OBJECT_ID('dbo.CharacterBlacklist'))
DROP TABLE dbo.CharacterBlacklist
GO
CREATE TABLE dbo.CharacterBlacklist (
CharID int IDENTITY,
DisallowedCharacter nchar(1) NOT NULL
)
GO
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'0')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'1')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'2')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'3')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'4')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'5')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'6')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'7')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'8')
INSERT INTO dbo.CharacterBlacklist (DisallowedCharacter) VALUES (N'9')
GO
-- ====================================
IF EXISTS (SELECT * FROM sys.objects WHERE [object_id] = OBJECT_ID('dbo.StripBlacklistCharacters'))
DROP FUNCTION dbo.StripBlacklistCharacters
GO
CREATE FUNCTION dbo.StripBlacklistCharacters (
    @String nvarchar(100)
)
RETURNS nvarchar(100)  -- matches the nvarchar input so Unicode characters are not lost
AS BEGIN
    DECLARE @blacklistCt int
    DECLARE @ct int
    DECLARE @c nchar(1)
    SELECT @blacklistCt = COUNT(*) FROM dbo.CharacterBlacklist
    SET @ct = 0
    -- Walks the blacklist by CharID; assumes the IDENTITY values run 1..N without gaps
    WHILE @ct < @blacklistCt BEGIN
        SET @ct = @ct + 1
        SELECT @String = REPLACE(@String, DisallowedCharacter, N'')
        FROM dbo.CharacterBlacklist
        WHERE CharID = @ct
    END
    RETURN (@String)
END
GO
-- ====================================
DECLARE @s nvarchar(24)
SET @s = N'abc1234def5678ghi90jkl'
SELECT
@s AS OriginalString,
dbo.StripBlacklistCharacters(@s) AS ResultString
Output
OriginalString           ResultString
------------------------ ------------
abc1234def5678ghi90jkl   abcdefghijkl
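My challenge to readers: can you make this more efficient? What about using recursion?
Original answer
Another possible option for SQL Server 2017+ (without loops and/or recursion) is a string-based approach using TRANSLATE() and REPLACE().
T-SQL statement: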
DECLARE @pattern varchar(52) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
SELECT
v.[Text],
REPLACE(
TRANSLATE(
v.[Text],
REPLACE(TRANSLATE(v.[Text], @pattern, REPLICATE('a', LEN(@pattern))), 'a', ''),
REPLICATE('0', LEN(REPLACE(TRANSLATE(v.[Text], @pattern, REPLICATE('a', LEN(@pattern))), 'a', '')))
),
'0',
''
) AS AlphabeticCharacters
FROM (VALUES
('abc1234def5678ghi90jkl#@$&'),
('1234567890'),
('JAHDBESBN%*#*@*($E*sd55bn')
) v ([Text])
Or as a function:
CREATE FUNCTION dbo.RemoveNonAlphabeticCharacters (@Text varchar(1000))
RETURNS varchar(1000)
AS BEGIN
    DECLARE @pattern varchar(52) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    SET @Text = REPLACE(
        TRANSLATE(
            @Text,
            REPLACE(TRANSLATE(@Text, @pattern, REPLICATE('a', LEN(@pattern))), 'a', ''),
            REPLICATE('0', LEN(REPLACE(TRANSLATE(@Text, @pattern, REPLICATE('a', LEN(@pattern))), 'a', '')))
        ),
        '0',
        ''
    )
    RETURN @Text
END
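Update
Thanks to @tttugates's comment, I found a small issue with this approach. The actual edge case is 'V 0 S' as input. The LEN() function ignores trailing spaces, which leads to the error message "The second and third arguments of the TRANSLATE built-in function must contain an equal number of characters". The solution is a small change to the length calculation; the approach itself stays the same:
- All alphabetic characters are translated to 'a' and then replaced with ''.
- All remaining characters are translated to '0' and then replaced with ''.
- The length is calculated with LEN(REPLACE(@text, ' ', '.')).
Updated function: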
CREATE FUNCTION dbo.RemoveNonAlphabeticCharacters (@Text varchar(1000))
RETURNS varchar(1000)
AS BEGIN
    DECLARE @pattern varchar(52) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    DECLARE @translations varchar(1000)
    SET @translations = TRANSLATE(@Text, @pattern, REPLICATE('a', LEN(@pattern)))
    SET @translations = REPLACE(@translations, 'a', '')
    SET @Text = TRANSLATE(@Text, @translations, REPLICATE('0', LEN(REPLACE(@translations, ' ', '.'))))
    SET @Text = REPLACE(@Text, '0', '')
    RETURN @Text
END
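The length fix can be checked in isolation. The short sketch below uses ' 0 ', which is what remains of 'V 0 S' once the letters are removed, and compares the three ways of measuring it. DATALENGTH is shown only for comparison; it counts bytes, so it equals the character count here only because the value is a single-byte varchar.

```sql
-- ' 0 ' is what is left of 'V 0 S' after the letters are stripped out.
DECLARE @translations varchar(1000) = ' 0 ';

SELECT
    LEN(@translations)                    AS LenIgnoresTrailingSpace,  -- 2
    DATALENGTH(@translations)             AS BytesStored,              -- 3
    LEN(REPLACE(@translations, ' ', '.')) AS LenAfterReplace;          -- 3
```

With the plain LEN() result of 2, REPLICATE('0', 2) is one character shorter than the three-character second argument, which is what triggers the TRANSLATE error.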
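After looking at all the given solutions, I thought there had to be a pure SQL method that does not require a function or a CTE / XML query, and does not involve hard-to-maintain nested REPLACE statements. Here is my solution: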
SELECT
x
,CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 1, 1) + '%' THEN '' ELSE SUBSTRING(x, 1, 1) END
+ CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 2, 1) + '%' THEN '' ELSE SUBSTRING(x, 2, 1) END
+ CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 3, 1) + '%' THEN '' ELSE SUBSTRING(x, 3, 1) END
+ CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 4, 1) + '%' THEN '' ELSE SUBSTRING(x, 4, 1) END
+ CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 5, 1) + '%' THEN '' ELSE SUBSTRING(x, 5, 1) END
+ CASE WHEN a NOT LIKE '%' + SUBSTRING(x, 6, 1) + '%' THEN '' ELSE SUBSTRING(x, 6, 1) END
-- Keep adding rows until you reach the column size
AS stripped_column
FROM (SELECT
column_to_strip AS x
,'ABCDEFGHIJKLMNOPQRSTUVWXYZ' AS a
FROM my_table) a
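The advantage of doing it this way is that the valid characters are contained in a single string in the subquery, which makes it easy to reconfigure for a different set of characters.
The downside is that you have to add one line of SQL per character, up to the size of your column. To make that task easier I used the PowerShell script below; this example is for a VARCHAR(64):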
1..64 | % {
" + CASE WHEN a NOT LIKE '%' + SUBSTRING(x, {0}, 1) + '%' THEN '' ELSE SUBSTRING(x, {0}, 1) END" -f $_
} | clip.exe
小智 5
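Here is a solution that does not require creating a function or listing every instance of the characters to replace. It uses a recursive WITH statement in combination with PATINDEX to find unwanted characters. It replaces all unwanted characters in a column, up to 100 unique bad characters in any given string. (For example, "ABC123DEF234" contains 4 bad characters: 1, 2, 3 and 4.) The limit of 100 is the default maximum number of recursions allowed in a WITH statement; it can be changed with the MAXRECURSION query hint. It places no limit on the number of rows processed, which is bounded only by available memory.
If you do not want DISTINCT results, you can remove the two DISTINCT keywords from the code.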
-- Create some test data:
SELECT * INTO #testData
FROM (VALUES ('ABC DEF,K.l(p)'),('123H,J,234'),('ABCD EFG')) as t(TXT)
-- Actual query:
-- Remove non-alpha chars: '%[^A-Z]%'
-- Remove non-alphanumeric chars: '%[^A-Z0-9]%'
DECLARE @BadCharacterPattern VARCHAR(250) = '%[^A-Z]%';
WITH recurMain as (
    SELECT DISTINCT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
    FROM #testData
    UNION ALL
    SELECT CAST(TXT AS VARCHAR(250)) AS TXT, PATINDEX(@BadCharacterPattern, TXT) AS BadCharIndex
    FROM (
        SELECT
            CASE WHEN BadCharIndex > 0
                THEN REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), '')
                ELSE TXT
            END AS TXT
        FROM recurMain
        WHERE BadCharIndex > 0
    ) badCharFinder
)
SELECT DISTINCT TXT
FROM recurMain
WHERE BadCharIndex = 0;
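SQL Server stops a recursive WITH statement after 100 levels by default, and the OPTION (MAXRECURSION n) query hint changes that limit. The sketch below is a cut-down version of the query above that runs on a single made-up literal instead of #testData and repeats the REPLACE expression rather than using a derived table; the value 500 is an arbitrary choice for illustration.

```sql
DECLARE @BadCharacterPattern varchar(250) = '%[^A-Z]%';

WITH recurMain AS (
    -- Anchor: one made-up test value and the position of its first bad character.
    SELECT CAST('A1B2C3' AS varchar(250)) AS TXT,
           PATINDEX(@BadCharacterPattern, 'A1B2C3') AS BadCharIndex
    UNION ALL
    -- Recursive step: remove every occurrence of the bad character just found,
    -- then look for the next one in the shortened string.
    SELECT CAST(REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), '') AS varchar(250)),
           PATINDEX(@BadCharacterPattern,
                    REPLACE(TXT, SUBSTRING(TXT, BadCharIndex, 1), ''))
    FROM recurMain
    WHERE BadCharIndex > 0
)
SELECT TXT
FROM recurMain
WHERE BadCharIndex = 0
OPTION (MAXRECURSION 500);  -- raise the default limit of 100 levels
```

Each recursion level removes one distinct bad character, so the limit only matters for strings with more distinct unwanted characters than the configured number of levels.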
Believe it or not, on my system this ugly function performs better than G Mastros's elegant one.
CREATE FUNCTION dbo.RemoveSpecialChar (@s VARCHAR(256))
RETURNS VARCHAR(256)
WITH SCHEMABINDING
AS
BEGIN
    IF @s IS NULL
        RETURN NULL
    DECLARE @s2 VARCHAR(256) = '',
            @l INT = LEN(@s),
            @p INT = 1
    WHILE @p <= @l
    BEGIN
        DECLARE @c INT
        SET @c = ASCII(SUBSTRING(@s, @p, 1))
        -- Keep only 0-9 (48-57), A-Z (65-90) and a-z (97-122)
        IF @c BETWEEN 48 AND 57
           OR @c BETWEEN 65 AND 90
           OR @c BETWEEN 97 AND 122
            SET @s2 = @s2 + CHAR(@c)
        SET @p = @p + 1
    END
    IF LEN(@s2) = 0
        RETURN NULL
    RETURN @s2
END
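A short usage sketch, assuming dbo.RemoveSpecialChar has been created as a complete function (with its closing END). The input strings are made up; they are chosen to exercise the three return paths in the code: a cleaned string, NULL when no character survives, and NULL for NULL input.

```sql
SELECT dbo.RemoveSpecialChar('abc-123 DEF!') AS Cleaned,      -- 'abc123DEF': only letters and digits are kept
       dbo.RemoveSpecialChar('!@#$')         AS NothingLeft,  -- NULL: the result would be empty
       dbo.RemoveSpecialChar(NULL)           AS NullInput;    -- NULL: NULL input is passed through
```

Unlike the PATINDEX-based functions, this one compares ASCII codes directly, so its result does not depend on the collation, and letters outside A-Z and a-z are always removed.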