Regex to check for and remove characters other than letters

use*_*653 -1 regex sql google-bigquery

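I need a regular expression that checks DebugData and, if it contains any digits or special characters other than the letters [a-zA-Z], replaces those characters with a space. The tables live in Google BigQuery, and I am querying them from an IPython notebook.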

Examples: when DebugData is Movist2, ActualCarrier should be Movist; when DebugData is LAO GS2, ActualCarrier should be LAOGS2; when DebugData is CLARO"3, ActualCarrier should be CLARO.

SELECT 
Id, e.Carrier as AssignedCarrier, 
CASE
 WHEN lower(DebugData) LIKE 'jasp%' THEN 'Jasper' 
 WHEN lower(DebugData) LIKE 'telu%' THEN 'Telus'
 WHEN REGEXP_MATCH(DebugData,'\\w+\\d+') THEN DebugData
 WHEN REGEXP_MATCH(lower(DebugData),'\\d+') THEN c.Network
END
AS ActualCarrier
FROM debug_table

This is the clause I added:

ELSE REGEXP_REPLACE(lower(DebugData),'\\[^a-zA-Z]',' ')

I still get this output:

HardwareId  DebugData   ActualCarrier   count
550466188   CLARO"3      None            5

Mik*_*ant 5

Try:

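-- BigQuery legacy SQL: a comma between subqueries in FROM acts as UNION ALL
SELECT
  DebugData, 
  -- replace every character that is not a letter with a space
  REGEXP_REPLACE(DebugData, r'[^a-zA-Z]', ' ') AS ActualCarrier 
FROM
  (SELECT 'Movist2' AS DebugData),
  (SELECT 'LAO GS2' AS DebugData),
  (SELECT 'CLARO"3' AS DebugData)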
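To see why the pattern in the question behaves differently from the one above, here is a minimal Python sketch. It assumes the question's string literal '\\[^a-zA-Z]' reaches the regex engine as \[^a-zA-Z], so the backslash turns the opening bracket into a literal character and no character class is formed. Python's re module stands in for BigQuery's regex engine here, and replace_non_letters is a hypothetical helper name, so treat this as an illustration rather than exact BigQuery behavior.

```python
import re

samples = ['Movist2', 'LAO GS2', 'CLARO"3']

# Assumed form of the question's pattern once the string escape is resolved:
# "\[" is a literal bracket, so this is not a negated character class.
escaped = r'\[^a-zA-Z]'
# Pattern from the answer: a negated character class (anything but a letter).
negated = r'[^a-zA-Z]'


def replace_non_letters(pattern, text):
    """Replace every match of `pattern` in `text` with a single space."""
    return re.sub(pattern, ' ', text)


for s in samples:
    # Print the input, the result with the escaped pattern,
    # and the result with the negated character class.
    print(repr(s),
          repr(replace_non_letters(escaped, s)),
          repr(replace_non_letters(negated, s)))
```

With the escaped pattern the samples come back unchanged, while the negated class turns each digit, quote, or space into a space (so Movist2 becomes Movist followed by a trailing space).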

Added to address the follow-up comments/questions:

SELECT
  DebugData, 
  CASE
    WHEN REGEXP_MATCH(LOWER(DebugData), r'^\d+$') THEN Network 
    ELSE REGEXP_REPLACE(LOWER(DebugData), r'[^a-zA-Z]', ' ') 
  END AS ActualCarrier
FROM
  (SELECT '123' AS DebugData, 'aaa' AS Network),
  (SELECT 'Movist2' AS DebugData, 'bbb' AS Network),
  (SELECT '456' AS DebugData, 'ccc' AS Network),
  (SELECT 'LAO GS2' AS DebugData, 'ddd' AS Network),
  (SELECT 'CLARO"3' AS DebugData, 'eee' AS Network),
  (SELECT 'Test' AS DebugData, 'fff' AS Network)
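The CASE logic can be sketched in Python to show why anchoring the digit check matters. A CASE expression returns the first branch whose condition holds, and an unanchored \d+ matches any value that merely contains a digit, so a value such as CLARO"3 would take the Network branch and never reach the ELSE. The function names below are hypothetical, Python's re module stands in for BigQuery's regex engine, and the None network value is an assumption meant to mirror the output in the question.

```python
import re


def actual_carrier(debug_data, network):
    """Mirror the CASE above: all-digit values fall back to Network."""
    lowered = debug_data.lower()
    if re.search(r'^\d+$', lowered):   # anchored: the whole value is digits
        return network
    return re.sub(r'[^a-zA-Z]', ' ', lowered)


def actual_carrier_unanchored(debug_data, network):
    """Same logic with an unanchored digit check, as in the question."""
    lowered = debug_data.lower()
    if re.search(r'\d+', lowered):     # true for any value containing a digit
        return network
    return re.sub(r'[^a-zA-Z]', ' ', lowered)


rows = [('123', 'aaa'), ('Movist2', 'bbb'), ('456', 'ccc'),
        ('LAO GS2', 'ddd'), ('CLARO"3', 'eee'), ('Test', 'fff')]

for debug_data, network in rows:
    print(repr(debug_data), repr(actual_carrier(debug_data, network)))

# With the unanchored check and a missing Network (assumed None here),
# the ELSE branch is never reached for this value.
print(repr(actual_carrier_unanchored('CLARO"3', None)))
```

Because the replacement is a space, cleaned values keep trailing or doubled spaces. If the goal is to drop the characters entirely, replacing with an empty string instead of ' ' is one option.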