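I'm trying to find the index of the second occurrence of a substring within a string using Google BigQuery.
For example, in the string 'challcha', the second occurrence of 'ch' is at position 6.
I know this can be done in Oracle using CharIndex. I'm trying to do the same in Google BigQuery.
Any help is appreciated!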
For BigQuery, using pure SQL string functions (legacy SQL):
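SELECT test,
  -- position of the first 'ch', plus the offset of the next 'ch' found after it
  INSTR(test, 'ch') + 1 + INSTR(SUBSTR(test, INSTR(test, 'ch') + 2), 'ch') AS pos
FROM
  -- legacy SQL: comma-separated subqueries in FROM are combined as UNION ALL
  (SELECT 'challcha' AS test),
  (SELECT 'chcha' AS test),
  (SELECT 'chha' AS test)
WHERE
  -- keep only rows that contain a second 'ch'
  INSTR(SUBSTR(test, INSTR(test, 'ch') + 2), 'ch') > 0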
Note: INSTR is case-sensitive, so if your data is mixed-case you may want to wrap everything in LOWER or UPPER.
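To illustrate the note above, here is a minimal JavaScript sketch (plain JavaScript, not BigQuery code) of the same "find the first match, then search the remainder" idea with both sides lowercased first. The function name `secondIndexIgnoreCase` is hypothetical, and the sketch assumes text whose length does not change when lowercased (plain ASCII, for instance).

```javascript
// Hypothetical helper: 1-based position of the second occurrence of `search`
// in `text`, ignoring case. Returns 0 when there is no second occurrence.
// Assumption: lowercasing does not change the string length (e.g. ASCII text).
function secondIndexIgnoreCase(text, search) {
  const haystack = text.toLowerCase();
  const needle = search.toLowerCase();
  if (needle.length === 0) return 0;

  const first = haystack.indexOf(needle); // 0-based, -1 if absent
  if (first === -1) return 0;

  // Resume just past the first match, like SUBSTR(test, INSTR(test, 'ch') + 2) for 'ch'.
  const second = haystack.indexOf(needle, first + needle.length);
  return second === -1 ? 0 : second + 1;
}

console.log(secondIndexIgnoreCase('ChallCha', 'ch'));
```

Lowercasing both the string and the search term plays the same role as wrapping both INSTR arguments in LOWER.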
Using a JavaScript UDF (legacy SQL JS() syntax):
SELECT test, pos FROM JS(
  (
    SELECT test FROM
      (SELECT 'challcha' AS test),
      (SELECT 'chcha' AS test),
      (SELECT 'chha' AS test)
  ),
  test,
  "[{name: 'test', type: 'string'},
    {name: 'pos', type: 'integer'}
  ]
  ",
  "function(r, emit) {
    var search = 'ch';
    // 1-based position of the first match (0 if absent)
    var pos1 = r.test.indexOf(search) + 1;
    // 1-based position of the next match, searching from just past the start of the first
    var pos2 = r.test.indexOf(search, pos1) + 1;
    if (pos1 * pos2 == 0) pos2 = 0;
    emit({test: r.test, pos: pos2});
  }"
)
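The body of that UDF generalizes naturally from the second occurrence to the n-th. A rough standalone sketch follows; `nthIndexOf` is a hypothetical name, and this is plain JavaScript you would still need to wrap in a UDF yourself.

```javascript
// Hypothetical helper: 1-based position of the n-th occurrence of `search`
// in `text`, or 0 when there are fewer than n occurrences.
function nthIndexOf(text, search, n) {
  if (search.length === 0 || n < 1) return 0;

  let from = 0;  // 0-based index where the next search starts
  let idx = -1;  // 0-based index of the most recent match
  for (let i = 0; i < n; i++) {
    idx = text.indexOf(search, from);
    if (idx === -1) return 0;
    // Advance by one character, as the UDF does, so overlapping matches are counted.
    from = idx + 1;
  }
  return idx + 1;
}

console.log(nthIndexOf('challcha', 'ch', 2));
```

If overlapping matches should not count, advancing with `from = idx + search.length` instead would match the behavior of the INSTR/SUBSTR version.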
Using pure BigQuery regular expression functions:
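SELECT test,
  -- length of the text before the first 'ch', plus 3,
  -- plus length of the text between the first and second 'ch'; (?i) ignores case
  LENGTH(REGEXP_EXTRACT(test, r'(?i)(.*?)ch')) + 3 +
  LENGTH(REGEXP_EXTRACT(REGEXP_EXTRACT(test, r'(?i)ch(.*)'), r'(?i)(.*?)ch')) AS pos
FROM
  (SELECT 'ChallCha' AS test),
  (SELECT 'abChallCha' AS test),
  (SELECT 'chcha' AS test),
  (SELECT 'chha' AS test)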
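For comparison, the same case-insensitive regex idea can be sketched in plain JavaScript by walking the matches and stopping at the second one. `secondMatchPosition` is a hypothetical name, and returning `null` for a missing second match is my assumption about how best to mirror a NULL result from the query above.

```javascript
// Hypothetical helper: 1-based position of the second case-insensitive match
// of `pattern` (a regular expression source string) in `text`.
// Returns null when there is no second match.
function secondMatchPosition(text, pattern) {
  // 'g' is required by matchAll; 'i' plays the role of (?i) in the SQL version.
  const re = new RegExp(pattern, 'gi');
  let seen = 0;
  for (const match of text.matchAll(re)) {
    seen += 1;
    if (seen === 2) return match.index + 1;
  }
  return null;
}

console.log(secondMatchPosition('abChallCha', 'ch'));
```

Because the matcher reports each match's index directly, there is no need to add up prefix lengths as the SQL expression does.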