REGEXP_COUNT in Postgres

May*_*ive 3 postgresql postgresql-9.3

We are migrating from Oracle to Postgres.

This is the SQL I use to extract data from the employee_name column for reporting.

But now I am not sure how to do the REGEXP_COUNT part. Oracle SQL:

with A4 as 
(
select 'govinda j/INDIA_MH/9975215025' as employee_name from dual
)
select employee_name , 
TRIM(SUBSTR(upper(A4.employee_name),1,INSTR(A4.employee_name,'/',1,1)-1)) AS employee_name1,
  TRIM(SUBSTR(upper(A4.employee_name),INSTR(A4.employee_name,'/',1,1)+1,INSTR(A4.employee_name,'_',1,1)-INSTR(A4.employee_name,'/',1,1)-1)) AS Country,
  TRIM(SUBSTR(upper(A4.employee_name),INSTR(A4.employee_name,'_',1,1)+1,INSTR(A4.employee_name,'/',1,2)-INSTR(A4.employee_name,'_',1,1)-1)) AS STATE,
  CASE WHEN REGEXP_COUNT(A4.employee_name,'_')>1 THEN 'WRONG_NAME>1_'
       WHEN REGEXP_COUNT(A4.employee_name,'/')>2 THEN 'WRONG_NAME>2/'
       WHEN TRIM(SUBSTR(upper(A4.employee_name),INSTR(A4.employee_name,'/',1,1)+1,INSTR(A4.employee_name,'_',1,1)-INSTR(A4.employee_name,'/',1,1)-1))NOT IN
         ('INDIA','NEPAL') THEN 'WRONG_COUNTRY'
       ELSE 'CORRECT' END AS VALIDATION

       from A4

In Postgres, I was able to convert part of it to the following.

with A4 as 
(
select 'govinda j/INDIA_MH/9975215025'::text as employee_name
)
select employee_name,
       split_part(employee_name, '/', 1) as employee_name1,
       split_part(split_part(employee_name, '/', 2), '_', 1) as country,
       split_part(split_part(employee_name, '/', 2), '_', 2) as state
from A4

I could not convert the validation part. Any help is highly appreciated, as we are very new to Postgres.

kli*_*lin 8

You can create a custom function:

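-- Counts how often the single character $2 occurs in $1:
-- the difference in length before and after removing it.
create or replace function number_of_chars(text, text)
returns integer language sql immutable as $$
    select length($1) - length(replace($1, $2, ''))
$$;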

Usage:

with example(str) as (
values 
    ('a_b_c'),
    ('a___b'),
    ('abc')
)

select str, number_of_chars(str, '_') as count
from example

  str  | count  
-------+-------
 a_b_c |     2
 a___b |     3
 abc   |     0
(3 rows)

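Note that the function above only counts occurrences of a single character in a string; it does not use regular expressions, which are usually more expensive.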
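To illustrate the single-character limitation: with a multi-character search string, number_of_chars returns the number of characters removed rather than the number of occurrences. A minimal sketch of a plain-substring variant follows. The name number_of_substrings is hypothetical (it is not a Postgres built-in), and the sketch assumes number_of_chars from above already exists:

```sql
-- Hypothetical helper (illustrative name): counts non-overlapping
-- occurrences of a plain substring, without regular expressions.
create or replace function number_of_substrings(text, text)
returns integer language sql immutable as $$
    -- nullif() yields NULL instead of a division-by-zero error
    -- when the search string is empty
    select (length($1) - length(replace($1, $2, ''))) / nullif(length($2), 0)
$$;

-- Compare both functions on a two-character search string
select number_of_chars('a___b', '__')      as chars_removed,
       number_of_substrings('a___b', '__') as occurrences;
```

For 'a___b' and '__', replace() removes one two-character occurrence, so the first column is 2 and the second is 1.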

A Postgres equivalent of regexp_count() might look like this:

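-- Counts non-overlapping matches of the regular expression $2 in $1;
-- the 'g' flag makes regexp_matches() return one row per match.
create or replace function regexp_count(text, text)
returns integer language sql as $$
    select count(m)::int
    from regexp_matches($1, $2, 'g') m
$$;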

with example(str) as (
values 
    ('a_b_c'),
    ('a___b'),
    ('abc')
)

select str, regexp_count(str, '_') as single, regexp_count(str, '__') as double
from example

  str  | single | double 
-------+--------+--------
 a_b_c |      2 |      0
 a___b |      3 |      1
 abc   |      0 |      0
(3 rows)
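Putting this together for the question, the validation part could be written roughly as below. This is a sketch, not a verified drop-in replacement. It assumes the regexp_count function defined above has been created (recent Postgres versions, 15 and later as far as I know, ship a built-in regexp_count, so the custom function may be unnecessary there). It also applies upper(trim(...)) to the country to mirror the Oracle query, which the split_part version in the question does not do:

```sql
with A4 as
(
select 'govinda j/INDIA_MH/9975215025'::text as employee_name
)
select employee_name,
       split_part(employee_name, '/', 1) as employee_name1,
       split_part(split_part(employee_name, '/', 2), '_', 1) as country,
       split_part(split_part(employee_name, '/', 2), '_', 2) as state,
       -- same checks, in the same order, as the Oracle CASE expression
       case when regexp_count(employee_name, '_') > 1 then 'WRONG_NAME>1_'
            when regexp_count(employee_name, '/') > 2 then 'WRONG_NAME>2/'
            when upper(trim(split_part(split_part(employee_name, '/', 2), '_', 1)))
                 not in ('INDIA', 'NEPAL') then 'WRONG_COUNTRY'
            else 'CORRECT' end as validation
from A4;
```

Since both patterns here are single literal characters, number_of_chars(employee_name, '_') and number_of_chars(employee_name, '/') would work in place of regexp_count. Note that split_part takes the country from the second '/'-separated segment, while the Oracle query locates the first '_' in the whole string, so the two can differ on malformed names.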