How to best split csv strings in Oracle 9i

Joy*_*yce 8 csv oracle tokenize

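I want to be able to split csv strings in Oracle 9i.

I have read the following article: http://www.oappssurd.com/2009/03/string-split-in-oracle.html

But I do not understand how to make it work. Here are some of my questions:

  1. Will this work in Oracle 9i, and if not, why not?
  2. Is there a better way to split a csv string than the solution presented above?
  3. Do I need to create a new type? If so, do I need specific privileges for that?
  4. Can I declare the type inside the function?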

Rob*_*ijk 16

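Joyce,

Here are three examples:

1) Using dbms_utility.comma_to_table. This is not a general-purpose routine, because the elements have to be valid identifiers. With a dirty trick (prefixing every element so it becomes a valid identifier, then stripping the prefix again) we can make it more generic: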

SQL> declare
  2    cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  3    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  4    l_tablen binary_integer;
  5    l_tab    dbms_utility.uncl_array;
  6  begin
  7    dbms_utility.comma_to_table
  8    ( list   => cn_non_occuring_prefix || replace(mystring,':',','||cn_non_occuring_prefix)
  9    , tablen => l_tablen
 10    , tab    => l_tab
 11    );
 12    for i in 1..l_tablen
 13    loop
 14      dbms_output.put_line(substr(l_tab(i),1+length(cn_non_occuring_prefix)));
 15    end loop;
 16  end;
 17  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.
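To see why the prefix trick is needed, here is a minimal sketch (not part of the original answer) that calls dbms_utility.comma_to_table without the prefix. The sample list 'a,sd,31456' is an assumption for illustration. Because 31456 is not a valid identifier, the call is expected to raise an error, and the handler simply prints whatever message comes back.

```sql
-- Sketch: comma_to_table without the prefix workaround.
-- The list below is a hypothetical example; run with "set serveroutput on".
declare
  l_tablen binary_integer;
  l_tab    dbms_utility.uncl_array;
begin
  dbms_utility.comma_to_table
  ( list   => 'a,sd,31456'   -- 31456 is not a valid identifier
  , tablen => l_tablen
  , tab    => l_tab
  );
  -- Only reached if every element was accepted as an identifier.
  dbms_output.put_line('parsed ' || l_tablen || ' elements');
exception
  when others then
    -- Print the error instead of assuming a specific ORA code.
    dbms_output.put_line('comma_to_table rejected the list: ' || sqlerrm);
end;
/
```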

2) Using SQL's connect by level. If you are on 10g or higher, you can combine the connect-by-level approach with regular expressions, like this:

SQL> declare
  2    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  3  begin
  4    for r in
  5    ( select regexp_substr(mystring,'[^:]+',1,level) element
  6        from dual
  7     connect by level <= length(regexp_replace(mystring,'[^:]+')) + 1
  8    )
  9    loop
 10      dbms_output.put_line(r.element);
 11    end loop;
 12  end;
 13  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

3) Again using SQL's connect by level, but now combined with good old SUBSTR/INSTR, in case you are on version 9, as you are:

SQL> declare
  2    mystring varchar2(2000):='a:sd:dfg:31456:dasd: :sdfsdf'; -- just an example
  3  begin
  4    for r in
  5    ( select substr
  6             ( str
  7             , instr(str,':',1,level) + 1
  8             , instr(str,':',1,level+1) - instr(str,':',1,level) - 1
  9             ) element
 10        from (select ':' || mystring || ':' str from dual)
 11     connect by level <= length(str) - length(replace(str,':')) - 1
 12    )
 13    loop
 14      dbms_output.put_line(r.element);
 15    end loop;
 16  end;
 17  /
a
sd
dfg
31456
dasd

sdfsdf

PL/SQL procedure successfully completed.

You can find more techniques like these in this blog post: http://rwijk.blogspot.com/2007/11/interval-based-row-generation.html

Hope this helps.

Regards, Rob.


To address your comment:

Here is an example of inserting the separated values into a normalized table.

First create the tables:

SQL> create table csv_table (col)
  2  as
  3  select 'a,sd,dfg,31456,dasd,,sdfsdf' from dual union all
  4  select 'a,bb,ccc,dddd' from dual union all
  5  select 'zz,yy,' from dual
  6  /

Table created.

SQL> create table normalized_table (value varchar2(10))
  2  /

Table created.

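Because you seem interested in the dbms_utility.comma_to_table method, I mention it here. However, I certainly do not recommend this variant, because of the identifier quirks and because of the slow row-by-row processing.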

SQL> declare
  2    cn_non_occuring_prefix constant varchar2(4) := 'zzzz';
  3    l_tablen binary_integer;
  4    l_tab    dbms_utility.uncl_array;
  5  begin
  6    for r in (select col from csv_table)
  7    loop
  8      dbms_utility.comma_to_table
  9      ( list   => cn_non_occuring_prefix || replace(r.col,',',','||cn_non_occuring_prefix)
 10      , tablen => l_tablen
 11      , tab    => l_tab
 12      );
 13      forall i in 1..l_tablen
 14        insert into normalized_table (value)
 15        values (substr(l_tab(i),length(cn_non_occuring_prefix)+1))
 16      ;
 17    end loop;
 18  end;
 19  /

PL/SQL procedure successfully completed.

SQL> select * from normalized_table
  2  /

VALUE
----------
a
sd
dfg
31456
dasd

sdfsdf
a
bb
ccc
dddd
zz
yy


14 rows selected.

I recommend this single-SQL variant:

SQL> truncate table normalized_table
  2  /

Table truncated.

SQL> insert into normalized_table (value)
  2   select substr
  3          ( col
  4          , instr(col,',',1,l) + 1
  5          , instr(col,',',1,l+1) - instr(col,',',1,l) - 1
  6          )
  7     from ( select ',' || col || ',' col from csv_table )
  8        , ( select level l from dual connect by level <= 100 )
  9    where l <= length(col) - length(replace(col,',')) - 1
 10  /

14 rows created.

SQL> select * from normalized_table
  2  /

VALUE
----------
a
a
zz
sd
bb
yy
dfg
ccc

31456
dddd
dasd

sdfsdf

14 rows selected.
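Note that the row generator in the single-SQL variant is capped at 100, so a string with more than 100 elements would be silently truncated. As a rough sanity check (a sketch, not part of the original answer), you could look at the largest element count in your data before picking the cap. The alias max_elements is just an illustrative name.

```sql
-- Sketch: largest number of elements in any csv_table row.
-- Elements = number of commas + 1; nvl covers rows consisting only of commas,
-- where replace() returns null.
select max(length(col) - nvl(length(replace(col, ',')), 0) + 1) as max_elements
  from csv_table;
```

For the three sample rows above this returns 7, well below the cap of 100.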

Regards, Rob.


Mic*_*aer 8

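Here is a string tokenizer for Oracle that is a little more straightforward than the one on that page, though I do not know whether it is as fast: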

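-- Count the delimiters in str: each delimiter grows by one character
-- once a space is appended to it, so the length difference is the count.
create or replace function splitter_count(str in varchar2, delim in char) return int as
  val int;
begin
  val := length(replace(str, delim, delim || ' '));
  return val - length(str);
end;
/

-- Holds at most 100 tokens of at most 200 characters each.
create type token_list is varray(100) of varchar2(200);
/

create or replace function tokenize (str varchar2, delim char) return token_list as
  ret        token_list;
  target     int;  -- number of delimiters in str
  i          int;
  this_delim int;  -- position of the current delimiter
  last_delim int;  -- position of the previous delimiter
begin
  ret := token_list();
  i := 1;
  last_delim := 0;
  target := splitter_count(str, delim);
  -- One token in front of each delimiter.
  while i <= target
  loop
    ret.extend();
    this_delim := instr(str, delim, 1, i);
    ret(i) := substr(str, last_delim + 1, this_delim - last_delim - 1);
    i := i + 1;
    last_delim := this_delim;
  end loop;
  -- The final token runs from the last delimiter to the end of the string.
  ret.extend();
  ret(i) := substr(str, last_delim + 1);
  return ret;
end;
/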

You can use it like this:

select tokenize('hi you person', ' ') from dual;
VARCHAR(hi,you,person)
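If you want one row per token rather than a single collection value, a possible approach (a sketch, assuming the token_list type and tokenize function above have been created in your schema) is to wrap the call in TABLE(). The explicit CAST is the idiom commonly used on older releases such as 9i. The alias token is just an illustrative name.

```sql
-- Sketch: expand the varray returned by tokenize into rows.
-- Assumes token_list and tokenize from above exist; creating the type
-- requires the CREATE TYPE privilege.
select column_value as token
  from table(cast(tokenize('hi you person', ' ') as token_list));
```

This should return one row per token, which also makes it possible to join the result or feed it into an insert ... select.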