Rob*_*dez 0 sql oracle plsql pipelined-function oracle19c
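I have a problem with an Oracle pipelined function, and I would really like to understand what is happening. My Oracle database is version 19c, running on Red Hat 7.2 with AL32UTF8 as the character set.
Let me explain the scenario.

I have the following setup of two types and one pipelined function. It generates files using parallel processes, which greatly speeds up the creation of large files.

The two types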

--
-- DUMP_PARALLEL_OBJECT (Type)
--
CREATE OR REPLACE TYPE CPL_DATA_OUT.dump_parallel_object AS OBJECT
(file_name VARCHAR2 (128), no_records NUMBER, seq_id NUMBER);
/

--
-- DUMP_PARALLEL_OBJECT_NTT (Type)
--
CREATE OR REPLACE TYPE CPL_DATA_OUT.dump_parallel_object_ntt AS TABLE OF cpl_data_out.dump_parallel_object;
/

The pipelined function

This is the pipelined function. It produces the output files in chunks, which I can then join on Linux using cat.
CREATE OR REPLACE function CPL_DATA_OUT.fn_generate_parallel_file
(
    p_source    IN SYS_REFCURSOR,
    p_filename  IN VARCHAR2,
    p_directory IN VARCHAR2,
    p_extension IN VARCHAR2 DEFAULT 'csv',
    p_limit     IN NUMBER DEFAULT 10000
) return dump_parallel_object_ntt
pipelined
parallel_enable (partition p_source by any)
as
    type row_ntt is table of varchar2(32767);
    v_rows    row_ntt;
    v_file    UTL_FILE.FILE_TYPE;
    v_buffer  VARCHAR2(32767);
    v_sid     NUMBER;
    v_name    VARCHAR2(128);
    v_lines   PLS_INTEGER := 0;
    c_eol     CONSTANT VARCHAR2(1) := CHR(10);
    c_eollen  CONSTANT PLS_INTEGER := LENGTH(c_eol);
    c_maxline CONSTANT PLS_INTEGER := 32767;
begin

    SELECT generate_random_number.nextval INTO v_sid FROM dual;
    v_name := p_filename || '_' || TO_CHAR(v_sid) || '.' || p_extension;
    v_file := UTL_FILE.FOPEN(p_directory, v_name, 'w', 32767);

    LOOP
        FETCH p_source BULK COLLECT INTO v_rows LIMIT p_limit;

        FOR i IN 1 .. v_rows.COUNT LOOP

            IF LENGTH(v_buffer) + c_eollen + LENGTH(v_rows(i)) <= c_maxline THEN
                v_buffer := v_buffer || c_eol || v_rows(i);
            ELSE
                IF v_buffer IS NOT NULL THEN
                    UTL_FILE.PUT_LINE(v_file, v_buffer);
                END IF;
                v_buffer := v_rows(i);
            END IF;

        END LOOP;

        v_lines := v_lines + v_rows.COUNT;

        EXIT WHEN p_source%NOTFOUND;
    END LOOP;
    CLOSE p_source;

    UTL_FILE.PUT_LINE(v_file, v_buffer);
    UTL_FILE.FCLOSE(v_file);

    PIPE ROW (dump_parallel_object(v_name, v_lines, v_sid));
    RETURN;

END fn_generate_parallel_file;
/

I am using a sequence inside the function to assign a unique number to each of these files.

Let's test the scenario

The table in question

SQL> desc SRD_OUT.FCT_EMPROLE_TRANSFORM
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 DAT_MONTH                                          DATE
 PERSNR                                             VARCHAR2(6 CHAR)
 ARBEITS_STATUS                                     VARCHAR2(50 CHAR)
 NAME                                               VARCHAR2(50 CHAR)
 VORNAME                                            VARCHAR2(50 CHAR)
 FTE                                                NUMBER
 WOCHENSTUNDEN                                      NUMBER
 FUNKTION                                           VARCHAR2(50 CHAR)
 OE                                                 VARCHAR2(70 CHAR)
 DIREKTION                                          VARCHAR2(50 CHAR)
 BEREICH                                            VARCHAR2(50 CHAR)
 N_NUMMER                                           VARCHAR2(50 CHAR)
 FTE_VALUE                                          NUMBER
 CENTERKEY                                          VARCHAR2(200 CHAR)
 ROLLE                                              VARCHAR2(200 CHAR)
 BEMESSUNGSFAKTOR                                   VARCHAR2(50 CHAR)
 COD_PROCESS                                        VARCHAR2(30 CHAR)
 DAT_EFFECTIVE                                      DATE

SQL> select count(*) from SRD_OUT.FCT_EMPROLE_TRANSFORM ;

  COUNT(*)
----------
     20436

If I run the function against dba_objects, everything works as expected.
SQL> COL FILE_NAME FOR A50
SQL> set lines 220
SQL> r
  1  SELECT *
  2  FROM TABLE(
  3      cpl_data_out.fn_generate_parallel_file(
  4          CURSOR(
  5              SELECT /*+ PARALLEL(s,10) */
  6                  "OWNER" ||'~'||
  7                  "OBJECT_NAME" ||'~'||
  8                  "SUBOBJECT_NAME" ||'~'||
  9                  "OBJECT_ID" ||'~'||
 10                  "DATA_OBJECT_ID" ||'~'||
 11                  "OBJECT_TYPE" ||'~'||
 12                  "CREATED" ||'~'||
 13                  "LAST_DDL_TIME" as csv
 14              FROM DBA_OBJECTS s)
 15          , 'test_file'
 16          , 'DIR_SRD_OUT'
 17          , 'csv')
 18      ) nt
 19*

FILE_NAME                                          NO_RECORDS     SEQ_ID
-------------------------------------------------- ---------- ----------
test_file_459.csv                                       25496        459
test_file_449.csv                                       25496        449
test_file_453.csv                                       25496        453
test_file_461.csv                                       25496        461
test_file_455.csv                                       25499        455
test_file_451.csv                                       25496        451
test_file_447.csv                                       25496        447
test_file_443.csv                                       25496        443
test_file_457.csv                                       25496        457
test_file_445.csv                                       25497        445

10 rows selected.

As you can see, the pipelined function works as expected: it creates 10 csv files that I can join later using cat. However, if I run it against the table shown above, this is what happens (for the purposes of the example, I am using only some of the table's columns).
Working

SQL> SELECT *
  2  FROM TABLE(
  3      cpl_data_out.fn_generate_parallel_file(
  4          CURSOR(
  5              SELECT /*+ PARALLEL(s,10) */
  6                  "DAT_MONTH" ||'~'||
  7                  "PERSNR" ||'~'||
  8                  "COD_PROCESS" ||'~'||
  9                  "DAT_EFFECTIVE"
 10                  as csv
 11              FROM SRD_OUT.FCT_EMPROLE_TRANSFORM s)
 12          , 'test_file'
 13          , 'DIR_SRD_OUT'
 14          , 'csv')
 15* ) nt
SQL> /

FILE_NAME                                          NO_RECORDS     SEQ_ID
-------------------------------------------------- ---------- ----------
test_file_569.csv                                         456        569
test_file_571.csv                                         489        571
test_file_575.csv                                         314        575
test_file_573.csv                                         483        573
test_file_577.csv                                         496        577
test_file_581.csv                                         487        581
test_file_579.csv                                         430        579
test_file_567.csv                                        3500        567
test_file_565.csv                                        3606        565
test_file_563.csv                                       10175        563

10 rows selected.

Not working

SQL> SELECT *
  2  FROM TABLE(
  3      cpl_data_out.fn_generate_parallel_file(
  4          CURSOR(
  5              SELECT /*+ PARALLEL(s,10) */
  6                  "DAT_MONTH" ||'~'||
  7                  "PERSNR" ||'~'||
  8                  "COD_PROCESS" ||'~'||
  9                  "DAT_EFFECTIVE" ||'~'||
 10                  "ROLLE"
 11                  as csv
 12              FROM SRD_OUT.FCT_EMPROLE_TRANSFORM s)
 13          , 'test_file'
 14          , 'DIR_SRD_OUT'
 15          , 'csv')
 16* ) nt
SQL> /
ERROR:
ORA-12801: error signaled in parallel query server P005
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
ORA-06512: at "CPL_DATA_OUT.FN_GENERATE_PARALLEL_FILE", line 34
ORA-06512: at line 1

The only difference between the two queries is the "ROLLE" column, which contains extended ASCII characters (German letters such as "äüöß"). The same thing happens with every column that contains such characters.

The error points to this line: v_buffer := v_buffer || c_eol || v_rows(i);, but I don't know what goes wrong when these characters are involved.
SQL> set pages 200
SQL> r
  1* select distinct rolle from SRD_OUT.FCT_EMPROLE_TRANSFORM

ROLLE
------------------------
Filialleiter (große Filiale)
Vertriebsdirektor Vermögensberatung

I don't quite understand how these extended ASCII characters relate to the function. What should I change in my function so that it works with these characters?

Thank you all for your help.

When you do this:
IF LENGTH(v_buffer) + c_eollen + LENGTH(v_rows(i)) <= c_maxline
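you are counting the number of characters in the buffer and in the collection element. With only single-byte characters that is not a problem, but with any multibyte characters you can end up with a total character count below 32767 while the byte count exceeds it. The check passes; but then you do: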
v_buffer := v_buffer || c_eol || v_rows(i)
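and that exceeds the size of the buffer, which raises the error. If your buffer were declared smaller than the maximum and with character semantics, you might still get away with it; but with the buffer at the maximum size (and with either semantics) it will fail, because a PL/SQL VARCHAR2 cannot hold more than 32767 bytes.
If you count bytes instead of characters, you will not exceed that byte limit: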
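To see the difference between the two counts, a query along the following lines can help. It is only a sketch: the sample value is taken from the ROLLE data shown in the question, TO_CHAR(UNISTR(...)) is used so that the literal does not depend on the client encoding, and the second statement assumes you have access to the same table. On an AL32UTF8 database the first query should report one more byte than characters, since ß is stored as two bytes.

```sql
-- Character count vs. byte count for a single sample value ('große').
SELECT LENGTH(TO_CHAR(UNISTR('gro\00DFe')))  AS char_count,
       LENGTHB(TO_CHAR(UNISTR('gro\00DFe'))) AS byte_count
FROM   dual;

-- Values in the question's table that contain multibyte characters,
-- i.e. where the byte length is larger than the character length.
SELECT DISTINCT rolle,
       LENGTH(rolle)  AS char_count,
       LENGTHB(rolle) AS byte_count
FROM   SRD_OUT.FCT_EMPROLE_TRANSFORM
WHERE  LENGTHB(rolle) > LENGTH(rolle);
```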
IF LENGTHB(v_buffer) + c_eollen + LENGTHB(v_rows(i)) <= c_maxline
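The buffering logic can be tried in isolation with an anonymous block like the one below. This is a minimal sketch, not the function from the question: the sample row, the 5000 iterations and the v_flushes counter (which stands in for UTL_FILE.PUT_LINE) are assumptions made for illustration.

```sql
SET SERVEROUTPUT ON

DECLARE
  c_eol     CONSTANT VARCHAR2(1) := CHR(10);
  c_maxline CONSTANT PLS_INTEGER := 32767;
  -- Hypothetical sample row containing one multibyte character (ß).
  v_row     VARCHAR2(100) := TO_CHAR(UNISTR('gro\00DFe Filiale'));
  v_buffer  VARCHAR2(32767);
  v_flushes PLS_INTEGER := 0;
BEGIN
  FOR i IN 1 .. 5000 LOOP
    -- Byte-based check: the buffer is only extended if the result
    -- still fits into 32767 bytes.
    IF LENGTHB(v_buffer) + LENGTHB(c_eol) + LENGTHB(v_row) <= c_maxline THEN
      v_buffer := v_buffer || c_eol || v_row;
    ELSE
      IF v_buffer IS NOT NULL THEN
        v_flushes := v_flushes + 1;  -- stands in for UTL_FILE.PUT_LINE
      END IF;
      v_buffer := v_row;
    END IF;
  END LOOP;

  DBMS_OUTPUT.PUT_LINE('flushes: ' || v_flushes
                       || ', bytes left in buffer: ' || LENGTHB(v_buffer));
END;
/
```

On an AL32UTF8 database, replacing the LENGTHB calls in the IF condition with LENGTH should bring back ORA-06502, because the character count stays under the limit while the byte count passes it.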