通过xmltype了解解释计划

Fra*_*nta 5 oracle xpath xmltype sql-execution-plan

在这样的查询中,我遇到了由错误的xpath(属性谓词中缺少'@')导致的性能问题:

从table中选择extractvalue(field,'// item [attr ="value"]'),其中field1 =:1;

我期待一个异常,但似乎Oracle接受这个特殊的xpath,有意义吗?

我试图针对该查询执行解释计划,但结果很奇怪,有人可以帮我理解吗?

我用这段代码重现了这个环境

SELECT * FROM V$VERSION;
/*
Oracle Database 11g Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
"CORE   11.2.0.3.0  Production"
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
*/

create table TMP_TEST_XML(
  id number,
 content_xml xmltype 
);
/
create unique index IDX_TMP_TEST_XML on TMP_TEST_XML(id);
/
declare
  xml xmltype := xmltype('<root>
    <a key="A">Aaa</a>
    <b key="B">Bbb</b>
    <c key="C">Ccc</c>
    <d key="D">Ddd</d>
    <e key="E">Eee</e>
    <f key="F">Fff</f>
    <g key="G">Ggg</g>
    <h key="H">Hhh</h>
    <i key="I">Iii</i>
    <l key="L">Lll</l>
</root>');
begin

  for idx in 1..10000 
  loop
    insert into TMP_TEST_XML values (idx, xml);
  end loop;

  commit;

end;
/
--explain plan xpath without '@' (wrong)
EXPLAIN PLAN SET statement_id = 'planXml1' FOR
select extractvalue(content_xml, '/root/g[key="G"]') from TMP_TEST_XML where id between 120 and 130;
/    
select plan_table_output
from table(dbms_xplan.display('plan_table',null,'advanced'));
/
/*
------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    24 | 48360 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                  |     1 |     4 |            |          |
|   2 |   NESTED LOOPS SEMI         |                  |   667K|  2606K|   223K  (1)| 00:44:37 |
|   3 |    XPATH EVALUATION         |                  |       |       |            |          |
|*  4 |    XPATH EVALUATION         |                  |       |       |            |          |
|   5 |  TABLE ACCESS BY INDEX ROWID| TMP_TEST_XML     |    24 | 48360 |     4   (0)| 00:00:01 |
|*  6 |   INDEX RANGE SCAN          | IDX_TMP_TEST_XML |    43 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
*/
/
-- explain plan xpath with '@' (correct)
EXPLAIN PLAN SET statement_id = 'planXml1' FOR
select extractvalue(content_xml, '/root/g[@key="G"]') from TMP_TEST_XML where id between 120 and 130;
/
select plan_table_output
from table(dbms_xplan.display('plan_table',null,'advanced'));
/
/*
------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |    24 | 48360 |     4   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE             |                  |     1 |     4 |            |          |
|*  2 |   XPATH EVALUATION          |                  |       |       |            |          |
|   3 |  TABLE ACCESS BY INDEX ROWID| TMP_TEST_XML     |    24 | 48360 |     4   (0)| 00:00:01 |
|*  4 |   INDEX RANGE SCAN          | IDX_TMP_TEST_XML |    43 |       |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
*/
Run Code Online (Sandbox Code Playgroud)

在第一个解释中,有一个'嵌套循环'(第2行),其基数为667K,在第二个中消失.在同一个表中插入更多记录并执行一个新的解释平面(没有'@')该值始终为667K.

它代表什么价值?

Mat*_*ini 1

我预计会出现异常,但 Oracle 似乎接受这个特定的 xpath,有什么意义吗?

嗯,是。就其本身而言,xpath/root/g[key="G"]获取具有标签“key”和值“G”的子节点的节点。因此,即使 extractvalue 会失败(返回多个节点),这也是可行的:

select extract(xmltype('<root>
<a key="A">Aaa</a>
<g key="G"><key>G</key>Ggg</g>
<h key="H">Hhh</h></root>'),'/root/g[key="G"]').getStringVal() from dual;
Run Code Online (Sandbox Code Playgroud)

它返回<g key="G"><key>G</key>Ggg</g>

在这种搜索中,高成本是合理的,因为属性可能比其他类型的子节点更优化和可搜索(可以说,每个标签只能有一个具有特定名称的子节点,而标签可以重复多次)。