XML encoding in an XMLTYPE column in Oracle DB

mil*_*ros 1 xml oracle toad character-encoding oracle12c

I have a table created like this:

create table b (data timestamp, value XMLTYPE);

I run this script in TOAD 12.6 to store XML in the table.

DECLARE
    lc_Soap         CLOB;
    lc_Request      CLOB;
    px_RequestXML   XMLTYPE
        := XMLTYPE ('<test><test1>ABDDÇJJSõ</test1></test>');
BEGIN
    DELETE b;

    lc_Soap :=
        '<?xml version="1.0" encoding="ISO-8859-1"?>
               <s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
                  <s:Header>
                      <h:AxisValues xmlns="urn:/microsoft/multichannelframework/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:h="urn:/microsoft/multichannelframework/">
                          <User xmlns="">TEST</User>
                      </h:AxisValues>
                  </s:Header>
                  <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
                      <substr/>
                  </s:Body>
              </s:Envelope>';

    lc_Request :=
        pkg_utils.replace_clob (lc_Soap,
                                '<substr/>',
                                xml_utils.XMLTypeToClob (px_RequestXML));

    px_RequestXML := XMLTYPE.createXML (lc_Request);

    INSERT INTO b
         VALUES (SYSTIMESTAMP, px_RequestXML);

    COMMIT;
END;

When I look at the content of the VALUE column, the document is shown with UTF-8 encoding:

<?xml version="1.0" encoding="UTF-8"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <h:AxisValues xmlns="urn:/microsoft/multichannelframework/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:h="urn:/microsoft/multichannelframework/">
      <User xmlns="">TEST</User>
    </h:AxisValues>
  </s:Header>
  <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <test>
      <test1>ABDDÇJJSõ</test1>
    </test>
  </s:Body>
</s:Envelope>

However, this script is meant to be run by another DB user or from an Oracle JOB. In that case the encoding is different:

<?xml version="1.0" encoding="WINDOWS-1252"?>
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <h:AxisValues xmlns="urn:/microsoft/multichannelframework/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:h="urn:/microsoft/multichannelframework/">
      <User xmlns="">TEST</User>
    </h:AxisValues>
  </s:Header>
  <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <test>
      <test1>ABDDÇJJSõ</test1>
    </test>
  </s:Body>
</s:Envelope>

The database's NLS_CHARACTERSET parameter is WE8MSWIN1252. Why does this happen? And how can I always store the XML as UTF-8?

Phi*_*erg 5

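When you create an XMLTYPE from a CLOB or a string, Oracle uses the client character set and completely ignores the encoding declared in the XML prolog (see the docs). You could even write encoding="blabla" and it would still work. Oracle honors the encoding in the XML prolog only when you create the XMLTYPE from a BLOB.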
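As a rough illustration of the BLOB route, here is a minimal sketch that converts a string to UTF-8 bytes and builds the XMLTYPE from those bytes. It assumes the table b from the question; the variable names and the shortened sample document are made up for the example, and it assumes the literal reaches the server intact (both Ç and õ exist in WE8MSWIN1252).

```sql
DECLARE
    -- Hypothetical sample document; the prolog matches the bytes produced below.
    l_text  VARCHAR2(200)
        := '<?xml version="1.0" encoding="UTF-8"?><test><test1>ABDDÇJJSõ</test1></test>';
    l_raw   RAW(400);
    l_blob  BLOB;
    l_xml   XMLTYPE;
BEGIN
    -- Convert the text from the database character set to UTF-8 bytes.
    l_raw := UTL_I18N.STRING_TO_RAW(l_text, 'AL32UTF8');

    -- Copy the bytes into a temporary BLOB.
    DBMS_LOB.CREATETEMPORARY(l_blob, TRUE);
    DBMS_LOB.WRITEAPPEND(l_blob, UTL_RAW.LENGTH(l_raw), l_raw);

    -- Build the XMLTYPE from binary data, stating explicitly how the bytes are encoded.
    l_xml := XMLTYPE(l_blob, NLS_CHARSET_ID('AL32UTF8'));

    INSERT INTO b VALUES (SYSTIMESTAMP, l_xml);
    COMMIT;

    DBMS_LOB.FREETEMPORARY(l_blob);
END;
/
```

Because the input is binary, the client character set should not take part in interpreting the document. This has not been verified against the exact 12c setup in the question.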

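The client environment also determines the encoding when the XMLTYPE is read. If you want the XML document encoded as UTF-8 regardless of the client encoding, you have to retrieve it as a BLOB.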

Either via getBlobVal():

SELECT t.value.getBlobVal(nls_charset_id('UTF8')) FROM b t;

Or via xmlserialize():

SELECT xmlserialize(DOCUMENT value AS BLOB ENCODING 'UTF-8') FROM b;
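To see which character sets are in play for a given session, something along these lines may help. It is only a diagnostic sketch: the second query assumes the user has SELECT privilege on V$SESSION_CONNECT_INFO, and the client character set may be reported as unknown for some drivers.

```sql
-- Character sets of the database itself.
SELECT parameter, value
  FROM nls_database_parameters
 WHERE parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');

-- Character set reported by the client of the current session.
SELECT DISTINCT client_charset
  FROM v$session_connect_info
 WHERE sid = SYS_CONTEXT('USERENV', 'SID');
```

Running both from TOAD and from the other environment should show whether the sessions report different client character sets, which would be consistent with the different prologs shown in the question.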