lxml does not perform the XSLT transformation

use*_*536 1 python xml xslt lxml

With this code:

from lxml import etree

with open( 'C:\\Python33\\projects\\xslt', 'r' ) as xslt, open( 'C:\\Python33\\projects\\result', 'a+' ) as result, open( 'C:\\Python33\\projects\\xml', 'r' ) as xml:
    s_xml = xml.read()
    s_xslt = xslt.read()
    transform = etree.XSLT(etree.XML(s_xslt))
    out = transform(etree.XML(s_xml))
    result.write(out)

I get this error:

Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    from projects.xslt_transform import trans
  File ".\projects\xslt_transform.py", line 17, in <module>
    transform = etree.XSLT(etree.XML(s_xslt))
  File "xslt.pxi", line 409, in lxml.etree.XSLT.__init__ (src\lxml\lxml.etree.c:150256)
lxml.etree.XSLTParseError: Invalid expression

This XML/XSLT pair works with other tools.

Also, I had to remove the encoding attribute from the declaration at the top of both files in order not to get:

ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

Could this be related?

EDIT:

This does not work either (I get the same error):

with open( 'C:\\Python33\\projects\\xslt', 'r',encoding="utf-8" ) as xslt, open( 'C:\\Python33\\projects\\result', 'a+',encoding="utf-8" ) as result, open( 'C:\\Python33\\projects\\xml', 'r',encoding="utf-8" ) as xml:
    s_xml = etree.parse(BytesIO(bytes(xml.read(),'UTF-8')))
    s_xslt = etree.parse(BytesIO(bytes(xslt.read(),'UTF-8')))
    transform = etree.XSLT(s_xslt)
    out = transform(s_xml)
    print(out.tostring())

Reading the lxml source, this is the call that raises the exception:

xslt.xsltParseStylesheetDoc(c_doc)

So it seems to be an actual parse error. Could it be related to namespaces?

EDIT, solved:

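# parse() takes a file name or a file object, not the text returned by read();
# xml and xslt are assumed here to be file objects opened in binary mode ('rb')
s_xml = etree.parse(xml)
s_xslt = etree.parse(xslt)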

Thanks to Tom*_*lak.

Tom*_*lak 7

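Parsing XML is more complicated than "open a text file and stuff the resulting string into etree".

XML files are serialized representations of a DOM tree. They should not be treated as text, even though they come in the shape of text files. They are stored in multi-byte encodings, and working out which encoding a given file uses is non-trivial.

XML parsers have the proper detection mechanisms built in, so they should be used to open XML files. A basic open() + read() call is not enough to handle the file's contents correctly.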
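To make the point concrete, here is a minimal sketch of the difference between handing the parser raw bytes and handing it an already decoded string. The sample document and the variable names are made up for illustration; only `etree.fromstring()` and the `ValueError` quoted in the question come from lxml itself.

```python
from lxml import etree

# Hypothetical sample: a Latin-1 encoded document with a matching declaration.
raw = '<?xml version="1.0" encoding="ISO-8859-1"?><root>café</root>'.encode("iso-8859-1")

# Given bytes, the parser reads the declaration and decodes the content itself.
root = etree.fromstring(raw)
text = root.text

# Given a str that still carries the encoding declaration, lxml refuses to guess.
try:
    etree.fromstring(raw.decode("iso-8859-1"))
    rejected = False
except ValueError:
    rejected = True
```

This is why stripping the encoding attribute makes the `ValueError` go away without being a real fix: the parser is simply no longer told what the bytes originally were.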

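lxml.etree provides the parse() function, which accepts several kinds of argument:

  • an open file object (make sure it is opened in binary mode)
  • a file-like object that has a .read(byte_count) method returning a byte string on each call
  • a file name string
  • an HTTP or FTP URL string

In each case it correctly parses the associated document back into a DOM tree.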
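As a rough illustration of the first three argument types, the sketch below parses the same small document from an in-memory buffer, from a file name, and from a file object opened in binary mode. The sample data and the temporary file are assumptions made so the snippet is self-contained; the URL variant is left out because it needs network access.

```python
import os
import tempfile
from io import BytesIO

from lxml import etree

# Hypothetical sample document.
data = b'<?xml version="1.0" encoding="UTF-8"?><root><item>1</item></root>'

# A file-like object whose read() returns bytes.
tree_from_buffer = etree.parse(BytesIO(data))

# Write the sample to a temporary file so the file-based variants have input.
with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

try:
    # A file name string.
    tree_from_name = etree.parse(path)
    # An open file object, opened in binary mode.
    with open(path, "rb") as fh:
        tree_from_file = etree.parse(fh)
finally:
    os.remove(path)

# Collect the root tag of each tree to show they all parsed the same document.
tags = [t.getroot().tag for t in (tree_from_buffer, tree_from_name, tree_from_file)]
```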

Your code should look more like this:

from lxml import etree

f_xsl = 'C:\\Python33\\projects\\xslt'
f_xml = 'C:\\Python33\\projects\\xml'
f_out = 'C:\\Python33\\projects\\result'

transform = etree.XSLT(etree.parse(f_xsl))
result = transform(etree.parse(f_xml))
result.write(f_out)
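If you want to try the same flow without touching the file system, a self-contained sketch could look like the following. The stylesheet and the input document are invented for the example; the calls are the same `etree.parse()` and `etree.XSLT()` used above, fed from `BytesIO` buffers instead of file names.

```python
from io import BytesIO

from lxml import etree

# Hypothetical stylesheet: turns <greeting name="..."/> into plain text.
xslt_bytes = b'''<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/greeting">Hello, <xsl:value-of select="@name"/>!</xsl:template>
</xsl:stylesheet>'''

# Hypothetical input document.
xml_bytes = b'<?xml version="1.0" encoding="UTF-8"?><greeting name="world"/>'

# Both documents go through the XML parser as bytes, declarations included.
transform = etree.XSLT(etree.parse(BytesIO(xslt_bytes)))
result = transform(etree.parse(BytesIO(xml_bytes)))

# str() serializes the result tree according to xsl:output.
output = str(result)
```

Note that both encoding declarations stay in place here; nothing has to be removed from the documents when the parser receives bytes.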