Changing the indentation width in Python lxml pretty print

d.g*_*ner 4 python xml lxml pretty-print

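I have a small script that creates an XML document, and with pretty_print=True it produces a nicely formatted XML document. However, the indentation is 2 spaces, and I'd like to know whether there is a way to change it to 4 spaces (I think 4 spaces looks better). Is there a simple way to do this?

Code snippet: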

doc = lxml.etree.SubElement(root, 'dependencies')
for depen in dependency_list:
    dependency = lxml.etree.SubElement(doc, 'dependency')
    lxml.etree.SubElement(dependency, 'groupId').text = depen.group_id
    lxml.etree.SubElement(dependency, 'artifactId').text = depen.artifact_id
    lxml.etree.SubElement(dependency, 'version').text = depen.version
    if depen.scope == 'provided' or depen.scope == 'test':
        lxml.etree.SubElement(dependency, 'scope').text = depen.scope
    exclusions = lxml.etree.SubElement(dependency, 'exclusions')
    exclusion = lxml.etree.SubElement(exclusions, 'exclusion')
    lxml.etree.SubElement(exclusion, 'groupId').text = '*'
    lxml.etree.SubElement(exclusion, 'artifactId').text = '*'
tree.write('explicit-pom.xml', pretty_print=True)

小智 6

If anyone is still trying to do this, it can be done with the etree.indent() function, available since lxml 4.5:

>>> etree.indent(root, space="    ")
>>> print(etree.tostring(root, encoding="unicode"))
<root>
    <a>
        <b/>
    </a>
</root>

https://lxml.de/tutorial.html#serialisation
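
Applied to a document like the one in the question, this could look roughly like the sketch below. The element values are made-up placeholders, and the snippet assumes lxml 4.5 or newer is installed.

```python
from lxml import etree

# Hypothetical stand-in for the document built in the question.
root = etree.Element("project")
deps = etree.SubElement(root, "dependencies")
dep = etree.SubElement(deps, "dependency")
etree.SubElement(dep, "groupId").text = "org.example"

# indent() rewrites the whitespace between elements in place,
# using the given string once per nesting level (lxml >= 4.5).
etree.indent(root, space="    ")

# encoding="unicode" returns a str instead of bytes.
xml_text = etree.tostring(root, encoding="unicode")
print(xml_text)
```

Because etree.indent() modifies the tree itself, a later tree.write('explicit-pom.xml') should carry the 4-space indentation into the file as well.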


Reg*_*May 5

This does not seem to be possible through the Python API of lxml.

A possible workaround that indents with tabs would be:

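import lxml.etree

def prettyPrint(someRootNode):
    # Serialize with lxml's default 2-space indentation, then split into lines.
    lines = lxml.etree.tostring(someRootNode, encoding="utf-8", pretty_print=True).decode("utf-8").split("\n")
    for i in range(len(lines)):
        line = lines[i]
        outLine = ""
        # Walk the line two characters at a time: each leading pair of spaces becomes one tab.
        for j in range(0, len(line), 2):
            if line[j:j + 2] == "  ":
                outLine += "\t"
            else:
                # First chunk that is not indentation: copy the rest of the line unchanged.
                outLine += line[j:]
                break
        lines[i] = outLine
    return "\n".join(lines)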

Note that this is not very efficient. High efficiency could only be achieved by implementing this natively in the C code of lxml.
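
For the 4-space indentation the question actually asks for, the same post-processing idea can be sketched with a regular expression. The function name reindent and its width parameter are hypothetical, and the sketch assumes the serializer indents with 2 spaces per level, as described in the question. Like the tab version, it also rewrites leading spaces inside multi-line text content, so it is only safe when element text does not depend on its leading whitespace.

```python
import re

from lxml import etree


def reindent(root, width=4):
    """Serialize root and widen lxml's 2-space indentation to `width` spaces."""
    text = etree.tostring(root, encoding="unicode", pretty_print=True)
    # Each leading run of spaces holds 2 spaces per nesting level;
    # replace it with `width` spaces per level.
    return re.sub(
        r"^ +",
        lambda m: " " * (len(m.group(0)) // 2 * width),
        text,
        flags=re.MULTILINE,
    )


# Small usage example with a made-up tree.
root = etree.Element("root")
a = etree.SubElement(root, "a")
etree.SubElement(a, "b")
result = reindent(root)
print(result)
```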