5 python xml recursion python-2.7
Python code:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print root.findall('saybye')
Contents of h.xml:
<hello>
<saybye>
<saybye>
</saybye>
</saybye>
<saybye>
</saybye>
</hello>
Output of the code:
[<Element 'saybye' at 0x7fdbcbbec690>, <Element 'saybye' at 0x7fdbcbbec790>]
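The saybye element nested inside another saybye is not selected here. So how do I instruct findall to walk down the DOM tree recursively and collect all three saybye elements?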
Low*_*ell 11
If you are not afraid of a little XPath, you can use the // syntax, which means "find any descendant node":
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
print(root.findall('.//saybye'))
Full XPath is not supported, but the supported subset is listed here: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
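To see the difference between the plain tag name and the // form without needing a file on disk, here is a minimal sketch. It assumes the same structure as h.xml, inlined as a string (the name XML_TEXT is only for illustration), and it also shows a path that selects exactly one nesting level:

```python
import xml.etree.ElementTree as ET

# Same structure as h.xml, inlined so the snippet runs without a file.
XML_TEXT = """
<hello>
  <saybye>
    <saybye>
    </saybye>
  </saybye>
  <saybye>
  </saybye>
</hello>
"""

root = ET.fromstring(XML_TEXT)

# A bare tag name matches direct children of the current element only.
direct = root.findall('saybye')

# './/' matches subelements at every depth below the current element.
nested = root.findall('.//saybye')

# An explicit path matches one specific nesting level.
grandchildren = root.findall('./saybye/saybye')

print(len(direct))         # 2
print(len(nested))         # 3
print(len(grandchildren))  # 1
```

Note that ET.fromstring returns the root Element directly, whereas ET.parse returns an ElementTree wrapper; both offer findall.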
Since version 2.7 you can use xml.etree.ElementTree.Element.iter:
import xml.etree.ElementTree as ET
root = ET.parse("h.xml")
# iter() returns a lazy iterator, so wrap it in list() to see the elements
print(list(root.iter('saybye')))
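One detail worth knowing when choosing between iter and the // path: iter also yields the element it is called on when that element's tag matches, while './/' looks only below it. A small sketch, assuming the same document inlined as a string:

```python
import xml.etree.ElementTree as ET

# Same shape as h.xml, inlined so the snippet is self-contained.
root = ET.fromstring(
    "<hello><saybye><saybye></saybye></saybye><saybye></saybye></hello>"
)

# Take the first top-level saybye, which itself contains one saybye.
first = root.find('saybye')

# iter() starts at the element itself, so 'first' is included in the result.
with_self = list(first.iter('saybye'))

# './/' selects descendants only, so 'first' is not included.
below_only = first.findall('.//saybye')

print(len(with_self))   # 2
print(len(below_only))  # 1
```

Called on the root hello element, both approaches return the same three saybye elements, because the root's tag does not match.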
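See 19.7. xml.etree.ElementTree - ElementTree XML API
Quoting the documentation on findall:
Element.findall() finds only elements with a tag which are direct children of the current element.
Since it finds only direct children, we need to search the remaining levels recursively, like this: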
>>> import xml.etree.ElementTree as ET
>>>
>>> def find_rec(node, element, result):
...     for item in node.findall(element):
...         result.append(item)
...         find_rec(item, element, result)
...     return result
...
>>> find_rec(ET.parse("h.xml"), 'saybye', [])
[<Element 'saybye' at 0x7f4fce206710>, <Element 'saybye' at 0x7f4fce206750>, <Element 'saybye' at 0x7f4fce2067d0>]
Better still, make it a generator function, like this:
>>> def find_rec(node, element):
...     for item in node.findall(element):
...         yield item
...         for child in find_rec(item, element):
...             yield child
...
>>> list(find_rec(ET.parse("h.xml"), 'saybye'))
[<Element 'saybye' at 0x7f4fce206a50>, <Element 'saybye' at 0x7f4fce206ad0>, <Element 'saybye' at 0x7f4fce206b10>]
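A caveat with this recursive approach: it descends only through elements that already match, so a saybye wrapped in some other tag is not reached. The sketch below uses a hypothetical document (the other element is not part of the question's h.xml) to compare it with the './/' path:

```python
import xml.etree.ElementTree as ET

def find_rec(node, element):
    # Same generator as above: recurses only into matching elements.
    for item in node.findall(element):
        yield item
        for child in find_rec(item, element):
            yield child

# Hypothetical input: the second saybye sits under a non-matching <other> tag.
root = ET.fromstring("<hello><saybye/><other><saybye/></other></hello>")

via_recursion = list(find_rec(root, 'saybye'))
via_xpath = root.findall('.//saybye')

print(len(via_recursion))  # 1
print(len(via_xpath))      # 2
```

For the h.xml in the question the two give the same three elements, since every saybye there is nested directly inside hello or another saybye.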