Searching for and removing elements with ElementTree in Python

Question

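I have an XML document in which I want to search for some elements and, if they match certain criteria, remove them.

However, I cannot seem to access the parent of an element so that I can remove it.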

import json
from xml.etree import ElementTree

elem = ElementTree.parse('test.xml')

namespace = "{http://somens}"

props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    prop_type = prop.attrib.get('type', None)
    if prop_type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # here I need to access the parent of prop
            # in order to delete the prop
            pass

Is there a way to do this?

Thanks

Answer 1

Con*_*ius 24

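You can remove child elements with the corresponding remove method. To remove an element, you have to call the remove method of its parent. Unfortunately, Element does not provide a reference to its parent, so it is up to you to keep track of parent/child relations (which speaks against your use of elem.findall()).

A proposed solution could look like this: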

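root = elem.getroot()
# Iterate over a copy: removing from root while iterating over it skips elements
for child in list(root):
    # Tags include the namespace, and Element exposes .tag (there is no .name)
    if child.tag != namespace + "prop":
        continue
    if True:  # TODO: do your check here!
        root.remove(child)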

PS: Don't use prop.attrib.get(); use prop.get(), as explained here.
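
One common way to keep track of the parent/child relation yourself is to build a child-to-parent map once and look each match up in it. The following is a minimal sketch under assumptions, not code from this answer: the sample XML, the `remove_matching` helper and its `should_remove` parameter are hypothetical names for illustration, and the namespace is the `{http://somens}` placeholder from the question.

```python
import json
import xml.etree.ElementTree as ET

NAMESPACE = "{http://somens}"  # placeholder namespace taken from the question

# Hypothetical sample document; the real test.xml is not shown in the question.
SAMPLE = """
<root xmlns="http://somens">
  <group>
    <prop type="json" value='{"name": "Page1.Button1"}'/>
    <prop type="json" value='{"name": "Page1.Button2"}'/>
  </group>
  <prop type="text" value="plain"/>
</root>
"""

def remove_matching(root, should_remove):
    """Remove every prop for which should_remove(prop) is true; return the count."""
    # ElementTree elements do not know their parent, so record it for each child.
    parent_map = {child: parent for parent in root.iter() for child in parent}
    removed = 0
    # findall() returns a list, so the tree can be modified while looping over it.
    for prop in root.findall(".//{0}prop".format(NAMESPACE)):
        if should_remove(prop):
            parent_map[prop].remove(prop)
            removed += 1
    return removed

def is_button1(prop):
    """The check from the question: a JSON prop whose name is Page1.Button1."""
    if prop.get("type") != "json":
        return False
    return json.loads(prop.get("value"))["name"] == "Page1.Button1"

root = ET.fromstring(SAMPLE)
removed_count = remove_matching(root, is_button1)
remaining = [p.get("value") for p in root.iter(NAMESPACE + "prop")]
```

Because the map covers the whole tree, this works for props at any depth, at the cost of one extra pass over the document to build it.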

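Yes, that's correct. lxml provides an `ElementTree` implementation with more features than the interface normally declares. The `Element` class in lxml provides a `getparent()` method to obtain a reference to the parent element. (4 upvotes)
What if the child element is several levels below the root? What if it is at a different depth? (2 upvotes)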

Answer 2

ice*_*itz 9

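I know this is an old thread, but it kept popping up while I was trying to figure out a similar task. I didn't like the accepted answer for two reasons:

1) It doesn't handle multiple nested levels of tags.

2) It breaks when several XML tags at the same level are removed one after another. Each element is addressed by its index in Element._children, so you should not remove elements while iterating forward.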
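
The second point can be reproduced with a small demonstration. This is only a sketch: the four-child sample document and both helper names are hypothetical. With this sample, `b` and `d` survive the forward loop, because each removal shifts the remaining children down by one index while the loop moves on to the next index.

```python
import xml.etree.ElementTree as ET

def remove_all_forward(root):
    """Remove children while iterating over the live element (buggy on purpose)."""
    for child in root:
        root.remove(child)
    return [child.tag for child in root]

def remove_all_from_copy(root):
    """Remove children while iterating over a snapshot of the child list."""
    for child in list(root):
        root.remove(child)
    return [child.tag for child in root]

SAMPLE = "<root><a/><b/><c/><d/></root>"  # hypothetical four-child document

left_after_forward = remove_all_forward(ET.fromstring(SAMPLE))
left_after_copy = remove_all_from_copy(ET.fromstring(SAMPLE))
```

Iterating over `list(root)` or over `reversed(root)` both avoid the problem, since neither depends on indices that a removal has already shifted.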

I think a better, more general solution is this:

import xml.etree.ElementTree as et
file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()

def iterator(parents, nested=False):
    # Walk backwards so that removals do not shift the children still to visit
    for child in reversed(parents):
        if nested:
            if len(child) >= 1:
                # Pass nested on, otherwise the recursion stops one level down
                iterator(child, nested=True)
        if True:  # Add your entire condition here
            parents.remove(child)

iterator(root, nested=True)

For the OP, this should work, but I don't have the data you're working with to test whether it's perfect.

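import json
import xml.etree.ElementTree as et

file = 'test.xml'
tree = et.parse(file)
root = tree.getroot()

namespace = "{http://somens}"
prop_tag = '{0}prop'.format(namespace)

def iterator(parents, nested=False):
    # Walk backwards so that removals do not shift the children still to visit
    for child in reversed(parents):
        if nested and len(child) >= 1:
            iterator(child, nested=True)
        # Test the child itself; only prop elements carry the JSON value
        if child.tag == prop_tag and child.get('type') == 'json':
            value = json.loads(child.attrib['value'])
            if value['name'] == 'Page1.Button1':
                parents.remove(child)

# Start at the root: removing items from a findall() result list would not change the tree
iterator(root, nested=True)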

Answer 3

kit*_*.eb 5

You can use XPath to select an Element's parent.

file = open('test.xml', "r")
elem = ElementTree.parse(file)

namespace = "{http://somens}"

props = elem.findall('.//{0}prop'.format(namespace))
for prop in props:
    type = prop.get('type', None)
    if type == 'json':
        value = json.loads(prop.attrib['value'])
        if value['name'] == 'Page1.Button1':
            # Get parent and remove this prop
            parent = prop.find("..")
            parent.remove(prop)

http://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax

Except that when you try it, it doesn't work: http://elmpowered.skawaii.net/?p=74
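
A minimal sketch of that behavior, using a hypothetical two-level document: `..` is resolved relative to the element that `find()` was called on, and the search never reaches above that element, so asking a child for `..` yields `None`. Starting the search higher up and stepping back does work.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-level document, just to have a child with a known parent.
root = ET.fromstring("<root><group><prop/></group></root>")
group = root.find("group")
prop = group.find("prop")

# ".." is resolved relative to the element find() was called on, and the search
# never looks above that element, so the parent cannot be reached from the child.
parent_from_child = prop.find("..")

# Starting higher up works: select the prop, then step back to its parent.
parent_from_root = root.find(".//prop/..")
```

Here `parent_from_child` is `None`, while `parent_from_root` is the `group` element.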

So instead you have to:

import json
from xml.etree import ElementTree

elem = ElementTree.parse('test.xml')

namespace = "{http://somens}"
prop_tag = '{0}prop'.format(namespace)

# Use xpath to get all parents of props
prop_parents = elem.findall('.//' + prop_tag + '/..')
for parent in prop_parents:
    # Still have to find and iterate through the direct child props;
    # searching with './/' here could return deeper props that parent.remove() rejects
    for prop in parent.findall(prop_tag):
        prop_type = prop.get('type', None)
        if prop_type == 'json':
            value = json.loads(prop.attrib['value'])
            if value['name'] == 'Page1.Button1':
                parent.remove(prop)

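It is two searches and a nested loop. The inner search only runs on elements known to contain props as direct children, but that may not mean much depending on your schema.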
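
If the two searches feel redundant, one possible alternative is a single pass that treats every element as a potential parent and checks only its direct children. This is a sketch under assumptions rather than this answer's method: the sample XML and the `remove_button1_props` helper are hypothetical, and the namespace is the placeholder from the question.

```python
import json
import xml.etree.ElementTree as ET

NAMESPACE = "{http://somens}"  # placeholder namespace taken from the question
PROP_TAG = NAMESPACE + "prop"

# Hypothetical sample with matching props at two different depths.
SAMPLE = """
<root xmlns="http://somens">
  <prop type="json" value='{"name": "Page1.Button1"}'/>
  <page>
    <prop type="json" value='{"name": "Page1.Button1"}'/>
    <prop type="json" value='{"name": "Page1.Label1"}'/>
  </page>
</root>
"""

def remove_button1_props(root):
    """Visit each element once and remove matching props among its direct children."""
    removed = 0
    # Snapshot the elements first so the walk is not affected by the removals.
    for parent in list(root.iter()):
        # findall() with a bare tag matches direct children only and returns a list.
        for prop in parent.findall(PROP_TAG):
            if prop.get("type") != "json":
                continue
            value = json.loads(prop.attrib["value"])
            if value["name"] == "Page1.Button1":
                parent.remove(prop)
                removed += 1
    return removed

root = ET.fromstring(SAMPLE)
removed_count = remove_button1_props(root)
remaining_names = [json.loads(p.get("value"))["name"] for p in root.iter(PROP_TAG)]
```

The trade-off is that every element is visited, including those with no props at all, so on large documents with few props the XPath parent search may do less work.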

Archived:	14 years, 6 months ago
Views:	39209
Last activity:	7 years, 3 months ago