elementtree:获取xml文档中特定标签的内容

ale*_*e19 4 python elementtree

我正在尝试提取 XML 文件中特定标签的内容。

XML 示例:

<facts>
        <fact>
            <name>crash</name>
            <full_name>Crash</full_name>
            <variables>
                <variable>
                    <name>id</name>
                    <proper_name>Crash Instance</proper_name>
                    <type>INT</type>
                    <interpretation>key</interpretation>
                </variable>
                <variable>
                    <name>accident_key</name>
                    <proper_name>Case Identifier</proper_name>
                    <interpretation>string</interpretation>
                    <type>CHAR(9)</type>
                </variable>
                <variable>
                    <name>accident_year</name>
                    <proper_name>Crash Year</proper_name>
                    <interpretation>dim</interpretation>
                    <type>INT</type>
                </variable>
            </variables>
        </fact>
    <fact>
        <name>vehicle</name>
        <full_name>Vehicle</full_name>
        <variables>
            <variable>
                <name>id</name>
                <proper_name>Vehicle Instance</proper_name>
                <type>INT</type>
            </variable>
            <variable>
                <name>crash_id</name>
                    <proper_name>Crash Instance</proper_name>
                <type>INT</type>
            </variable>
        </variables>
    </fact>
</facts>
Run Code Online (Sandbox Code Playgroud)

我想从节点中提取标签的所有内容,但仅限于崩溃事实。

到目前为止,这是我的代码。

def header(filename, fact):    
    lst = []
    tree = ET.parse(filename) #read in the XML
    for fact in tree.iter(tag = 'fact'):
        factname = fact.find('name').text
        if factname == fact: #choose the fact to pull from
            for var in fact.iter(tag = 'variable'):
                name = var.find('name').text
                lst.append(name)
     return lst #return a list of all the <name> tags from the Crash fact

newlst = header('schema.xml','crash')
Run Code Online (Sandbox Code Playgroud)

我的输出 newlst 应该是崩溃事实中所有标签的列表。但它总是返回空。

奇怪的是,如果我对所有内容进行硬编码(并删除该函数),它会返回正确的输出:

lst = []
tree = ET.parse('schema.xml')
for fact in tree.iter(tag = 'fact'):
    factname = fact.find('name').text
    if factname == 'crash': 
        for var in fact.iter(tag = 'variable'):
            name = var.find('name').text
            lst.append(name)
 print(lst)


 Output: ['id',
 'accident_key',
 'accident_year']
Run Code Online (Sandbox Code Playgroud)

Apa*_*ala 5

在函数中,您使用变量fact作为参数和第一个for循环的变量。试试这个版本:

def header(filename, target_factname):    
    lst = []
    tree = ET.parse(filename) #read in the XML
    for fact in tree.iter(tag = 'fact'):
        factname = fact.find('name').text
        if factname == target_factname: #choose the fact to pull from
            for var in fact.iter(tag = 'variable'):
                name = var.find('name').text
                lst.append(name)
     return lst #return a list of all the <name> tags from the Crash fact
Run Code Online (Sandbox Code Playgroud)