Extracting the text after a tag in Python's ElementTree

Question

mae*_*mae 10 python text elementtree xml-parsing

Here is part of the XML:

<item><img src="cat.jpg" /> Picture of a cat</item>

Extracting the tag is easy. Just do:

import xml.etree.ElementTree

et = xml.etree.ElementTree.fromstring(our_xml_string)
img = et.find('img')

But how do I get the text immediately after it ("Picture of a cat")? Doing the following returns an empty string:

print(et.text)

Answer 1

Cha*_*ffy 23

Elements have a `tail` attribute, so instead of `element.text`, you're asking for `element.tail`.

>>> import lxml.etree
>>> root = lxml.etree.fromstring('''<root><foo>bar</foo>baz</root>''')
>>> root[0]
<Element foo at 0x145a3c0>
>>> root[0].tail
'baz'

Or, for your example:

>>> et = lxml.etree.fromstring('''<item><img src="cat.jpg" /> Picture of a cat</item>''')
>>> et.find('img').tail
' Picture of a cat'

This also works with plain ElementTree:

>>> import xml.etree.ElementTree
>>> xml.etree.ElementTree.fromstring(
...   '''<item><img src="cat.jpg" /> Picture of a cat</item>'''
... ).find('img').tail
' Picture of a cat'
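
To make the `text`/`tail` split a bit more concrete, here is a rough sketch using the standard-library ElementTree. The sample XML is hypothetical: it is the question's snippet with some leading text and a second `<br />` child added purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the question's XML plus leading text and a second child.
item = ET.fromstring(
    '<item>Intro <img src="cat.jpg" /> Picture of a cat<br /> and a dog</item>'
)

# item.text holds only the text before the first child element.
print(repr(item.text))

# Each child's tail holds the text between its end and the next tag.
tails = [(child.tag, child.tail) for child in item]
for tag, tail in tails:
    print(tag, repr(tail))

# itertext() yields text and tail pieces in document order.
full_text = ''.join(item.itertext())
print(repr(full_text))
```

In the question's original XML there is nothing between `<item>` and `<img>`, which is why reading `text` on the parent gives nothing useful; the visible text belongs to the `tail` of `img`.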

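Brilliant. I had tried using `.tail` before, but I was using it on my _el_ object. I didn't realize I had to use it on the *img*. Thanks for enlightening me! (4 upvotes)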
