pup*_*eno 10 python xml api elementtree
That is, all the text and child tags, but without the element's own tag?
Given
<p>blah <b>bleh</b> blih</p>
I want
blah <b>bleh</b> blih
element.text returns "blah ", and etree.tostring(element) returns:
<p>blah <b>bleh</b> blih</p>
S.L*_*ott 11
ElementTree works perfectly; you just have to assemble the answer yourself. Something like this:
"".join([t.text or ""] + [xml.tostring(e, encoding="unicode") for e in t])
Thanks to JV and PEZ for pointing out the errors.
Edit:
>>> import xml.etree.ElementTree as xml
>>> s= '<p>blah <b>bleh</b> blih</p>\n'
>>> t=xml.fromstring(s)
>>> "".join([t.text or ""] + [xml.tostring(e, encoding="unicode") for e in t])
'blah <b>bleh</b> blih'
>>>
No separate tail handling is needed: tostring() already includes each child's tail text.
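To illustrate why the children's tails need no extra handling, here is a small sketch, assuming Python 3 and the standard-library `xml.etree.ElementTree` (imported here as `ET`, a name chosen only for this example). It shows that the text following `</b>` is stored as the child's `tail` and that `tostring()` serializes it along with the child:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<p>blah <b>bleh</b> blih</p>")
child = root[0]  # the <b> element

# The text after </b> lives on the child as its tail, not on the parent.
child_tail = child.tail

# tostring() serializes the child together with its tail text.
child_xml = ET.tostring(child, encoding="unicode")

print(repr(child_tail))
print(child_xml)
```

Passing `encoding="unicode"` makes `tostring()` return `str` rather than `bytes`, so the pieces can be joined with `"".join(...)` in Python 3.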
This is the solution I ended up using:
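import xml.etree.ElementTree as etree

def element_to_string(element):
    # Leading text of the element (None when there is none).
    s = element.text or ""
    for sub_element in element:
        # Each child is serialized together with its own tail text.
        s += etree.tostring(sub_element, encoding="unicode")
    # The element's own tail (text after its closing tag) may be None.
    s += element.tail or ""
    return s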
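For reference, a minimal Python 3 sketch of the same idea that returns only the inner content and leaves out the element's own tail. The helper name `inner_xml` is hypothetical, introduced just for this example, and the sketch assumes the standard-library `xml.etree.ElementTree`:

```python
import xml.etree.ElementTree as ET

def inner_xml(element):
    """Return the element's text plus its serialized children, without its own tag."""
    # element.text is None when the element has no leading text.
    parts = [element.text or ""]
    for child in element:
        # Serializing a child also emits the child's tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

root = ET.fromstring("<p>blah <b>bleh</b> blih</p>")
result = inner_xml(root)
print(result)

empty_result = inner_xml(ET.fromstring("<p/>"))
```

One caveat: `element.text` holds the decoded text, so a source like `a &amp; b` comes back as `a & b`. If the result must itself be valid XML, the leading text would need to be escaped again, for example with `xml.sax.saxutils.escape`.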