Chu*_*uck 4 html python xml parsing elementtree
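Thank you very much for reading. I apologize for such a beginner question, as I am sure the answer is simple. Any guidance is much appreciated.

I have an XML file that I am parsing with ElementTree, whose elements look like this: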
data.xml:
<?xml version="1.0" encoding="utf-8"?>
<listings>
  <listing id="26496000" dateFirstListed="2012-10-13" dateLastListed="2013-10-06" market="SALE" propertyType="DETACHED" bedrooms="4" latestAskingPrice="314950">
    <address key="u935d\xc2\xb70" udprn="50812465" line1="12 Millcroft" line2="Millhouse Green" town="SHEFFIELD" postcode="S36 9AR" />
    <description> SOME TEXT HERE </description>
  </listing>

I want to access the <description> tag and the <address key> attribute.
Using the guide at https://docs.python.org/2/library/xml.etree.elementtree.html, I wrote:

import xml.etree.ElementTree
data = xml.etree.ElementTree.parse('data.xml')
root = data.getroot()

and iterated over the child elements:

for child in root:
    print child.tag, child.attrib
>
listing {'dateLastListed': '2013-10-06', 'dateFirstListed': '2012-10-13', 'propertyType': 'DETACHED', 'latestAskingPrice': '314950', 'bedrooms': '4', 'id': '26496000', 'market': 'SALE'}

This only gives me the tag and attributes of each <listing> element. How do I change the expression above to access <address key> and <description>?
Edit: Following the guidance in the question "Parsing XML with Python - accessing elements", I tried writing:

for i in root.findall("listing"):
    description = i.find('description')
    print description.text

>
AttributeError: 'NoneType' object has no attribute 'text'
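You can iterate over the listings one by one and then get the inner description and address child elements. To access an element's XML attributes, use its .attrib attribute: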
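import xml.etree.ElementTree as ET

data = ET.parse('data.xml')
root = data.getroot()

for listing in root.findall("listing"):
    # <address> is a child element; its fields are stored as XML attributes
    address = listing.find('address')
    # findtext returns the text of <description>, or None if it is missing
    description = listing.findtext('description')
    print(description, address.attrib.get("key"))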
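
The AttributeError in the question's edit is what you get whenever find returns None and you then read .text from it. The question does not show which listing triggered it, so the following is only a minimal sketch under assumptions: the sample data is a shortened, made-up variant of the question's file (closed with </listings>, with a placeholder key value and a hypothetical second listing that has no children), and extract is a helper name invented here for illustration. It shows one way to guard against missing child elements:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: a shortened version of the question's listing plus
# a made-up second listing with no <address> or <description> child.
SAMPLE = """<listings>
<listing id="26496000" market="SALE">
<address key="example-key" line1="12 Millcroft" town="SHEFFIELD" postcode="S36 9AR" />
<description> SOME TEXT HERE </description>
</listing>
<listing id="0" market="SALE" />
</listings>"""


def extract(xml_text):
    """Return one (description, address key) pair per <listing>."""
    root = ET.fromstring(xml_text)
    rows = []
    for listing in root.findall("listing"):
        # findtext returns the default instead of failing when the child is absent
        description = listing.findtext("description", default="").strip()
        address = listing.find("address")
        # find returns None for a missing child, so check before using .attrib
        key = address.attrib.get("key") if address is not None else None
        rows.append((description, key))
    return rows


rows = extract(SAMPLE)
for description, key in rows:
    print(description, key)
```

With this approach a listing that lacks either child yields an empty description or a None key instead of raising, which may or may not be what you want; raising early can be preferable if every listing is supposed to have both.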