Noo*_*tor 1 python xml elementtree xml-parsing
My API should take a string and convert it to XML.
But I keep getting this error:
ParseError: mismatched tag: line 1, column 764
XML:
<?xml version="1.0" encoding="utf-8" ?>
<MasterDetails IssuerId="5" Version="12.2">
<XMLRequest />
<BookingDetails Amount="768" Comment="Hotel Travel Purchase" CurrencyCode="INR" PurchaseType="Hotel" SupplierName="SomeHotel" CardAlias="C_ALIAS" ValidFor="-1D" CurrencyType="B" />
<CDFs>
<CDF FieldName="Order Date" FieldValue="2015-01-01" />
</CDFs>
<SomeTag>
<Rule Action="A" Alias="MyAlias">
<Controls>
<OPMCCControl Negate="False"/>
<OPMIDControl />
<SomeControlsTags CumulativeLimit="768" MaxTrans="None" Period="C" />
<ValidityPeriod ValidFrom="2015-01-01 00:00:00.0 +0000" ValidTo="2015-01-11 00:00:00.0 +0000" />
</Controls>
</Rule>
</SomeTag>
</BookingDetails>
<Email EmailAddress="T@J.COM"/>
<MasterDetails />
Implementation:
tree = ET.ElementTree(ET.fromstring(kk.strip()))
I am fairly sure my XML string has all matching tags and is well formed, but I may still be missing something right in front of my eyes!
The BookingDetails tag is self-closed on this line:
<BookingDetails Amount="768" Comment="Hotel Travel Purchase" CurrencyCode="INR" PurchaseType="Hotel" SupplierName="SomeHotel" CardAlias="C_ALIAS" ValidFor="-1D" CurrencyType="B" />
But further down there is a separate closing BookingDetails tag, which has no open element left to match:
</BookingDetails>
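Also, the <MasterDetails /> on the last line does not close the root element; it opens and closes a new, empty one. It should be </MasterDetails> rather than <MasterDetails />.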
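As a minimal sketch of the two fixes, here is a cut-down version of the document parsed with the standard library's ElementTree. The shortened XML and the helper name parses are made up for illustration, and the sketch assumes BookingDetails was meant to wrap the elements that follow it, which is what the position of its closing tag suggests. If that is not the intent, deleting the stray </BookingDetails> works as well.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the question's XML with the same two mistakes:
# BookingDetails is self-closed but later closed again, and the
# root is "closed" with <MasterDetails /> instead of </MasterDetails>.
broken = (
    '<MasterDetails IssuerId="5" Version="12.2">'
    '<BookingDetails Amount="768" />'
    '<CDFs><CDF FieldName="Order Date" FieldValue="2015-01-01" /></CDFs>'
    '</BookingDetails>'
    '<MasterDetails />'
)

# Same content with both fixes applied (assumed intent: BookingDetails
# is the parent of CDFs).
fixed = (
    '<MasterDetails IssuerId="5" Version="12.2">'
    '<BookingDetails Amount="768">'
    '<CDFs><CDF FieldName="Order Date" FieldValue="2015-01-01" /></CDFs>'
    '</BookingDetails>'
    '</MasterDetails>'
)


def parses(text):
    """Return True if text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(parses(broken))
print(parses(fixed))

# With the fixed string, BookingDetails is a normal parent element.
root = ET.fromstring(fixed)
booking = root.find("BookingDetails")
print(booking.get("Amount"))
```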
Note that if you use lxml.etree, you can parse this XML in "recover" mode:
import lxml.etree as ET
parser = ET.XMLParser(recover=True)
tree = ET.ElementTree(ET.fromstring(data, parser=parser))
Alternatively, use BeautifulSoup with the "xml" feature:
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, "xml")
print(soup.prettify())
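If you would rather find the problem than have a lenient parser work around it, the ParseError raised by the standard library's ElementTree carries a position attribute holding the line and column where parsing failed. The sketch below is only illustrative: the helper name describe_parse_error and the sample string are made up. When the whole document is on a single line, as in the question, the column is the useful number for locating the offending tag.

```python
import xml.etree.ElementTree as ET


def describe_parse_error(text):
    """Return None if text parses, otherwise (line, column, message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        line, column = err.position  # where the parser gave up
        return line, column, str(err)
    return None


# Tiny sample with the same mistake as in the question: <b /> is
# self-closed, so the later </b> has nothing to close.
sample = '<a><b /></b></a>'
result = describe_parse_error(sample)
if result is not None:
    line, column, message = result
    print(message)
    # Show a little context around the reported column.
    print(sample[max(0, column - 10):column + 10])
```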