I need to load an XML file and convert its contents into an object-oriented Python structure. I want to take this:
<main>
    <object1 attr="name">content</object1>
</main>
and turn it into something like this:
main
main.object1 = "content"
main.object1.attr = "name"
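The real XML data will have a more complicated structure than this, and I can't hard-code the element names. The attribute names need to be collected during parsing and used as object properties.
How can I convert XML data into a Python object?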
Pet*_*ann 48
It's worth taking a look at lxml.objectify.
xml = """<main>
<object1 attr="name">content</object1>
<object1 attr="foo">contenbar</object1>
<test>me</test>
</main>"""
from lxml import objectify
main = objectify.fromstring(xml)
main.object1[0]             # content
main.object1[1]             # contenbar
main.object1[0].get("attr") # name
main.test                   # me
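Since the question rules out hard-coded element names, here is a minimal sketch of discovering them at run time from the same document. The helper name `describe` is made up for illustration; `iterchildren()`, `.tag`, `.text` and `.attrib` are the regular lxml element API.

```python
from lxml import objectify

xml = """<main>
<object1 attr="name">content</object1>
<object1 attr="foo">contenbar</object1>
<test>me</test>
</main>"""

main = objectify.fromstring(xml)


def describe(element):
    """Hypothetical helper: (tag, text, attributes) for each direct child."""
    rows = []
    # iterchildren() walks all children; iterating an objectify element
    # directly would only walk same-named siblings
    for child in element.iterchildren():
        rows.append((child.tag, child.text, dict(child.attrib)))
    return rows


children = describe(main)
for tag, text, attrs in children:
    print(tag, text, attrs)
```

Once a tag name is known only as a string, `getattr(main, tag)` should give the same element that `main.object1` does.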
Or the other way around, to build XML structures:
item = objectify.Element("item")
item.title = "Best of python"
item.price = 17.98
item.price.set("currency", "EUR")
order = objectify.Element("order")
order.append(item)
order.item.quantity = 3
order.price = sum(item.price * item.quantity for item in order.item)
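import lxml.etree
# drop the py:pytype annotations objectify adds, so the serialized XML stays clean
objectify.deannotate(order, cleanup_namespaces=True)
# tostring() returns bytes on Python 3, so decode before printing
print(lxml.etree.tostring(order, pretty_print=True).decode())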
Output:
<order>
  <item>
    <title>Best of python</title>
    <price currency="EUR">17.98</price>
    <quantity>3</quantity>
  </item>
  <price>53.94</price>
</order>
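If lxml is not an option, the exact `main.object1.attr` shape from the question can be approximated with the standard library. This is only a rough sketch: `Node` is a hypothetical class, not part of any library, and it assumes tag and attribute names are valid Python identifiers and do not collide with each other.

```python
import xml.etree.ElementTree as ET


class Node(str):
    """Hypothetical wrapper: a str holding the element text, with XML
    attributes and child elements exposed as Python attributes."""

    def __new__(cls, element):
        node = super().__new__(cls, (element.text or "").strip())
        # XML attributes become plain string attributes: node.attr == "name"
        for name, value in element.attrib.items():
            setattr(node, name, value)
        # group children by tag, so no element name is hard-coded
        grouped = {}
        for child in element:
            grouped.setdefault(child.tag, []).append(cls(child))
        for tag, nodes in grouped.items():
            # a single child is exposed directly, repeated tags as a list
            setattr(node, tag, nodes[0] if len(nodes) == 1 else nodes)
        return node


main = Node(ET.fromstring(
    '<main><object1 attr="name">content</object1><test>me</test></main>'
))
print(main.object1)       # content
print(main.object1.attr)  # name
```

Subclassing `str` is a design shortcut so that `main.object1` compares equal to `"content"` while still carrying attributes. The cost is that an element named like a string method (for example `title`) would shadow that method on the instance.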
I've recommended this more than once today, but try Beautiful Soup (easy_install BeautifulSoup).
# Beautiful Soup 4 on Python 3 (the bs4 package, pip install beautifulsoup4)
from bs4 import BeautifulSoup
xml = """
<main>
    <object attr="name">content</object>
</main>
"""
# "html.parser" ships with Python; pass "xml" instead if lxml is installed
soup = BeautifulSoup(xml, "html.parser")
# look in the main node for object tags with attr="name"; attrs values may also be compiled regexes
my_objects = soup.main.find_all("object", attrs={"attr": "name"})
for my_object in my_objects:
    # this will print a list of the contents of the tag
    print(my_object.contents)
    # if only text is inside the tag you can use this
    # print(my_object.string)
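The snippet above still names the tag and attribute it looks for. As a sketch of collecting them without knowing the names up front, assuming Beautiful Soup 4 (the `bs4` package) and Python's built-in `html.parser`:

```python
from bs4 import BeautifulSoup

xml = """
<main>
    <object attr="name">content</object>
</main>
"""
soup = BeautifulSoup(xml, "html.parser")

# find_all(True) matches every tag, so no element name is hard-coded;
# tag.attrs is a dict of that tag's attribute names and values
found = [(tag.name, dict(tag.attrs)) for tag in soup.find_all(True)]
for name, attrs in found:
    print(name, attrs)
```

A single attribute can also be read with `tag["attr"]`, or with `tag.get("attr")` when it may be missing.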