Ple*_*Guy 9 python xml parsing lxml
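*Note: lxml will not run on my system. I am hoping to find a solution that does not involve lxml.
I have already looked through some of the documentation around here, and I am having trouble getting this to work the way I want. I would like to parse some XML files that look like this: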
<dict>
<key>1375</key>
<dict>
<key>Key 1</key><integer>1375</integer>
<key>Key 2</key><string>Some String</string>
<key>Key 3</key><string>Another string</string>
<key>Key 4</key><string>Yet another string</string>
<key>Key 5</key><string>Strings anyone?</string>
</dict>
</dict>
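In the file I am trying to manipulate, there are more "dict" elements following this one. I would like to read through the XML and output a text/dat file that looks like this:
1375,"Some String","Another string","Yet another string","Strings anyone?"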
...
EOF
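**Originally I tried to use lxml, but after many attempts to get it working on my system, I moved on to DOM. More recently I tried using Etree for this task. For the love of all that is good, would somebody please help me with this? I am new to Python and would like to understand how this works. Thank you in advance.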
Joh*_*hin 10
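You can use xml.etree.ElementTree, which ships with Python. There is also a bundled companion C implementation (i.e. much faster), xml.etree.cElementTree. lxml.etree offers a superset of the functionality, but it is not needed for what you want to do.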
The code provided by @Acorn works identically for me (Python 2.7, Windows 7) with each of the following imports:
import xml.etree.ElementTree as et
import xml.etree.cElementTree as et
import lxml.etree as et
...
tree = et.fromstring(xmltext)
...
What OS are you using, and what installation problems have you had with lxml?
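Since the three imports above are interchangeable for this task, one common way to stay portable is an import fallback. This is only a sketch, not part of @Acorn's code: it assumes you want the C module where it is available, and it falls back to the pure-Python name because xml.etree.cElementTree has been removed from newer Python 3 releases. The sample string and the name first_integer are made up for illustration.

```python
# Prefer the C-accelerated module where it still exists; otherwise fall back
# to xml.etree.ElementTree, which exposes the same fromstring/find API.
try:
    import xml.etree.cElementTree as et
except ImportError:
    import xml.etree.ElementTree as et

# Hypothetical minimal input, only here to show that the API is the same
# whichever import succeeded.
xmltext = "<dicts><dict><integer>1375</integer></dict></dicts>"
tree = et.fromstring(xmltext)
first_integer = tree.find('dict/integer').text
print(first_integer)  # prints 1375
```

Either branch binds the same name, et, so the rest of the script does not need to know which module was loaded.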
import xml.etree.ElementTree as et
import csv

xmltext = """
<dicts>
    <key>1375</key>
    <dict>
        <key>Key 1</key><integer>1375</integer>
        <key>Key 2</key><string>Some String</string>
        <key>Key 3</key><string>Another string</string>
        <key>Key 4</key><string>Yet another string</string>
        <key>Key 5</key><string>Strings anyone?</string>
    </dict>
</dicts>
"""

# the with block closes the file when done, so every row is flushed to disk
with open('output.txt', 'w') as f:
    writer = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
    tree = et.fromstring(xmltext)

    # iterate over the dict elements
    for dict_el in tree.iterfind('dict'):
        data = []
        # get the text contents of each non-key element
        for el in dict_el:
            if el.tag == 'string':
                data.append(el.text)
            # if it's an integer element convert to int so csv won't quote it
            elif el.tag == 'integer':
                data.append(int(el.text))
        writer.writerow(data)
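Note that the sample in the question has an outer <dict> as its root rather than <dicts>. A minimal sketch of the same idea against that exact structure might look like the following. It assumes Python 3 and that every record is a <dict> sitting directly under the root; the helper names rows_from_dicts and to_csv are made up for this example, and the output goes to a string instead of a file so it is easy to inspect.

```python
import csv
import io
import xml.etree.ElementTree as et

# The XML exactly as posted in the question (root element is <dict>).
xmltext = """
<dict>
<key>1375</key>
<dict>
<key>Key 1</key><integer>1375</integer>
<key>Key 2</key><string>Some String</string>
<key>Key 3</key><string>Another string</string>
<key>Key 4</key><string>Yet another string</string>
<key>Key 5</key><string>Strings anyone?</string>
</dict>
</dict>
"""


def rows_from_dicts(text):
    """Return one list of values per <dict> found directly under the root."""
    root = et.fromstring(text)
    rows = []
    for record in root.findall('dict'):
        row = []
        for el in record:
            if el.tag == 'integer':
                # keep integers as int so the csv writer leaves them unquoted
                row.append(int(el.text))
            elif el.tag == 'string':
                row.append(el.text)
            # <key> elements are skipped on purpose
        rows.append(row)
    return rows


def to_csv(rows):
    """Render the rows as CSV text, quoting every non-numeric field."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator='\n')
    writer.writerows(rows)
    return buf.getvalue()


print(to_csv(rows_from_dicts(xmltext)), end='')
# prints: 1375,"Some String","Another string","Yet another string","Strings anyone?"
```

To write a file instead, pass a handle opened with open('output.txt', 'w', newline='') to csv.writer; the newline='' argument is what the csv module documentation recommends on Python 3.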