Lar*_* M. 4 python lxml html-xml-utils
I am trying to parse a local HTML file with lxml, but I get an error and I don't know why (apologies in advance for the bad code, I'm a beginner).
from lxml import etree, html
from StringIO import StringIO
parser = etree.HTMLParser()
doc = etree.parse(StringIO("test1.html"), parser)
tree = html.fromstring(doc)
CCE = tree.xpath('//div[@data-reactid]/div[@class="browse-summary"]/h1')
URL = tree.xpath('//a[@class="rc-OfferingCard"]/@href')
print 'CCE:', CCE
print 'URL:', URL
Here is the error:
File "test.py", line 8, in <module>
tree = html.fromstring(doc)
File "/usr/lib/python2.7/dist-packages/lxml/html/__init__.py", line 703, in fromstring
is_full_html = _looks_like_full_html_unicode(html)
TypeError: expected string or buffer
I think you need
tree = etree.parse("test1.html", parser)
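without StringIO and fromstring. StringIO("test1.html") wraps the literal file name in a file-like object instead of opening the file, and html.fromstring expects a string, not the tree that etree.parse returns, which is what raises the TypeError.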
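To make the difference concrete, here is a minimal sketch in Python 3 syntax (the question uses Python 2). The SAMPLE markup, the temporary file, and the helper names parse_file and parse_text are assumptions made up for illustration, not part of the original page, and the XPath for the heading has /text() appended so that it prints the heading text rather than an element object.

```python
import os
import tempfile

from lxml import etree, html

# Hypothetical stand-in for the contents of test1.html, so the sketch runs on its own.
SAMPLE = (
    '<html><body>'
    '<div data-reactid="1"><div class="browse-summary"><h1>Courses</h1></div></div>'
    '<a class="rc-OfferingCard" href="/learn/python">Python</a>'
    '</body></html>'
)


def parse_file(path):
    """Parse an HTML file on disk; etree.parse accepts a file name directly."""
    return etree.parse(path, etree.HTMLParser())


def parse_text(text):
    """Parse HTML that is already held in a string with html.fromstring."""
    return html.fromstring(text)


# Write the sample to a temporary test1.html and parse it by file name.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test1.html")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(SAMPLE)
    tree = parse_file(path)

headings = tree.xpath('//div[@data-reactid]/div[@class="browse-summary"]/h1/text()')
urls = tree.xpath('//a[@class="rc-OfferingCard"]/@href')
print("CCE:", headings)
print("URL:", urls)
```

Use parse_file when the HTML lives in a file and parse_text when you already have the markup as a string; mixing the two, as in the question, is what leads to the TypeError.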