NameError: 'html' is not defined with beautifulsoup4

Question

Tar*_*day 2 html python beautifulsoup python-3.x

My Python 3.4.4 code is:

import urllib.request
from bs4 import BeautifulSoup
from html.parser import HTMLParser

urls = 'file:///C:/Users/tarunuday/Documents/scrapdata/mech.html'
htmlfile = urllib.request.urlopen(urls)
soup = BeautifulSoup(htmlfile,html.parser)

I get this error:

Traceback (most recent call last):
    File "C:\Python34\saved\scrapping\scrapping2.py", line 7, in <module>
    soup = BeautifulSoup(htmlfile,html.parser)
    NameError: name 'html' is not defined

I understand that HTMLParser is for py2.x and html.parser is for py3.x, but how can I make this work? The BS4 site says "If you get the ImportError “No module named html.parser”, your problem is that you’re running the Python 3 version of the code under Python 2.", but I am running 3.x and I get a NameError, not an ImportError.

Answer 1

Dan*_*man 5

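The error is correct: you haven't defined html anywhere. The documentation you linked to shows that you should pass "html.parser" as a string; it looks like you don't need to import HTMLParser at all.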
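
For illustration, here is a minimal sketch of the call with the parser name passed as a string. The inline markup is a made-up stand-in for the contents of mech.html, used only so the snippet runs on its own. The reason for the NameError rather than an ImportError is that `from html.parser import HTMLParser` binds only the name `HTMLParser`; it never creates a variable called `html`, so the bare expression `html.parser` cannot be evaluated.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup standing in for the contents of mech.html.
markup = "<html><head><title>Mech</title></head><body><p>Hello</p></body></html>"

# The parser is chosen by name, as a string.
# No import from html.parser is needed for this call.
soup = BeautifulSoup(markup, "html.parser")

print(soup.title.string)
```

In the script from the question, the same change would be `soup = BeautifulSoup(htmlfile, "html.parser")`, and the `from html.parser import HTMLParser` line can be dropped.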

Archived:	9 years, 8 months ago
Views:	11,462
Last activity:	5 years, 1 month ago