gol*_*ine 4 html python mechanize beautifulsoup
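I am using the .get_data() method with mechanize, and it appears to print the HTML I want. I also checked the type of what gets printed, and it is 'str'.
However, when I try to parse that str with BeautifulSoup, I get the following error: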
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-163-11c061bf6c04> in <module>()
7 html = get_html(first[i],last[i])
8 print type(html)
----> 9 print parse_page(html)
10 # l_to_store.append(parse_page(html))
11 # hfb_data['l_to_store']=l_to_store
<ipython-input-161-bedc1ba19b10> in parse_hfb_page(html)
3 parse html to extract info in connection with a particular person
4 '''
----> 5 soup = BeautifulSoup(html)
6 for el in soup.find_all('li'):
7 if el.find('span').contents[0]=='Item:':
TypeError: 'module' object is not callable
What exactly is the 'module' here? And how do I get what get_data() returns into html?
When you import BeautifulSoup like this:
import BeautifulSoup
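you are importing the module, which contains classes, functions, and so on. The name BeautifulSoup then refers to the module, not the class, so calling it raises the TypeError. To instantiate the BeautifulSoup class from the BeautifulSoup module, either import the class itself or use its full name, including the module prefix (as yonili suggested in the comment above):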
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(html)
or
import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup(html)
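
The same error appears with any module that shares its name with a class inside it. Below is a minimal sketch that uses the standard library's datetime module as a stand-in. That choice is an assumption made so the example runs without BeautifulSoup installed; it is not part of the original question. The name `DateTimeClass` is likewise only illustrative.

```python
import inspect
import datetime  # binds the name `datetime` to the module, not to the class

# The imported name is a module object, which cannot be called.
print(inspect.ismodule(datetime))  # True

try:
    datetime(2020, 1, 1)  # same mistake as BeautifulSoup(html) after `import BeautifulSoup`
except TypeError as exc:
    error_message = str(exc)

# The message starts with "'module' object is not callable"
print(error_message)

# Fix 1: import the class from the module (aliased here to keep both names usable).
from datetime import datetime as DateTimeClass
a = DateTimeClass(2020, 1, 1)

# Fix 2: keep the module import and use the fully qualified name.
b = datetime.datetime(2020, 1, 1)

print(a == b)  # True: both routes reach the same class
```

Checking `type(BeautifulSoup)` or `inspect.ismodule(BeautifulSoup)` in the failing session would likely confirm that the name is bound to the module rather than the class.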