Cod*_*din 3 python beautifulsoup python-idle
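I am working through a tutorial on how to use BeautifulSoup. I am trying to pull the names out of the links on an HTML page that I downloaded. Up to this point it works fine.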
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("43rd-congress.html"))
final_link = soup.p.a
final_link.decompose()
links = soup.find_all('a')
for link in links:
    print link
But when I move on to the next part
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("43rd-congress.html"))
final_link = soup.p.a
final_link.decompose()
links = soup.find_all('a')
for link in links:
    names = link.contents[0]
    fullLink = link.get('href')
    print names
    print fullLink
I get this error:
Traceback (most recent call last):
  File "C:/Python27/python tutorials/soupexample.py", line 13, in <module>
    print names
  File "C:\Python27\lib\idlelib\PyShell.py", line 1325, in write
    return self.shell.write(s, self.tags)
  File "C:\Python27\lib\idlelib\rpc.py", line 595, in __call__
    value = self.sockio.remotecall(self.oid, self.name, args, kwargs)
  File "C:\Python27\lib\idlelib\rpc.py", line 210, in remotecall
    seq = self.asynccall(oid, methodname, args, kwargs)
  File "C:\Python27\lib\idlelib\rpc.py", line 225, in asynccall
    self.putmessage((seq, request))
  File "C:\Python27\lib\idlelib\rpc.py", line 324, in putmessage
    s = pickle.dumps(message)
  File "C:\Python27\lib\copy_reg.py", line 74, in _reduce_ex
    getstate = self.__getstate__
RuntimeError: maximum recursion depth exceeded
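This is a bad interaction between IDLE and BeautifulSoup's NavigableString objects (which subclass unicode). See issue 1757057; it has been around for a while.
The workaround is to convert the object to a plain unicode value first: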
print unicode(names)
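To illustrate why the conversion helps, here is a minimal sketch that does not use BeautifulSoup at all. `LinkedText` is a hypothetical stand-in for NavigableString (a text subclass that also holds a reference to other objects), and `to_plain_text` is just an illustrative helper name. It only shows that the conversion yields a plain text object without the extra attributes; it does not reproduce the IDLE error itself.

```python
# Hypothetical stand-in for a text subclass such as NavigableString.
# text_type is unicode on Python 2 and str on Python 3.
text_type = type(u"")


class LinkedText(text_type):
    """Text subclass that carries an extra reference (illustrative only)."""

    def __new__(cls, value, parent=None):
        obj = text_type.__new__(cls, value)
        obj.parent = parent  # extra state a plain text object does not have
        return obj


def to_plain_text(value):
    """Return a plain text object holding the same characters."""
    return text_type(value)


name = LinkedText(u"Example Name", parent={"tag": "a"})
plain = to_plain_text(name)

print(type(name).__name__)   # the subclass
print(type(plain).__name__)  # the built-in text type
```

In the loop from the question, the same idea is just `print unicode(names)` on Python 2, so that IDLE is handed an ordinary unicode object instead of the subclass instance.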