python UnicodeEncodeError > How can I simply remove troubling unicode characters?

Nul*_*oet 6 python unicode parsing html-parsing

Here's what I did:

>>> soup = BeautifulSoup (html)
>>> soup
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 96953: ordinal not in range(128)
>>> 
>>> soup.find('div')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 11035: ordinal not in range(128)
>>> 
>>> soup.find('span')
<span id="navLogoPrimary" class="navSprite"><span>amazon.com</span></span>
>>> 

How can I remove the troubling unicode characters from html?
Or is there a cleaner solution?

小智 10

Try it this way: soup = BeautifulSoup(html.decode('utf-8', 'ignore'))
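
To see what the 'ignore' error handler does on its own, here is a minimal sketch that leaves BeautifulSoup out entirely. The sample bytes and the helper names to_text and to_ascii are made up for illustration and are not part of the original post. It assumes the page arrives as UTF-8 bytes. The traceback above is an encode error, which suggests the failure happens when the result is converted for display on an ASCII console, so the second helper shows the matching encode-side cleanup.

```python
# Hypothetical sample input: raw bytes as they might come back from an HTTP
# response. b"\xc2\xae" is the UTF-8 encoding of the registered sign (U+00AE);
# b"\xff" is a stray byte that is not valid UTF-8.
raw_html = b"<div>Kindle\xc2\xae <span>amazon.com</span>\xff</div>"


def to_text(raw, encoding="utf-8"):
    """Decode bytes to text, silently dropping bytes invalid in `encoding`."""
    return raw.decode(encoding, "ignore")


def to_ascii(text):
    """Drop every non-ASCII character so the text is safe on an ASCII console."""
    return text.encode("ascii", "ignore").decode("ascii")


text = to_text(raw_html)     # keeps the registered sign, drops the stray byte
ascii_only = to_ascii(text)  # additionally drops the registered sign

print(ascii_only)
```

Note that 'ignore' discards data without warning. If the characters matter, passing 'replace' instead of 'ignore' keeps a visible placeholder where something was dropped.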