Nul*_*oet 6 python unicode parsing html-parsing
Here's what I did:
>>> soup = BeautifulSoup(html)
>>> soup
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 96953: ordinal not in range(128)
>>>
>>> soup.find('div')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xae' in position 11035: ordinal not in range(128)
>>>
>>> soup.find('span')
<span id="navLogoPrimary" class="navSprite"><span>amazon.com</span></span>
>>>
How can I remove the troublesome Unicode characters from html?
Or is there a cleaner solution?
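For illustration, here is a minimal sketch of the two options being asked about, using only the standard library. The assignment `soup = BeautifulSoup(html)` succeeds in the transcript, so the error appears to come from displaying the result on an ASCII-only console rather than from parsing. The helper names `strip_non_ascii` and `escape_non_ascii` and the sample string are hypothetical, not taken from the page being parsed.

```python
def strip_non_ascii(text):
    """Drop every character outside the ASCII range (lossy)."""
    return text.encode("ascii", "ignore").decode("ascii")


def escape_non_ascii(text):
    """Replace non-ASCII characters with HTML numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical sample containing u'\xae' (the registered-trademark sign),
# the same character reported in the traceback.
sample = u"<div>Kindle\xae</div>"

stripped = strip_non_ascii(sample)
escaped = escape_non_ascii(sample)

print(stripped)
print(escaped)
```

Stripping loses information, while the character-reference form keeps the page content intact and is still safe to print on an ASCII console, so the second option may be the cleaner one when the text is going to be shown or stored as HTML.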