Posts by Mir*_*ach

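BeautifulSoup HTML parsing breaks the <link> tag

I am using Beautiful Soup to parse the markup of an RSS feed. How can I preserve the contents of the <link> tags?

The most promising code I have so far is: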

python
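import urllib.request
from bs4 import BeautifulSoup

url = 'https://advisories.ncsc.nl/rss/advisories'
# Download the raw feed bytes and close the connection afterwards
with urllib.request.urlopen(url) as uh:
    html_doc = uh.read()
# Parse with the built-in HTML parser (this is where <link> comes out as <link/>)
soup = BeautifulSoup(html_doc, 'html.parser')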


I tried adding import lxml and switching the parser call to soup = BeautifulSoup(html_doc, 'xml'), but that gave me an error:

ModuleNotFoundError: No module named 'lxml'
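
A note on the error above: it is raised by the import lxml line itself, because lxml is a third-party package that is not installed in this environment (it is usually installed with pip install lxml). Beautiful Soup's 'xml' feature relies on lxml, so the XML parser is unavailable until that package is present. The following is a minimal sketch of checking for it before choosing a parser; the helper name lxml_available is hypothetical and not part of Beautiful Soup.

```python
import importlib.util


def lxml_available():
    """Return True if the third-party lxml package can be imported.

    Hypothetical helper: it only looks the module up and does not import it.
    """
    return importlib.util.find_spec("lxml") is not None


if lxml_available():
    print("lxml found: BeautifulSoup(html_doc, 'xml') should be usable")
else:
    print("lxml missing: install it first, e.g. pip install lxml")
```

Checking with find_spec avoids the ModuleNotFoundError and lets the script print a clear hint instead of a traceback.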


I expected the result to be <link>https://someurl.org</link>, but the output is <link/>someurl.org
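
The <link/>someurl.org output is consistent with the feed being parsed as HTML, where <link> is an empty element that cannot hold text, so the URL ends up outside the tag. One possible workaround that needs no extra packages is to parse the feed as XML with the standard library's xml.etree.ElementTree. The sketch below assumes a small inline sample in place of the real feed; SAMPLE_RSS and extract_links are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the bytes returned by the real feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>https://someurl.org</link>
    <item>
      <title>First advisory</title>
      <link>https://someurl.org/advisory-1</link>
    </item>
  </channel>
</rss>"""


def extract_links(rss_text):
    """Return the text of every <link> element, in document order."""
    root = ET.fromstring(rss_text)
    return [element.text for element in root.iter("link")]


print(extract_links(SAMPLE_RSS))
# ['https://someurl.org', 'https://someurl.org/advisory-1']
```

With the real feed, the bytes read from urlopen could be passed to ET.fromstring in the same way. If Beautiful Soup is preferred, the 'xml' parser should give a comparable result once lxml is installed.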

python beautifulsoup xml-parsing

Mir*_*ach

2019-07-21

Score: 5

Answers: 1

Views: 47
