Posts by Kim*_*ung

How to write output to an HTML file using Python BeautifulSoup

I modified an HTML file by removing some tags with BeautifulSoup. Now I want to write the result back to an HTML file. My code:

from bs4 import BeautifulSoup, Comment

# Parse the source file; the context manager closes the handle afterwards.
with open("1.html") as f:
    soup = BeautifulSoup(f, "html.parser")

# Remove the unwanted elements from the tree.
for tag in soup.find_all(["script", "style", "meta", "noscript"]):
    tag.extract()

# Remove HTML comments.
for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
    comment.extract()

# Print each top-level node of the modified document.
for node in soup.contents:
    print(node)

# prettify("utf-8") returns bytes, so the output file is opened in binary mode.
html = soup.prettify("utf-8")
with open("output1.html", "wb") as file:
    file.write(html)

Because I used soup.prettify, it generates HTML like this:

<p>
    <strong>
     BATAM.TRIBUNNEWS.COM, BINTAN
    </strong>
    - Tradisi pedang pora mewarnai serah terima jabatan pejabat di
    <a href="http://batam.tribunnews.com/tag/polres/" title="Polres">
     Polres
    </a>
    <a …
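The question is cut off at this point, so the following is only a sketch under an assumption: that the goal is to write the modified document without the extra line breaks and indentation that `prettify()` inserts. One common approach is to serialize the tree with `str(soup)` instead, which keeps the original whitespace between tags. The sample markup, the `strip_tags` helper, and the temporary output path are hypothetical and exist only to make the example self-contained.

```python
import os
import tempfile

from bs4 import BeautifulSoup, Comment

# Hypothetical sample input standing in for the contents of 1.html.
SAMPLE_HTML = (
    "<html><head><meta charset='utf-8'><style>p{}</style></head>"
    "<body><!-- note --><p><strong>BATAM</strong> - text "
    "<a href='http://example.com/'>Polres</a></p>"
    "<script>var x = 1;</script></body></html>"
)


def strip_tags(html_text):
    """Parse html_text and drop script/style/meta/noscript tags and comments."""
    soup = BeautifulSoup(html_text, "html.parser")
    for tag in soup.find_all(["script", "style", "meta", "noscript"]):
        tag.decompose()
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        comment.extract()
    return soup


soup = strip_tags(SAMPLE_HTML)

# str(soup) serializes the tree without the reformatting prettify() applies.
compact = str(soup)

# Write to a temporary directory here; a real script would use its own path.
out_path = os.path.join(tempfile.mkdtemp(), "output1.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(compact)

print(compact)
```

If bytes are preferred, `soup.encode("utf-8")` together with a file opened in `"wb"` mode should give the same markup without the added indentation.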

html python beautifulsoup

28 votes, 3 answers, 30k views
