I want to convert this:
<b><i><u>Charming boutique selling trendy casual &amp; dressy apparel for women, some plus sized items, swimwear, shoes &amp; jewelry.</u></i></b>
to this:
Charming boutique selling trendy casual dressy apparel for women, some plus sized items, swimwear, shoes jewelry.
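I'm confused about how to remove not only the special characters but also some of the letters between them. Can anyone suggest a way to do this?
You can use the html module and BeautifulSoup to get the text with the tags stripped and the entities unescaped: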
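from html import unescape
from bs4 import BeautifulSoup

s = "<b><i><u>Charming boutique selling trendy casual &amp; dressy apparel for women, some plus sized items, swimwear, shoes &amp; jewelry.</u></i></b>"
# unescape() decodes entities such as &amp;; BeautifulSoup then drops the tags.
# The 'lxml' parser requires the lxml package to be installed.
soup = BeautifulSoup(unescape(s), 'lxml')
print(soup.text)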
Prints:
Charming boutique selling trendy casual & dressy apparel for women, some plus sized items, swimwear, shoes & jewelry.
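Note that the output above still contains the & characters, whereas the target string in the question has them removed. If that is what you need, here is a minimal sketch using only the standard library. The names strip_markup and drop_chars are hypothetical helpers made up for this example, not part of any library, and it assumes that dropping every & and collapsing the leftover whitespace is acceptable for your data:

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes only; tags are ignored."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; into &
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(markup, drop_chars="&"):
    """Hypothetical helper: strip tags, drop unwanted characters, tidy spaces."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    for ch in drop_chars:
        text = text.replace(ch, "")
    # Removing "&" leaves a double space behind; collapse whitespace runs.
    return re.sub(r"\s+", " ", text).strip()


s = "<b><i><u>Charming boutique selling trendy casual &amp; dressy apparel for women, some plus sized items, swimwear, shoes &amp; jewelry.</u></i></b>"
result = strip_markup(s)
print(result)
```

For the sample string this should give the text from the question without the ampersands. If you only want the tags gone and the & kept, pass drop_chars="".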