Replace <br /> with a comma using Beautiful Soup

Owe*_*wen 1 html python beautifulsoup

The line of HTML in question was posted as a screenshot.

I have managed to fetch it from this URL:

import requests
from bs4 import BeautifulSoup as soup

url = 'https://www.saa.gov.uk/search/?SEARCHED=1&ST=&SEARCH_TERM=city+of+edinburgh%2C+BOSWALL+PARKWAY%2C+EDINBURGH&ASSESSOR_ID=&SEARCH_TABLE=valuation_roll_cpsplit&DISPLAY_COUNT=10&TYPE_FLAG=CP&ORDER_BY=PROPERTY_ADDRESS&H_ORDER_BY=SET+DESC&DRILL_SEARCH_TERM=BOSWALL+PARKWAY%2C+EDINBURGH&DD_TOWN=EDINBURGH&DD_STREET=BOSWALL+PARKWAY&UARN=110B60329&PPRN=000000000001745&ASSESSOR_IDX=10&DISPLAY_MODE=FULL#results'

baseurl = 'https://www.saa.gov.uk'
session = requests.session()
response = session.get(url)

# content of search page in soup
html = soup(response.content, "lxml")

# NOTE: the definition of LeftBlockData was not included in the question
Address = LeftBlockData[3].get_text().strip()
print(Address)

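However, it prints like this: '29 BOSWALL PARKWAYEDINBURGHEH5 2BR'

Each <br /> between the pieces of text has been replaced with nothing, not even a space.

I would like to put a comma where each <br /> currently is.

Could anyone recommend a way to do this?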

Zro*_*roq 6

When getting the text of a node, you can set a separator.

from bs4 import BeautifulSoup

example = """<td rowspan="1">29 BOSWALL PARKWAY<br />EDINBURGH<br />EHS 2BR</td>"""

soup = BeautifulSoup(example, "xml")

print(soup.find("td").get_text(strip=True, separator=','))

Output:

29 BOSWALL PARKWAY,EDINBURGH,EHS 2BR
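
If you want a space after each comma, or would rather not depend on lxml, the same idea can be sketched as follows. This is only an illustration: the markup is a hand-typed stand-in for the cell in the question (using the postcode as the question prints it), and the names `cell`, `address_a` and `address_b` are made up for the example.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the cell in the question (an assumption, not the live page).
html = '<td rowspan="1">29 BOSWALL PARKWAY<br />EDINBURGH<br />EH5 2BR</td>'

# "html.parser" ships with Python, so no extra parser has to be installed.
cell = BeautifulSoup(html, "html.parser").find("td")

# Option 1: let get_text() put the separator between the text nodes.
address_a = cell.get_text(separator=", ", strip=True)

# Option 2: join the whitespace-stripped strings yourself.
address_b = ", ".join(cell.stripped_strings)

print(address_a)
print(address_b)
```

Both lines should print `29 BOSWALL PARKWAY, EDINBURGH, EH5 2BR`. In the question's code, that would mean calling `get_text(separator=", ", strip=True)` on `LeftBlockData[3]` instead of `get_text().strip()`.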