BeautifulSoup: replace colspan=2 with a single column

Eri*_*ins 1 beautifulsoup

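I am trying to parse data from rows that occasionally contain a cell with colspan=2, which breaks my ability to extract the data I am after. What I would like to do is remove "colspan=2" from the table element each time it occurs: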

#replace
<td colspan="2" class="time">10:00 AM</td>
#with
<td>635</td>

Is this possible? Can I turn it into an if/then/else conditional?

Here is a more detailed example:

<table>
<tr class="playerRow even">
<td class="pos">1</td>
<td><span class="rank"></span> -</td>
<td class="player"><p class="playerName">John doe</p></td>
<td class="background">X</td>
<td>345</td> #THIS ELEMENT FREQUENT
<td></td>
<td></td>
<td></td>
<td></td>
<td style=""></td>
</tr>

<tr class="playerRow odd">
<td class="pos">1</td>
<td><span class="rank"></span> -</td>
<td class="player"><p class="playerName">John doe</p></td>
<td class="background">X</td>
<td colspan="2" class="myClass" style="">3:15 PM</td> #THIS ELEMENT OCCASIONAL
<td></td>
<td></td>
<td></td>
<td></td>
<td style=""></td>
</tr>

<tr class="playerRow odd">
<td class="pos">1</td>
<td><span class="rank"></span> -</td>
<td class="player"><p class="playerName">John doe</p></td>
<td class="background">X</td>
<td>22</td> #THIS ELEMENT FREQUENT
<td></td>
<td></td>
<td></td>
<td></td>
<td style=""></td>
</tr>
</table>

So whenever I encounter a colspan, I want to replace it with a plain td so that it does not shift the row's elements and throw off my count.

sca*_*an_ 5

This converts:

<td colspan="2" class="myClass" style="">3:15 PM</td>

into:

<td>3:15 PM</td>

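from bs4 import BeautifulSoup

# html holds the table markup shown in the question
bs = BeautifulSoup(html, "html.parser")

# Strip every attribute from any <td> that carries a colspan
for x in bs.find_all("td"):
    if "colspan" in x.attrs:
        x.attrs = {}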

Do you also want it to remove the value?
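In case it helps, here is a minimal sketch of a variant that drops only the colspan attribute (so class and style survive) and can optionally empty the cell as well. The helper name normalize_colspan and its clear_text flag are made up for illustration, and the sample markup is a shortened version of the row from the question:

```python
from bs4 import BeautifulSoup

# Shortened sample row based on the question's markup
html = """
<table>
<tr class="playerRow odd">
<td class="pos">1</td>
<td colspan="2" class="myClass" style="">3:15 PM</td>
<td></td>
</tr>
</table>
"""

def normalize_colspan(soup, clear_text=False):
    """Hypothetical helper: drop colspan from every <td>, optionally empty the cell."""
    for td in soup.find_all("td", attrs={"colspan": True}):
        del td["colspan"]   # remove only colspan; other attributes are kept
        if clear_text:
            td.clear()      # also remove the cell's contents
    return soup

soup = normalize_colspan(BeautifulSoup(html, "html.parser"))
cells = [td.get_text(strip=True) for td in soup.find_all("td")]
print(cells)
```

Passing clear_text=True would leave an empty td in place of the time value, which may be what you want if only the numeric cells matter for your count.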