BeautifulSoup parses only one column instead of the entire Wikipedia table in Python

Question

big*_*019 2 html wikipedia beautifulsoup html-parsing python-3.x

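I am trying to parse the first table here using BeautifulSoup in Python. It parses the first column, but for some reason it does not parse the whole table. Any help is appreciated!

Note: I am trying to parse the entire table and convert it into a pandas DataFrame.

My code: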

import requests
from bs4 import BeautifulSoup

WIKI_URL = requests.get("https://en.wikipedia.org/wiki/NCAA_Division_I_FBS_football_win-loss_records").text
soup = BeautifulSoup(WIKI_URL, features="lxml")
print(soup.prettify())
my_table = soup.find('table',{'class':'wikitable sortable'})
links=my_table.findAll('a')
print(links)

Answer 1

B.A*_*ler 5

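It only parses one column because you only ran findAll on the items in the first column. To parse the whole table, you have to run findAll on the table rows (<tr>), and then, for each row, run findAll on its cells (<td>). Right now you are just doing a findAll for the links and then printing them.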

my_table = soup.find('table',{'class':'wikitable sortable'})
# Walk every row of the table, then every <td> cell in that row
for row in my_table.findAll('tr'):
    print(','.join([td.get_text(strip=True) for td in row.findAll('td')]))
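
Since the question mentions a pandas DataFrame, here is a minimal sketch of the same row-by-row idea that collects the cells instead of printing them. The sample HTML, the helper names `table_to_rows` and `table_to_dataframe`, and the choice of the built-in `html.parser` are my own assumptions for illustration, not something from the original post. It also reads <th> cells, so the header row is not lost.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the Wikipedia page,
# so the sketch runs without a network request.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Team</th><th>Won</th><th>Lost</th></tr>
  <tr><td><a href="/wiki/A">Team A</a></td><td>10</td><td>2</td></tr>
  <tr><td><a href="/wiki/B">Team B</a></td><td>7</td><td>5</td></tr>
</table>
"""


def table_to_rows(html):
    """Return (headers, rows) for the first 'wikitable sortable' table."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "wikitable sortable"})
    headers, rows = [], []
    for tr in table.find_all("tr"):
        # Take both header (<th>) and data (<td>) cells of this row
        texts = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
        if not texts:
            continue  # skip rows without any cells
        if tr.find("td") is None:
            headers = texts  # a row made only of <th> cells is the header
        else:
            rows.append(texts)
    return headers, rows


def table_to_dataframe(html):
    """Build a DataFrame from the parsed rows (assumes pandas is installed)."""
    import pandas as pd  # imported here so the parsing helper works without pandas

    headers, rows = table_to_rows(html)
    return pd.DataFrame(rows, columns=headers)


headers, rows = table_to_rows(SAMPLE_HTML)
print(headers)
print(rows)
```

For the real page you would pass the text returned by requests.get(...) instead of SAMPLE_HTML. Note that this sketch does not handle cells with rowspan or colspan, so a table that uses them may need extra handling before the rows line up with the headers.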


Archived:	6 years, 12 months ago
Views:	71
Last recorded:	6 years, 12 months ago