Getting a certain <td> class with BeautifulSoup

Question

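I'm trying to write some code that, as a first step, matches each player's name to his salary. I was able to get the name of every player on a given team by selecting the cells with the "sortcell" class, but I can't figure out how to get the salary, because those cells are all just plain <td> elements.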

from bs4 import BeautifulSoup
from urllib import urlopen

teams = ['http://espn.go.com/nba/team/roster/_/name/atl/atlanta-hawks']

for team in teams:
    html = urlopen('' + team)
    soup = BeautifulSoup(html.read(), 'lxml')
    names = soup.findAll("td", {"class": "sortcell"})
    salary = soup.findAll("td", {"class": "td"})
    print(salary)
    for i in range(1, 15):
        name = names[i].get_text()
        print(name)

You can see my (failed) attempt in the line that starts with "salary". Any ideas on how to get just the salary cells? Thanks!

Expected behavior:

The salary variable should contain the given player's salary, but currently it comes back empty.

Answer 1

Mar*_*ers 8

Your salary list is empty because the <td> elements that contain the salary information have no CSS class at all; certainly not one named td.
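To illustrate why the lookup returns nothing, here is a minimal sketch. The HTML fragment, player names, and salary figures are made up for illustration and are not ESPN's actual markup; the built-in html.parser is used so the snippet does not depend on lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical roster fragment: only the name cell carries a class.
SAMPLE_HTML = """
<table>
  <tr><td class="sortcell">Player A</td><td>C</td><td>$1,000,000</td></tr>
  <tr><td class="sortcell">Player B</td><td>G</td><td>$2,500,000</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Looks for <td class="td">; no cell has that class, so nothing matches.
by_class = soup.find_all("td", class_="td")

# Looks for every <td>, regardless of class.
all_cells = soup.find_all("td")

print(len(by_class), len(all_cells))
```

The class filter matches the value of the class attribute, not the tag name, which is why filtering on "td" finds no cells here.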

You'll have an easier time if you navigate from each names cell to the corresponding salary cell, which is the last one in the row:

for name in soup.find_all("td", class_="sortcell"):
    salary = name.parent.find_all('td')[-1]  # last cell in the row
    print(name.get_text())
    print(salary.get_text())

I used the soup.find_all() syntax; findAll() is the older BeautifulSoup 3 name for that method and has been deprecated.
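As a self-contained sketch of the same row navigation, the snippet below collects the pairs into a dictionary. The salaries_by_name helper and the sample HTML are hypothetical names and data introduced only for this example; it assumes, as above, that the salary is the last cell of each row.

```python
from bs4 import BeautifulSoup

# Hypothetical roster fragment standing in for the real page.
SAMPLE_HTML = """
<table>
  <tr><td class="sortcell">Player A</td><td>C</td><td>$1,000,000</td></tr>
  <tr><td class="sortcell">Player B</td><td>G</td><td>$2,500,000</td></tr>
</table>
"""


def salaries_by_name(html):
    """Map each name cell's text to the text of the last cell in its row."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for name_cell in soup.find_all("td", class_="sortcell"):
        # Walk up to the enclosing row, then take its last <td>.
        row = name_cell.find_parent("tr")
        salary_cell = row.find_all("td")[-1]
        result[name_cell.get_text(strip=True)] = salary_cell.get_text(strip=True)
    return result


print(salaries_by_name(SAMPLE_HTML))
```

Keeping the lookup relative to each name cell means a name and its salary always come from the same row, which is harder to guarantee when the two lists are fetched separately and paired by index.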

Archived:	10 years, 4 months ago
Views:	14,049
Last activity:	10 years, 4 months ago
