Posts by Kam*_*ish

Trying to use BeautifulSoup and Python 3 to pull an "href" out of a class

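I can't seem to get this to work. My script goes to a site and scrapes the data into my info variable, but when I try to pull the href out of a specific class I either get None or it simply fails, no matter which combination I try. Where am I going wrong? Once the data is scraped into my info variable, it contains an element with class='business-name' and an href.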

import requests
from bs4 import BeautifulSoup

count = 0
search_terms = "Bars"
location = "New Orleans, LA"
url = "https://www.yellowpages.com/search"
q = {'search_terms': search_terms, 'geo_location_terms': location}
page = requests.get(url, params=q)
url_link = page.url
page_num = str(count)
searched_page = url_link + '&page=' + str(count)
page = requests.get(searched_page)
soup = BeautifulSoup(page.text, 'html.parser')
info = soup.findAll('div', {'class': 'info'})
for each_business in info:
    # This is the spot that is broken. I can't make it work! 
    yp_bus_url …

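The failing line in the question is truncated, so what follows is only a guess at the cause. If `href` is read from the `div.info` element itself, the result is `None`, because the attribute sits on the nested `<a class="business-name">` tag. A minimal sketch of reading it from the anchor instead, using made-up sample markup (`SAMPLE_HTML` and `extract_business_links` are hypothetical names, and the real page structure may differ):

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the question's description: each
# div.info wraps an <a class="business-name" href="...">.
SAMPLE_HTML = """
<div class="info"><h2><a class="business-name" href="/new-orleans-la/mip/bar-one-1">Bar One</a></h2></div>
<div class="info"><h2><a class="business-name" href="/new-orleans-la/mip/bar-two-2">Bar Two</a></h2></div>
<div class="info"><h2>No link here</h2></div>
"""


def extract_business_links(html):
    """Return the href of every a.business-name found inside a div.info."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for each_business in soup.find_all("div", class_="info"):
        # The href lives on the nested anchor, not on the div itself.
        anchor = each_business.find("a", class_="business-name")
        if anchor is None:
            continue  # this listing has no business-name link
        href = anchor.get("href")  # .get() returns None if the attribute is missing
        if href:
            links.append(href)
    return links


business_links = extract_business_links(SAMPLE_HTML)
print(business_links)
```

Using `.get("href")` rather than `anchor["href"]` avoids a `KeyError` when an anchor has no `href`. The `None` check on `anchor` covers listings without a business-name link.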

python beautifulsoup

Kam*_*ish

Score: 5 | Answers: 1 | Views: 4880

BeautifulSoup: run code only if a class exists

Is there a way to have BeautifulSoup look for a class and run the script only if it exists? I am trying this:

if soup.find_all("div", {"class": "info"}) == True:
    print("Tag Found")


I also tried the following, but it didn't work and raised an error about too many attributes:

if soup.has_attr("div", {"class": "info"})
    print("Tag Found")

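For context on the two attempts: `find_all` returns a list-like `ResultSet`, and a list never compares equal to `True`, so the first condition is always false. `has_attr` takes a single attribute name and checks the tag it is called on, not its descendants. One possible approach, sketched here with a hypothetical helper `has_div_with_class`, is to test whether `find` returned something other than `None`:

```python
from bs4 import BeautifulSoup


def has_div_with_class(html, class_name):
    """Return True if the document contains at least one <div> with the given class."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns the first matching Tag, or None when nothing matches.
    return soup.find("div", class_=class_name) is not None


found = has_div_with_class('<div class="info">x</div>', "info")
missing = has_div_with_class('<div class="other">x</div>', "info")

if found:
    print("Tag Found")
```

Writing `if soup.find_all("div", class_="info"):` without the `== True` comparison would also work, since an empty `ResultSet` is falsy.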

python if-statement beautifulsoup

Kam*_*ish

Score: 3 | Answers: 1 | Views: 9465

Python 3.5 urllib.request 403 Forbidden error

import urllib.request
import urllib
from bs4 import BeautifulSoup


url = "https://www.brightscope.com/ratings"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "html.parser")

print(soup.title)


I am trying to reach the website above, and the code keeps throwing a 403 Forbidden error.

Any ideas?

C:\Users\jerem\AppData\Local\Programs\Python\Python35-32\python.exe "C:/Users/jerem/PycharmProjects/webscraper/url scraper.py"
Traceback (most recent call last):
  File "C:/Users/jerem/PycharmProjects/webscraper/url scraper.py", line 7, in <module>
    page = urllib.request.urlopen(url)
  File "C:\Users\jerem\AppData\Local\Programs\Python\Python35-32\lib\urllib\ …
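A common cause of a 403 from `urlopen` is that the server rejects the default `Python-urllib/3.x` User-Agent. Whether that applies to this site is an assumption, and the site may block automated clients for other reasons. A minimal sketch that sends a browser-like header (`BROWSER_USER_AGENT`, `build_request`, and `fetch_html` are hypothetical names, and the header value is only illustrative):

```python
import urllib.request

# Assumed value: any browser-like User-Agent string; this one is illustrative.
BROWSER_USER_AGENT = "Mozilla/5.0"


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Build a Request that overrides urllib's default User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Fetch the page body as bytes. This makes a real network call and may still fail."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read()


request = build_request("https://www.brightscope.com/ratings")
# urllib stores header names capitalized, hence "User-agent" here.
print(request.get_header("User-agent"))
```

The bytes returned by `fetch_html(url)` can then be passed to `BeautifulSoup(..., "html.parser")` as in the question's code.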

urllib beautifulsoup http-status-code-403 python-3.x

Kam*_*ish

Score: 2 | Answers: 1 | Views: 7062

Tag statistics

beautifulsoup ×3

python ×2

http-status-code-403 ×1

if-statement ×1

python-3.x ×1

urllib ×1
