Fel*_*ipe 8 python selenium beautifulsoup web-crawler web-scraping
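I'm trying to get the value a website produces after a button is clicked.
This is the site: https://www.4devs.com.br/gerador_de_cpf
As you can see, there is a button labeled "Gerar CPF"; clicking it makes a number appear.
My current script opens the browser and reads the value, but it reads the page before the click, so the value is empty. Is it possible to get the value after the button has been clicked?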
from selenium import webdriver
from bs4 import BeautifulSoup
from requests import get
url = "https://www.4devs.com.br/gerador_de_cpf"
def open_browser():
    driver = webdriver.Chrome("/home/felipe/Downloads/chromedriver")
    driver.get(url)
    driver.find_element_by_id('bt_gerar_cpf').click()
def get_cpf():
    response = get(url)
    page_with_cpf = BeautifulSoup(response.text, 'html.parser')
    cpf = page_with_cpf.find("div", {"id": "texto_cpf"}).text
    print("The value is: " + cpf)
open_browser()
get_cpf()
You don't need requests or BeautifulSoup.
from selenium import webdriver
from time import sleep
url = "https://www.4devs.com.br/gerador_de_cpf"
def get_cpf():
    driver = webdriver.Chrome("/home/felipe/Downloads/chromedriver")
    driver.get(url)
    driver.find_element_by_id('bt_gerar_cpf').click()
    # Fixed pause to give the page time to generate the value
    sleep(10)
    # Read the result from the same browser session that clicked the button
    text = driver.find_element_by_id('texto_cpf').text
    print(text)
get_cpf()
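open_browser and get_cpf have absolutely nothing to do with each other: get_cpf downloads a fresh copy of the page with requests, so it never sees the click made in the browser.
In fact, you don't need get_cpf at all. Just wait for the text after clicking the button: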
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait as wait
url = "https://www.4devs.com.br/gerador_de_cpf"
def open_browser():
    driver = webdriver.Chrome("/home/felipe/Downloads/chromedriver")
    driver.get(url)
    driver.find_element_by_id('bt_gerar_cpf').click()
    text_field = driver.find_element_by_id('texto_cpf')
    # Poll for up to 10 s until the 'Gerando...' placeholder is gone and the field is non-empty
    text = wait(driver, 10).until(lambda driver: text_field.text != 'Gerando...' and text_field.text)
    return text
print(open_browser())
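To make the waiting step less of a black box: WebDriverWait.until keeps calling the function it is given until that function returns something truthy, and gives up after the timeout. The sketch below imitates that loop in plain Python, with no browser involved. The names wait_until, FakeTextField and text_when_ready are hypothetical, invented for this illustration, and the CPF string is only a placeholder value.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value, then return it.

    Raises TimeoutError if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(poll_interval)


class FakeTextField:
    """Hypothetical stand-in for the 'texto_cpf' element.

    The first few reads of .text return the loading placeholder,
    later reads return the final text.
    """

    def __init__(self, final_text, reads_before_ready=3):
        self._final_text = final_text
        self._reads_left = reads_before_ready

    @property
    def text(self):
        if self._reads_left > 0:
            self._reads_left -= 1
            return "Gerando..."
        return self._final_text


def text_when_ready(field):
    """Falsy while the field is loading or empty, otherwise the field's text."""
    current = field.text  # read once so the check and the returned value agree
    return current != "Gerando..." and current


if __name__ == "__main__":
    field = FakeTextField("123.456.789-09")  # placeholder value, not real output
    print(wait_until(lambda: text_when_ready(field), timeout=2.0, poll_interval=0.01))
```

The point of returning the text itself rather than True is that the wait hands back the value as soon as it is available, so no fixed sleep is needed.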
Update
The same thing with requests:
import requests
url = 'https://www.4devs.com.br/ferramentas_online.php'
data = {'acao': 'gerar_cpf', 'pontuacao': 'S'}
response = requests.post(url, data=data)
print(response.text)
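For anyone who would rather not install requests, the same form-encoded POST can probably be assembled with the standard library alone. This is only a sketch: the URL and form fields are the ones from the snippet above, while build_cpf_request and fetch_cpf are hypothetical helper names, and the UTF-8 decoding of the response is an assumption. Whether the endpoint still answers this way, or expects extra headers, is not something this sketch verifies.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

URL = "https://www.4devs.com.br/ferramentas_online.php"


def build_cpf_request():
    """Build (but do not send) the POST request with the form fields from the answer."""
    fields = {"acao": "gerar_cpf", "pontuacao": "S"}
    # Passing data= to requests.post form-encodes a dict; urlencode does the same here
    body = urlencode(fields).encode("ascii")
    return Request(
        URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_cpf(timeout=10):
    """Send the request and return the response body as text (UTF-8 is assumed)."""
    with urlopen(build_cpf_request(), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

Calling print(fetch_cpf()) would perform the actual network request; build_cpf_request on its own only prepares it, which makes the encoded body easy to inspect.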