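I am trying to scrape all the links from a web page. I am using Selenium WebDriver to scroll and to click the "Load more" button on the page. The code I am trying is shown below: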
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import ElementNotVisibleException
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup

def fetch_links(url):
    chrome_path = r"D:\nishant_pc_d_drive\nishant_pc\d_drive\update_engine\myntra_update\chromedriver.exe"
    driver = webdriver.Chrome(chrome_path)
    driver.get(url)
    while True:
        try:
            scrollcount=1
            while scrollcount<5:
                driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
                WebDriverWait(driver, 5)
                scrollcount+=1
            WebDriverWait(driver, 10).until(EC.presence_of_element_located(driver.find_elements_by_css_selector('.load_more .sbt-button, .load_more_order .sbt-button')))
            driver.find_element_by_id("loadmore").click()
        except (ElementNotVisibleException,NoSuchElementException) as e:
            print "done"
    x = driver.page_source
    soup2 = BeautifulSoup(x, 'html.parser')
    linkcount=0
    for each in soup2.find_all('a',attrs={"class":"thumb searchUrlClass"}):
        print "https://www.shoppersstop.com/"+each.get('href')
        linkcount+=1
    print linkcount
# thumb …

I have several documents in MongoDB, and the document structure looks like this:
{
"a":"abc",
"myid":2
}
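I want to update "myid" in all the documents in increments of 1. For example, the first document should have myid = 1, the second myid = 2, and so on. Is there a query for this?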
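One way to picture the renumbering described above is a loop that issues one `$set` update per document. This is only a minimal sketch, assuming that "first" and "second" mean ascending `_id` order and that all `_id` values are of one comparable type. `renumber_myid` and `FakeCollection` are hypothetical names introduced here; the stand-in class exists only so the example runs without a database server, and it mirrors the `find(filter, projection)` and `update_one(filter, update)` calls of a pymongo collection.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo-style collection."""

    def __init__(self, docs):
        self.docs = docs

    def find(self, query, projection=None):
        # The stand-in ignores the query and projection and returns everything.
        return list(self.docs)

    def update_one(self, query, update):
        # Apply the "$set" part of the update to the first document with this _id.
        for doc in self.docs:
            if doc["_id"] == query["_id"]:
                doc.update(update["$set"])
                return


def renumber_myid(collection, start=1):
    """Set myid to start, start + 1, ... in ascending _id order (assumed order)."""
    # Collect the ids first so the updates cannot disturb the iteration.
    ids = sorted(doc["_id"] for doc in collection.find({}, {"_id": 1}))
    for offset, doc_id in enumerate(ids):
        collection.update_one({"_id": doc_id}, {"$set": {"myid": start + offset}})
    return len(ids)


docs = [
    {"_id": 10, "a": "abc", "myid": 2},
    {"_id": 11, "a": "abc", "myid": 2},
    {"_id": 12, "a": "abc", "myid": 2},
]
updated = renumber_myid(FakeCollection(docs))
print([doc["myid"] for doc in docs])  # [1, 2, 3]
```

With a real pymongo collection, the same function could be called as `renumber_myid(collection)`. For a large collection, batching the updates would probably be preferable to one round trip per document.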
It gives this error:
Traceback (most recent call last):
  File "p3.py", line 21, in <module>
    WebDriverWait(driver, timex).until(EC.presence_of_element_located(by, element))
TypeError: __init__() takes exactly 2 arguments (3 given)
I have not used __init__() anywhere, so why does this error occur?
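A likely reading of the traceback: the message suggests that `presence_of_element_located` is a class in the Selenium version used here, so calling it runs its `__init__()` behind the scenes. Selenium's expected conditions take a single locator tuple `(by, value)`; passing `by` and `element` as two separate arguments gives the constructor three arguments once `self` is counted. The sketch below reproduces the pattern without Selenium. `PresenceSketch` is a hypothetical stand-in class and `"some-id"` is a placeholder value, not taken from the script.

```python
class PresenceSketch:
    """Hypothetical stand-in with the same constructor shape: one locator tuple."""

    def __init__(self, locator):
        self.locator = locator


by, element = "id", "some-id"  # placeholder values for illustration

# Two separate arguments: Python counts self + by + element = 3 arguments.
try:
    PresenceSketch(by, element)
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# One tuple argument: the constructor receives self + locator = 2 arguments.
condition = PresenceSketch((by, element))
print(condition.locator)
```

Under this reading, the failing line in the script would use doubled parentheses: `EC.presence_of_element_located((by, element))`.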
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
import time
chrome_path=r"C:\Users\Bhanwar\Desktop\New folder (2)\chromedriver.exe"
driver =webdriver.Chrome(chrome_path)
driver.get("https://priceraja.com/mobile/pricelist/samsung-mobile-price-list-in-india")
#driver.implicitly_wait(10)
i=0
timex = 5
by = By.ID
hook = "product-itmes-" # The id of one item, they seems to be this plus
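# …

I have some data in MongoDB, and I want to retrieve all the values of one key, "category", using Python code. I have tried several approaches, but in each case I had to supply the "value" to retrieve. Any suggestions would be appreciated.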
{
    id: "my_id1",
    tags: [tag1, tag2, tag3],
    category: "movie",
},
{
    id: "my_id2",
    tags: [tag3, tag6, tag9],
    category: "tv",
},
{
    id: "my_id3",
    tags: [tag2, tag6, tag8],
    category: "movie",
}
I want the output to be:
category: "movie"
category: "tv"
category: "movie"
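One possible approach to the output above is to query with an empty filter, which matches every document, and a projection that keeps only the key of interest, so no value has to be supplied. This is a minimal sketch under stated assumptions: `values_for_key` and `FakeCollection` are hypothetical names, and the stand-in class only imitates the `find(filter, projection)` call of a pymongo collection so the example runs without a server.

```python
class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo-style collection."""

    def __init__(self, docs):
        self.docs = docs

    def find(self, query, projection):
        # Imitates an inclusion projection: keep only the fields flagged with 1.
        wanted = [field for field, flag in projection.items() if flag]
        return [{f: doc[f] for f in wanted if f in doc} for doc in self.docs]


def values_for_key(collection, key):
    """Return the value stored under `key` in every document that has it."""
    # {} matches all documents; the projection drops _id and keeps only `key`.
    cursor = collection.find({}, {key: 1, "_id": 0})
    return [doc[key] for doc in cursor if key in doc]


coll = FakeCollection([
    {"id": "my_id1", "tags": ["tag1", "tag2", "tag3"], "category": "movie"},
    {"id": "my_id2", "tags": ["tag3", "tag6", "tag9"], "category": "tv"},
    {"id": "my_id3", "tags": ["tag2", "tag6", "tag8"], "category": "movie"},
])
categories = values_for_key(coll, "category")
for value in categories:
    print('category: "%s"' % value)
```

If only the unique values were needed, pymongo's `distinct("category")` would be an alternative, but it would collapse the two "movie" entries into one.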