Related questions (0)

Dynamic dropdown doesn't populate with auto suggestions on https://www.nseindia.com/ when values are passed using Selenium and Python

from selenium import webdriver

# Selenium 3 style calls, as written in the question
driver = webdriver.Chrome('C:/Workspace/Development/chromedriver.exe')
driver.get('https://www.nseindia.com/companies-listing/corporate-filings-actions')
inputbox = driver.find_element_by_xpath('/html/body/div[7]/div[1]/div/section/div/div/div/div/div/div[1]/div[1]/div[1]/div/span/input[2]')
inputbox.send_keys("Reliance")


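I am trying to scrape a table from this website; the table appears once you enter a company name in the text field above it. The attached code block works well with similar dropdowns on ordinary Google search and on the Wolfram website. But when I run my script on the target website, where it essentially just enters the required text in the text field, the dropdown shows 'No records found', whereas it works fine when done manually.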
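Autosuggest widgets often debounce keystrokes or issue a request per key, so one experiment is to send the text one character at a time with a short pause instead of in a single `send_keys` call. The helper below is a minimal sketch of that idea. `type_slowly` and its `delay` parameter are hypothetical names, and the only assumption about the element is that it exposes Selenium's `send_keys` method. The question's `akamai` tag suggests bot detection may be involved, in which case slower typing alone may not change the result.

```python
import time


def type_slowly(element, text, delay=0.2):
    """Send `text` to `element` one character at a time, pausing between keys."""
    for ch in text:
        element.send_keys(ch)
        time.sleep(delay)


# Usage with the question's driver (not executed here):
#   inputbox = driver.find_element_by_xpath(...)  # same locator as in the question
#   type_slowly(inputbox, "Reliance")
```

The pause length is a guess; it would need tuning against the page's actual behaviour.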

python selenium akamai autosuggest web-scraping

Pra*_*ari

2020-06-19

Score: 3

Answers: 1

Views: 1123

Access Denied page with headless Chrome on Linux, while headed Chrome works on Windows, using Selenium through Python

I have this code, which I use on my local machine:

from selenium import webdriver
chrom_path = r"C:\Users\user\sof\chromedriver_win32\chromedriver.exe"
driver = webdriver.Chrome(chrom_path)
link = 'https://www.google.com/'
driver.get(link)
s = driver.page_source
print((s.encode("utf-8")))
driver.quit()


This code returns the page source of the website, but when I use the following code on a Linux server (CentOS 7):

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
options.add_argument('--no-sandbox')
driver = webdriver.Chrome(executable_path="/usr/local/bin/chromedriver", chrome_options=options)
driver.get("https://www.google.com")
s = driver.page_source
print((s.encode("utf-8")))
driver.quit()


This code should also return the page source, but instead it returns:

b'<html><head>\n<title>Access Denied</title>\n</head><body>\n<h1>Access Denied</h1>\n \nYou don\'t have permission to access "http://www.newark.com/" on this server.<p>\nReference #18.456cd417.1576243477.e007b9f\n\n\n</p></body></html>'


Does anyone know why the same code behaves differently on different operating systems?
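The question is tagged `user-agent`, and headless Chrome at the time of this question reported a user agent string containing `HeadlessChrome`, which some servers reject. The sketch below assumes that this is the difference between the two runs, which the source does not confirm. It keeps the question's Chrome switches and adds Chrome's `--user-agent` switch. The helper names `looks_headless`, `chrome_arguments` and `make_driver` are hypothetical, and the driver path is the one from the question.

```python
def looks_headless(user_agent):
    """Return True if the user agent string advertises headless Chrome."""
    return "HeadlessChrome" in user_agent


def chrome_arguments(user_agent):
    """Build the Chrome switches from the question plus a user-agent override."""
    return [
        "--headless",
        "--disable-gpu",
        "--no-sandbox",
        f"--user-agent={user_agent}",
    ]


def make_driver(user_agent, driver_path="/usr/local/bin/chromedriver"):
    """Create a headless driver as in the question (Selenium 3 style API)."""
    # Imported lazily so the two helpers above can be used without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in chrome_arguments(user_agent):
        options.add_argument(arg)
    return webdriver.Chrome(executable_path=driver_path, options=options)
```

To see what the browser actually reports, `driver.execute_script("return navigator.userAgent")` can be passed to `looks_headless` before and after applying the override.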

python selenium user-agent selenium-chromedriver google-chrome-headless

Author

2019-12-14

Score: 1

Answers: 1

Views: 3970

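Unable to get the titles of different jobs from a webpage

I have written a script in Python using Selenium to get the titles of different jobs from a webpage, traversing multiple pages. When I run the script, I notice that Selenium fails to open that webpage. However, I can easily view the page's content by opening the link manually in Internet Explorer or Chrome.

Link to the webpage  # Make sure to refresh the page if you can't see the content

I've tried: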

from bs4 import BeautifulSoup
from selenium import webdriver

URL = 'https://www.alljobs.co.il/SearchResultsGuest.aspx?page=1&position=235,330,320,236,1541&type=&city=&region='

with webdriver.Chrome() as driver:
    driver.get(URL)
    soup = BeautifulSoup(driver.page_source,'lxml')

    while True:
        for item in soup.select('[class="job-content-top"]'):
            title = item.select_one('.job-content-top-title a[title]')
            print(title)

        try:
            next_page = driver.find_element_by_css_selector('.jobs-paging-next > a').click()
            soup = BeautifulSoup(driver.page_source,'lxml')
        except Exception:
            break


I even tried it like this, but that did not work either (cookies collected from the browser):

from bs4 import BeautifulSoup
from selenium import webdriver

URL = 'https://www.alljobs.co.il/SearchResultsGuest.aspx?page=1&position=235,330,320,236,1541&type=&city=&region='

cookie = "_ga=GA1.3.1765365490.1582505881; _gid=GA1.3.568643527.1582505881; _fbp=fb.2.1582505881473.1930545410; _hjid=619e3a88-ee5a-43ca-8a0b-e70b063dcf84; BlockerDisplay=; DiplayPopUpSalarySurvey=; OB-USER-TOKEN=390dca4f-08d0-4f54-bce5-00e7e6aa3e39; LPVID=dkY2EwOTNmZTA4YTM1MDI1; …
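
Selenium's `add_cookie` takes one cookie at a time as a dict with `name` and `value` keys, and the browser must already be on a page of the matching domain. The sketch below shows one way a `Cookie`-header-style string could be split into such dicts. `parse_cookie_header` is a hypothetical helper, the sample uses only three of the cookies visible in the question, and whether replaying browser cookies helps with this site is not established by the source.

```python
def parse_cookie_header(header):
    """Split 'a=1; b=2' into [{'name': 'a', 'value': '1'}, ...]."""
    cookies = []
    for part in header.split(";"):
        part = part.strip()
        if not part or "=" not in part:
            continue  # skip empty or malformed fragments
        name, value = part.split("=", 1)
        cookies.append({"name": name.strip(), "value": value.strip()})
    return cookies


# Sample built from cookies shown in the question (the full string is truncated there).
sample = "_ga=GA1.3.1765365490.1582505881; _gid=GA1.3.568643527.1582505881; BlockerDisplay="
cookies = parse_cookie_header(sample)

# With a live driver (not executed here):
#   driver.get(URL)            # visit the domain first
#   for c in cookies:
#       driver.add_cookie(c)
#   driver.refresh()
```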


python selenium beautifulsoup web-scraping python-3.x

MIT*_*THU

2020-02-26

Score: 1

Answers: 1

Views: 282