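I must have mismatched versions somewhere, because I can't get Selenium to launch the Firefox web browser from Python. I'm using an older version of Firefox because other people here have the same older version of Python, and for them the older Firefox works best.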
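When a version mismatch is suspected, it can help to print the versions actually in use before comparing them against the geckodriver release notes. The following is only a sketch under stated assumptions: it assumes Python 3 (for `shutil.which`), that `geckodriver` and `firefox` are on `PATH`, and that both accept `--version` (on some platforms the Firefox binary may not print anything). The helper names `tool_version` and `selenium_version` are hypothetical.

```python
import shutil
import subprocess


def tool_version(binary):
    """Return the first line of `<binary> --version`, or None if unavailable."""
    path = shutil.which(binary)
    if path is None:
        return None  # binary is not on PATH
    try:
        completed = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None  # could not be executed, or timed out
    text = completed.stdout.decode("utf-8", "replace").strip()
    return text.splitlines()[0] if text else None


def selenium_version():
    """Return the installed selenium version string, or None if not installed."""
    try:
        import selenium
    except ImportError:
        return None
    return getattr(selenium, "__version__", None)


if __name__ == "__main__":
    print("selenium:", selenium_version())
    for name in ("geckodriver", "firefox"):
        print(name + ":", tool_version(name))
```

A `None` in the output points at a missing or non-executable component rather than at a version conflict.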
Code:
from selenium import webdriver
from selenium import common
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
driver=webdriver.Firefox(capabilities=DesiredCapabilities.FIREFOX)
Error:
Traceback (most recent call last):
  File "scrapeCommunitySelenium.py", line 13, in <module>
    driver=webdriver.Firefox(capabilities=DesiredCapabilities.FIREFOX)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/firefox/webdriver.py", line 158, in __init__
    keep_alive=True)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 154, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "/Library/Python/2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 243, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
…

Good day. I've done some searching here and on Google, but haven't found a solution to this problem yet.
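The scenario is:
I have a Python (2.7) script that loops over many URLs (think Amazon pages, scraping reviews). Every page has the same HTML layout and only the scraped information differs. I use Selenium with a headless browser because these pages need to execute JavaScript to produce the information.
I run this script on my local machine (OS X 10.10). Firefox is the latest, v59. Selenium is version 3.11.0, with geckodriver v0.20.
Locally the script has no problems: it runs through all the URLs and scrapes the pages fine.
Now, when I put the script on my server, the only difference is that the server runs Ubuntu 16.04 (32-bit). I use the appropriate geckodriver (still v0.20), and everything else is the same (Python 2.7, Selenium 3.11). It seems to crash the headless browser at random, after which every browserObj.get('url...') call stops working.
The error message says:
Message: Failed to decode response from marionette
Any further Selenium page requests return the error:
Message: Tried to run command without establishing a connection
To show some code:
When I create the driver: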
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.set_headless(headless=True)  # Selenium 3.x API
driver = webdriver.Firefox(
    firefox_options=options,
    executable_path=config.GECKODRIVER
)
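driver is passed as an argument to the script's functions as browserObj, which is then used to load a specific page; once the page has loaded, its source is handed to BeautifulSoup for parsing: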
browserObj.get(url)
soup = BeautifulSoup(browserObj.page_source, 'lxml')
The error seems to point to the BeautifulSoup line as the one that crashes the browser.
What could be causing this, and what can I do to fix it?
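Because every command after the first failure also fails, one possible workaround (not a fix for the underlying crash) is to discard the dead driver and start a fresh one before retrying the URL. The sketch below is an assumption-laden illustration: `scrape_all`, `make_driver` and `parse` are hypothetical names, and the driver is only assumed to expose `get`, `page_source` and `quit` as Selenium drivers do.

```python
def scrape_all(urls, make_driver, parse, max_restarts=3, retry_on=(Exception,)):
    """Fetch each URL, replacing the driver whenever a command fails.

    make_driver: zero-argument callable returning a new driver (hypothetical hook).
    parse: callable that takes the page source and returns a result.
    retry_on: exception types that trigger a driver restart.
    """
    results = {}
    driver = make_driver()
    restarts = 0
    try:
        for url in urls:
            while True:
                try:
                    driver.get(url)
                    results[url] = parse(driver.page_source)
                    break  # this URL is done, move on to the next one
                except retry_on:
                    if restarts >= max_restarts:
                        raise  # give up instead of looping forever
                    restarts += 1
                    try:
                        driver.quit()  # the session may already be gone
                    except Exception:
                        pass
                    driver = make_driver()
    finally:
        try:
            driver.quit()
        except Exception:
            pass
    return results
```

With Selenium, `retry_on` would typically be `(WebDriverException,)` from `selenium.common.exceptions`, `make_driver` would wrap the `webdriver.Firefox(...)` call shown above, and `parse` could be `lambda html: BeautifulSoup(html, 'lxml')`.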
Edit: adding the stack trace, which points to the same thing:
Traceback (most recent call last):
  File "main.py", line 164, in <module>
    getLeague
  File "/home/ps/dataparsing/XXX/yyy.py", line 48, in BBB
    soup = BeautifulSoup(browserObj.page_source, 'lxml')
  File "/home/ps/AAA/projenv/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 670, in page_source
    return self.execute(Command.GET_PAGE_SOURCE)['value']
  File "/home/ps/AAA/projenv/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "/home/ps/AAA/projenv/local/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", …

Is there a way to make your Selenium script undetectable in Python when using geckodriver?
I'm using Selenium for scraping. Do we need to use any protective measures to make Selenium undetectable to websites?

For some reason, the following error appears only when I open a nested webdriver instance. I have no idea what is going on here.
I'm using Windows 10, geckodriver 0.21.0 and Python 3.7.
ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine
Part of the script that works fine:
tab_backers = ff.find_element_by_xpath('//a[@gogo-test="backers_tab"]')
try:
    funding_backers_count = int(''.join(filter(str.isdigit, str(tab_backers.text))))
except ValueError:
    funding_backers_count = 0
if funding_backers_count > 0:
    tab_backers.click()
    see_more_backers = WebDriverWait(ff, 10).until(
        EC.element_to_be_clickable((By.XPATH, '//ui-view//a[text()="See More Backers"]'))
    )
    clicks = 0
    while clicks < 0:
        clicks += 1
        ff.WebDriverWait(ff, 5).until(
            see_more_backers.click()
        )
    for container in ff.find_elements_by_xpath('//ui-view//div[@class="campaignBackers-pledge ng-scope"]'):
        backers_profile = container.find_elements_by_xpath('./*/div[@class="campaignBackers-pledge-backer-details"]/a')
        if …

I've run into a problem with the Actions class. I have this code:
Actions act = new Actions(d1);
act.moveToElement(d1.findElement(By.xpath("path of the element"))).build().perform();
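Previously, when I used Selenium-Java 2.43.0, this command worked fine. But I upgraded to 3.0.0-beta2 and started accessing the Firefox driver through geckodriver.
With the command above, my test now fails. I get the following exception:
org.openqa.selenium.UnsupportedCommandException: POST /session/21dfc828-a382-4622-8c61-89bc48e29744/moveto did not match a known command (WARNING: The server did not provide any stacktrace information) Command duration or timeout: 4 milliseconds
Please help.

The Selenium 3.0 Firefox driver fails with org.openqa.selenium.SessionNotCreatedException: Unable to create new remote session.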
System.setProperty("webdriver.gecko.driver", "..<Path>../geckodriver.exe");
capabilities = DesiredCapabilities.firefox();
capabilities.setCapability("marionette", true);
driver = new FirefoxDriver(capabilities);
Caused by: org.openqa.selenium.SessionNotCreatedException: Unable to create new remote session. desired capabilities = Capabilities [{marionette=true, firefoxOptions=org.openqa.selenium.firefox.FirefoxOptions@23aa363a, browserName=firefox, moz:firefoxOptions=org.openqa.selenium.firefox.FirefoxOptions@23aa363a, version=, platform=ANY}], required capabilities = Capabilities [{}]
Build info: version: '3.0.0', revision: '350cf60', time: '2016-10-13 10:48:57 -0700'
System info: host: 'D202540', ip: '10.22.19.193', os.name: 'Windows 7', os.arch: 'amd64', os.version: '6.1', java.version: '1.8.0_45'
Driver info: driver.version: FirefoxDriver
at org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:91)
at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:141)
at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:82)
at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:601)
at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:241)
at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:128)
at …

I am using:
I tried
FirefoxProfile profile = new FirefoxProfile();
profile.setPreference("webdriver.log.browser.ignore", true);
profile.setPreference("webdriver.log.driver.ignore", true);
profile.setPreference("webdriver.log.profiler.ignore", true);
FirefoxDriver driver = new FirefoxDriver();
and
LoggingPreferences preferences = new LoggingPreferences();
preferences.enable(LogType.BROWSER, Level.OFF);
preferences.enable(LogType.CLIENT, Level.OFF);
preferences.enable(LogType.DRIVER, Level.OFF);
preferences.enable(LogType.PERFORMANCE, Level.OFF);
preferences.enable(LogType.SERVER, Level.OFF);
DesiredCapabilities capabilities = DesiredCapabilities.firefox();
capabilities.setCapability(CapabilityType.LOGGING_PREFS, preferences);
FirefoxDriver driver = new FirefoxDriver(capabilities);
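Neither of these approaches did anything to stop the logging. In case it helps, here is the console output:
For those wondering: I have log4j 1.2.17 in my pom.xml, but no log4j.properties or log4j.xml, and I don't use it at all.
To clarify: when I say logging, I mean the console output in IntelliJ IDEA. I'm using Java.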
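If the driver's output cannot be silenced at the source, a fallback is to filter it after capture. This is only a sketch and does not turn logging off: it assumes the output has been redirected to a file or pipe, and that geckodriver/Firefox lines follow the shape `<epoch-ms> <source> <LEVEL> <message>`, as in `1484653905833 geckodriver INFO Listening on 127.0.0.1:15106`. The name `filter_log` is hypothetical, and lines that do not match the assumed shape are passed through unchanged.

```python
import re

# Log levels from least to most severe.
_LEVELS = ["TRACE", "DEBUG", "CONFIG", "INFO", "WARN", "ERROR", "FATAL"]
_RANK = {name: rank for rank, name in enumerate(_LEVELS)}

# Assumed line shape: "<epoch-ms> <source> <LEVEL> <message>".
_LOG_LINE = re.compile(r"^\d+\s+\S+\s+(" + "|".join(_LEVELS) + r")\s")


def filter_log(lines, min_level="WARN"):
    """Yield lines at or above min_level; unrecognised lines are kept as-is."""
    threshold = _RANK[min_level]
    for line in lines:
        match = _LOG_LINE.match(line)
        if match is None or _RANK[match.group(1)] >= threshold:
            yield line
```

For example, `list(filter_log(open("driver.log")))` would drop the INFO and DEBUG lines from a captured log, where `driver.log` is a placeholder file name.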

I need to turn off Marionette/GeckoDriver logging; is there a way to do that? I've been searching but haven't found a proper answer. The INFO log is:
1484653905833 geckodriver INFO Listening on 127.0.0.1:15106
Jan 17, 2017 5:21:46 PM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Attempting bi-dialect session, assuming Postel's Law holds true on the remote end
1484653906715 mozprofile::profile INFO Using profile path C:\Users\vtiger\AppData\Local\Temp\3\rust_mozprofile.7d2LEwDKoE8J
1484653906720 geckodriver::marionette INFO Starting browser C:\Program Files\Mozilla Firefox\firefox.exe
1484653906731 geckodriver::marionette INFO Connecting to Marionette on localhost:58602
1484653908388 addons.manager DEBUG Application has been upgraded
1484653908843 addons.manager DEBUG Loaded provider scope for resource://gre/modules/addons/XPIProvider.jsm: ["XPIProvider"]
1484653908846 addons.manager DEBUG Loaded provider scope for resource://gre/modules/LightweightThemeManager.jsm: ["LightweightThemeManager"]
1484653908852 addons.manager DEBUG Loaded …

I'm on macOS Sierra and running Python 3.5.2. I installed selenium and am following a book called "Automate the Boring Stuff with Python".
My code is
>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
I keep getting this error:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/common/service.py", line 64, in start
    stdout=self.log_file, stderr=self.log_file)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'geckodriver'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    browser = webdriver.Firefox()
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/selenium/webdriver/firefox/webdriver.py", line 135, in …

I don't fully understand the difference between geckodriver and Marionette.
For example, when I use Selenium WebDriver to control the Firefox browser, I need a geckodriver binary that listens for Selenium's WebDriver protocol.
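geckodriver is a separate executable that Selenium starts as a subprocess; it translates WebDriver commands into Marionette, the remote protocol built into Firefox. That is why a missing binary surfaces as `FileNotFoundError: [Errno 2] No such file or directory: 'geckodriver'`. A minimal sketch for checking where the binary is before creating the driver, assuming Python 3; `find_geckodriver` and its `extra_dirs` parameter are hypothetical names:

```python
import os
import shutil


def find_geckodriver(extra_dirs=()):
    """Return the full path to geckodriver, or None if it cannot be found.

    PATH is searched first, then any directories given in extra_dirs.
    """
    found = shutil.which("geckodriver")
    if found is None and extra_dirs:
        # shutil.which accepts a PATH-style string via its `path` argument.
        found = shutil.which("geckodriver", path=os.pathsep.join(extra_dirs))
    return found


if __name__ == "__main__":
    location = find_geckodriver()
    if location is None:
        print("geckodriver not found; add its directory to PATH")
    else:
        print("geckodriver found at", location)
```

With Selenium 3.x, the returned path could be passed as `executable_path` to `webdriver.Firefox`, as in the headless example earlier on this page.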