Nem*_* Ga 10 python phantomjs selenium-webdriver
When I use a proxy with PhantomJS, it uses the default Python user agent.
Running on Ubuntu 14.04 with Python 3.5.1:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Fragment from inside a class method; self.proxy holds the proxy settings.
service_args = []
if self.proxy:
    service_args.extend([
        '--proxy={}:{}'.format(self.proxy.host, self.proxy.port),
        '--proxy-type={}'.format(self.proxy.proto),
    ])
    if self.proxy.username and self.proxy.password:
        service_args.append(
            '--proxy-auth={}:{}'.format(self.proxy.username, self.proxy.password)
        )

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 "
    "(KHTML, like Gecko) Chrome/15.0.87"
)
self.webdriver = webdriver.PhantomJS(service_args=service_args, desired_capabilities=dcap)
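The argument-building logic above can also be pulled into a small standalone function, which makes it easy to inspect the exact flags handed to PhantomJS. This is only a sketch: `build_proxy_args` and its parameter names are hypothetical, and only the `--proxy`, `--proxy-type`, and `--proxy-auth` flags come from the code above.

```python
def build_proxy_args(host, port, proto="http", username=None, password=None):
    """Return PhantomJS command-line flags for the given proxy settings.

    Hypothetical helper: only the flag names are taken from the question.
    """
    args = [
        "--proxy={}:{}".format(host, port),
        "--proxy-type={}".format(proto),
    ]
    # Credentials are added only when both parts are present.
    if username and password:
        args.append("--proxy-auth={}:{}".format(username, password))
    return args


if __name__ == "__main__":
    # proxy.example.com is a placeholder host, not a real proxy.
    print(build_proxy_args("proxy.example.com", 8080))
```

The returned list can be passed as `service_args` to `webdriver.PhantomJS`, exactly as in the fragment above.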
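And the error:
Message: Error Message => 'Unable to find element with css selector '#navcnt td.cur'' caused by Request => {"headers":{"Accept":"application/json","Accept-Encoding":"identity","Connection":"close","Content-Length":"105","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:39281","User-Agent":"Python-urllib/3.5"} ...
A similar question concluded that the problem is caused by the proxy provider setting the user agent at the server level, but I doubt that, because I can change the user agent when I use the same proxy with Chrome.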
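One detail worth checking: the request quoted in the error is addressed to `127.0.0.1:39281`, so its headers may describe the local call from the Python client to the driver rather than the traffic PhantomJS sends to the website. A way to see what the page itself reports is to ask the browser directly. The following is a minimal sketch, assuming `driver` is an already started Selenium WebDriver instance; `page_user_agent` is a hypothetical helper name, while `execute_script` is the standard WebDriver method.

```python
def page_user_agent(driver):
    """Return the user agent string as reported inside the loaded page."""
    # navigator.userAgent is evaluated by the browser, not by the Python client.
    return driver.execute_script("return navigator.userAgent")
```

If the capability took effect, calling `page_user_agent(self.webdriver)` should return the configured `Mozilla/5.0 ...` string instead of `Python-urllib/3.5`.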
This worked for me:
In my case, I took a closer look at the capabilities of the PhantomJS driver:
# js_path is the path to the PhantomJS executable.
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 (KHTML, like Gecko) Chrome/15.0.87"
service_args = [
    '--proxy=5.135.176.41:3123',
    '--proxy-type=http',
]
phantom = webdriver.PhantomJS(js_path, desired_capabilities=dcap, service_args=service_args)
print(phantom.capabilities)
The output is:
{'databaseEnabled': False, 'handlesAlerts': False, 'rotatable': False, 'browserConnectionEnabled': False, 'browserName': 'phantomjs', 'takesScreenshot': True, 'nativeEvents': True, 'locationContextEnabled': False, 'phantomjs.page.settings.userAgent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 (KHTML, like Gecko) Chrome/15.0.87', 'platform': 'linux-unknown-64bit', 'version': '2.1.1', 'applicationCacheEnabled': False, 'driverName': 'ghostdriver', 'webStorageEnabled': False, 'javascriptEnabled': True, 'cssSelectorsEnabled': True, 'proxy': {'proxyType': 'direct'}, 'acceptSslCerts': False, 'driverVersion': '1.2.0'}
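Because the capabilities dictionary is long, a small helper can pull out just the two entries that matter here. This is a sketch only; `summarize_capabilities` is a hypothetical name, and the keys it reads are the ones visible in the output above.

```python
def summarize_capabilities(caps):
    """Pick the user agent and proxy type out of a capabilities dict."""
    proxy = caps.get("proxy") or {}
    return {
        "userAgent": caps.get("phantomjs.page.settings.userAgent"),
        "proxyType": proxy.get("proxyType"),
    }


if __name__ == "__main__":
    # A trimmed copy of the output shown above.
    sample = {
        "browserName": "phantomjs",
        "phantomjs.page.settings.userAgent": (
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 "
            "(KHTML, like Gecko) Chrome/15.0.87"
        ),
        "proxy": {"proxyType": "direct"},
    }
    print(summarize_capabilities(sample))
```

With a live driver, the same function would be called as `summarize_capabilities(phantom.capabilities)`.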
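This means the userAgent is in fact set correctly ('phantomjs.page.settings.userAgent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 (KHTML, like Gecko) Chrome/15.0.87'), but for some reason the proxy I set through service_args was not picked up: the reported proxy capability is still {'proxyType': 'direct'}. Setting the capabilities by hand like this, however, worked very well: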
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/53 (KHTML, like Gecko) Chrome/15.0.87"
phantom = webdriver.PhantomJS(js_path, desired_capabilities=dcap)
phantom.capabilities["acceptSslCerts"] = True
phantom.capabilities["proxy"] = {"proxy": "5.135.176.41:3123",
                                 "proxy-type": "http"}
max_wait = 20
phantom.set_window_size(1024, 768)
phantom.set_page_load_timeout(max_wait)
phantom.set_script_timeout(max_wait)
phantom.get(url)
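A variation that may be worth trying is to put the proxy into the desired capabilities before the session starts, using the proxy object layout from the WebDriver wire protocol (`proxyType`, `httpProxy`). This is an untested sketch rather than a confirmed fix: `with_http_proxy` is a hypothetical helper, and whether GhostDriver honours these fields in a given version is an assumption.

```python
def with_http_proxy(capabilities, host, port):
    """Return a copy of `capabilities` with a manual HTTP proxy entry added."""
    caps = dict(capabilities)  # leave the caller's dict untouched
    caps["proxy"] = {
        "proxyType": "manual",
        "httpProxy": "{}:{}".format(host, port),
    }
    return caps
```

The result would then be passed as `desired_capabilities=with_http_proxy(dcap, "5.135.176.41", 3123)` when constructing `webdriver.PhantomJS`.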
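Thanks for asking this question. I had actually been looking into PhantomJS proxies for quite a while, and this question put me on the right track. I hope this helps!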