I'm trying to log in to a website with Python and the requests library in order to do some scraping. This is what I tried (it doesn't work):
import requests
headers = {'User-Agent': 'Mozilla/5.0'}
payload = {'username':'niceusername','password':'123456'}
In [12]: r = requests.post('https://admin.example.com/login.php',headers=headers,data=payload)
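But nada, I get redirected to the login page. Do I need to open a session? Am I doing the POST request wrong? Do I need to load the cookies? Does a session do that automatically? I'm lost here and could use some help and an explanation.
The site I'm trying to log in to is PHP. Do I need to "capture the set-cookie and set a Cookie header"? If so, I don't know how to do that. The page is a form with the following (if it helps): inputs: 'username', 'password'; 'id': 'myform', 'action': "login.php"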
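To make the "capture set-cookie and set a Cookie header" idea concrete, here is a minimal sketch that uses only Python's standard library. The helper name `cookie_header_from_set_cookie` is made up for illustration, and the header value is of the shape a PHP site typically sends. It says nothing about what this particular site requires.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value):
    """Turn a Set-Cookie response header value into a Cookie request header value."""
    jar = SimpleCookie()
    jar.load(set_cookie_value)
    # Only the name=value pairs are sent back; attributes such as path are dropped.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


if __name__ == "__main__":
    set_cookie = "PHPSESSID=v233mnt4malhed55lrpc5bp8o1; path=/"
    print(cookie_header_from_set_cookie(set_cookie))
```

With requests, a `requests.Session()` object stores the cookies it receives and sends them on later requests made through the same session, so this manual step is usually not needed there.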
Some extra information; maybe you can see what I'm missing here..
In [13]: r.headers
Out[13]: CaseInsensitiveDict({'content-encoding': 'gzip', 'transfer-encoding': 'chunked',
'set-cookie': 'PHPSESSID=v233mnt4malhed55lrpc5bp8o1; path=/',
'expires': 'Thu, 19 Nov 1981 08:52:00 GMT', 'vary': 'Accept-Encoding', 'server': 'nginx',
'connection': 'keep-alive', 'pragma': 'no-cache',
'cache-control': 'no-store, no-cache, must-revalidate, post-check=0, pre-check=0',
'date': 'Tue, 24 Dec 2013 10:50:44 GMT', 'content-type': 'text/html'})
In [14]: r.cookies
Out[14]: <<class 'requests.cookies.RequestsCookieJar'>[Cookie(version=0, name='PHPSESSID',
value='v233mnt4malhed55lrpc5bp8o1', port=None, port_specified=False, domain='admin.example.com',
domain_specified=False, domain_initial_dot=False, path='/', path_specified=True, secure=False,
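expires=None, discard=True, comment=None, comment_url=None, rest={}, …
I'm trying to make a simple GET request with a Requests session, but I keep getting an SSLError for one specific site. I think the problem may be with the site (I scanned it with https://www.ssllabs.com; results below), but I can't be sure because I know nothing about this area :) I'd really like to understand what is going on.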
A solution/explanation would be great, thanks!
Code:
import requests
requests.get('https://www.reporo.com/')
I get the following error:
SSLError: [Errno bad handshake] [('SSL routines', 'SSL3_GET_SERVER_CERTIFICATE', 'certificate verify failed')]
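As background for "certificate verify failed": a verifying TLS client rejects the handshake when it cannot validate the server's certificate chain against the CA certificates it trusts. The sketch below uses only the standard library's `ssl` module (not Requests itself) to show what a default verifying client insists on. `describe_default_tls_policy` is a hypothetical helper name, and the sketch does not diagnose this particular site.

```python
import ssl


def describe_default_tls_policy():
    """Summarise what Python's default client-side TLS context insists on."""
    ctx = ssl.create_default_context()
    return {
        # CERT_REQUIRED: the server must present a certificate that validates.
        "verify_mode": ctx.verify_mode.name,
        # True: the certificate must also match the requested host name.
        "check_hostname": ctx.check_hostname,
        # Number of CA certificates currently loaded into this context.
        "loaded_ca_certs": ctx.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in describe_default_tls_policy().items():
        print(key, "=", value)
```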
---------------------------------------------------------------------------
SSLError Traceback (most recent call last)
<ipython-input-7-cfc21b287fee> in <module>()
----> 1 requests.get('https://www.reporo.com/')
/usr/local/lib/python2.7/dist-packages/requests/api.pyc in get(url, **kwargs)
63
64 kwargs.setdefault('allow_redirects', True)
---> 65 return request('get', url, **kwargs)
66
67
/usr/local/lib/python2.7/dist-packages/requests/api.pyc in request(method, url, **kwargs)
47
48 session = sessions.Session()
---> 49 response = session.request(method=method, url=url, **kwargs)
50 # By explicitly closing the session, we avoid leaving sockets open which
51 # …
I'm trying to change an element's CSS style (for example, from "visibility: hidden;" to "visibility: visible;") using Selenium's execute_script. (Any other way of doing this with Selenium + Python is gladly accepted.)
My code:
driver = webdriver.Firefox()
driver.get("http://www.example.com")
elem = driver.find_element_by_id('copy_link')
elem.execute_script( area of my problem )
What do I need to do in order to manipulate the page's CSS?
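One way this could look, as a sketch: in the Python bindings `execute_script` is a method of the driver rather than of the element, and any extra arguments are exposed to the JavaScript as `arguments[0]`, `arguments[1]`, and so on. The helper name `set_css_property` is hypothetical, and `driver` is assumed to be a Selenium WebDriver.

```python
def set_css_property(driver, element, prop, value):
    """Set one inline CSS property on an element via the driver's JavaScript bridge."""
    # The element is passed as an argument and appears in JavaScript as
    # arguments[0]; the property name and value follow as arguments[1] and [2].
    script = "arguments[0].style.setProperty(arguments[1], arguments[2]);"
    return driver.execute_script(script, element, prop, value)
```

With the `driver` and `elem` from the code above, the call would be `set_css_property(driver, elem, "visibility", "visible")`.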
I've spent the last hour trying to delete an element, without any success. The element can only be reached by its class name. I tried:
js = "var aa=document.getElementsByClassName('classname')[0];aa.parentNode.removeChild(aa)"
driver.execute_script(js)
The error I get is that parentNode is undefined.
So what is the best way to delete an element using Selenium?
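A possible reading of the error is that `getElementsByClassName(...)[0]` returned nothing at the moment the script ran (for example because the element was not in the current document yet), so there was no node to read `parentNode` from. The sketch below guards against that case. `remove_first_by_class` and `REMOVE_FIRST_BY_CLASS` are hypothetical names, and `driver` is assumed to be a Selenium WebDriver.

```python
# JavaScript executed in the page; arguments[0] is the class name passed from Python.
REMOVE_FIRST_BY_CLASS = """
var nodes = document.getElementsByClassName(arguments[0]);
if (nodes.length === 0) { return false; }
nodes[0].parentNode.removeChild(nodes[0]);
return true;
"""


def remove_first_by_class(driver, class_name):
    """Remove the first element with the given class; return False if none was found."""
    return bool(driver.execute_script(REMOVE_FIRST_BY_CLASS, class_name))
```

A `False` result here would suggest looking at timing or frames rather than at the removal code itself.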
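I've read most of the python/cron questions on Stack Overflow, but I still can't get my script to run. I understand that I need to run my script through the shell (I'm using zsh and ipython, by the way), but I really don't know how to do that :/
My simple code:
In the crontab -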
*/1 * * * * ipython /home/usr/Data/progs/cron_test.py
My Python script -
import pickle
from selenium import webdriver
driver = webdriver.Firefox()
driver.get('http://www.google.com')
t=driver.current_url
pickle.dump(t,open('noreal','wb'))
I've tried a few things, but to no avail:
#!python ../python etc
SHELL = /usr/bin/zsh
PATH =/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
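Only because I don't know what the problem is, I'm assuming it has to do with running the Python script through an interpreter, but I don't really know what I'm doing :)
A working solution would be great, and I'd really appreciate an explanation along with any solution, so that I understand the why and the how and not just "it works! thanks! bye!"
Thanks for the help!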
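Since cron starts jobs with a much smaller environment than an interactive zsh session, one way to see what the script actually gets is to record it from inside the job. This is only a diagnostic sketch: `cron_environment_report` is a hypothetical helper, and `firefox` in the default command list is an assumption based on the script above.

```python
import os
import shutil


def cron_environment_report(env=None, commands=("ipython", "firefox")):
    """Collect the settings that usually differ between cron and a login shell."""
    env = os.environ if env is None else env
    report = {
        "PATH": env.get("PATH", ""),
        # A GUI browser needs a display; under cron this is normally unset.
        "DISPLAY": env.get("DISPLAY"),
    }
    for command in commands:
        # None means the command cannot be found on this PATH.
        report[command] = shutil.which(command, path=env.get("PATH"))
    return report


if __name__ == "__main__":
    for key, value in cron_environment_report().items():
        print(key, "=", value)
```

Running this once from the terminal and once from the crontab entry, with the output redirected to a file, gives two reports that can be compared line by line.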
Update: So far I've narrowed the problem down, and Python now runs with the following setting:
*/3 * * * * /usr/local/bin/ipython /home/user/Data/progs/RF/cron_test.py
I get a traceback:
---------------------------------------------------------------------------
WebDriverException Traceback (most recent call last)
/usr/local/lib/python2.7/dist-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
176 else:
177 filename = fname
--> 178 __builtin__.execfile(filename, *where)

/home/user/Data/progs/FB/cron_test.py in <module>()
9
10
---> …
I'm trying to run my tests with the Selenium Docker image. I have a local grunt server running on port 9000, and I started the following Selenium container:
docker run -d -p 4444:4444 -p 5900:5900 selenium/standalone-chrome-debug
Then I started my tests (using Capybara) and opened VNC to watch them run, but all I get is the Chrome message "This site can't be reached".
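A quick way to narrow down a "This site can't be reached" message is to test plain TCP reachability of the server address from the place where the browser runs. Inside a container, `localhost` refers to the container itself, not to the machine running grunt. A minimal sketch in Python (the helper name `can_connect` and the example host are assumptions):

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # 9000 is the grunt port from the question; the host is a placeholder.
    print(can_connect("localhost", 9000))
```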
cabybara.rb:
isWindows = (/cygwin|mswin|mingw|bccwin|wince|emx/ =~ RUBY_PLATFORM) != nil
require 'capybara/rspec'
require 'capybara'
require 'capybara/dsl'
require_relative 'sinatra_proxy'
require 'selenium/webdriver'
require 'selenium/webdriver/remote/http/curb' if !isWindows
Capybara.register_driver :selenium_chrome do |app|
http_client = isWindows ? nil : Selenium::WebDriver::Remote::Http::Curb.new
options = {
http_client: http_client,
browser: :chrome,
# service_log_path: 'chromedriver.out', # Enable Selenium logs
switches: ["--disable-web-security", '--user-agent="Chrome under Selenium for Capybara"']
}
options[:url] = "http://172.17.0.2:4444/wd/hub"
Capybara::Selenium::Driver.new app, options
end
Capybara.default_driver = :selenium_chrome
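Capybara.app = …
I usually run my specs from inside IntelliJ. I removed my gems and reinstalled them with bundle install (because of another error), and now I get an error when I try to run the specs.
I noticed that when IntelliJ runs the specs it uses: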
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
This is not the case when running from the shell (I put a print in the kernel_require script to check).
I also see that the Ruby version reported from IntelliJ is:
"ruby 2.2.4: 230"
From the shell:
ruby -e 'print "ruby #{ RUBY_VERSION }p#{ RUBY_PATCHLEVEL }"'
ruby 2.2.6p396%
The error:
/home/user/.rvm/rubies/ruby-2.2.4/bin/ruby -e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) /home/user/.rvm/gems/ruby-2.2.4/bin/rspec /home/user/workspace/auto-test/spec/pools/pool_cg_view_spec.rb --require teamcity/spec/runner/formatter/teamcity/formatter --format Spec::Runner::Formatter::TeamcityFormatter
Testing started at 10:21 ...
/home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require': incompatible library version - /home/user/.rvm/gems/ruby-2.2.4/gems/nokogiri-1.6.8/lib/nokogiri/nokogiri.so (LoadError)
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /home/user/.rvm/gems/ruby-2.2.4/gems/nokogiri-1.6.8/lib/nokogiri.rb:32:in `rescue in <top (required)>'
from /home/user/.rvm/gems/ruby-2.2.4/gems/nokogiri-1.6.8/lib/nokogiri.rb:28:in `<top (required)>'
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /home/user/.rvm/gems/ruby-2.2.4/gems/capybara-2.7.1/lib/capybara.rb:3:in `<top (required)>'
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /home/user/.rvm/gems/ruby-2.2.4/gems/capybara-2.7.1/lib/capybara/dsl.rb:2:in `<top (required)>'
from /home/user/.rvm/rubies/ruby-2.2.4/lib/ruby/site_ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in …
I'm trying to collect Chrome browser logs: warnings issued by the browser itself, such as deprecations and interventions. For example, for the site https://uriyaa.wixsite.com/corvid-cli2:
A cookie associated with a cross-site resource at http://wix.com/ was set without the `SameSite` attribute.
A future release of Chrome will only deliver cookies with cross-site requests if they are set with `SameSite=None` and `Secure`.
You can review cookies in developer tools under Application>Storage>Cookies and see more details at https://www.chromestatus.com/feature/5088147346030592 and https://www.chromestatus.com/feature/5633521622188032.
I thought the code below would do the trick, but it only captures logs generated by the page's own code.
(async ()=> {
const browser = await puppeteer.launch({dumpio: true});
const page = await browser.newPage();
page.on('console', msg => {
for (let i = …
I'm trying to collect data about failed requests and JS errors.
I'm using the following site: https://nitzani1.wixsite.com/marketing-automation/3rd-page
The site requests https://api.fixer.io/1latest, which returns status code 404.
The page also contains the following JS error:
"Uncaught (in promise) Fetch did not succeed"
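I tried writing the code below to capture both the 404 and the JS error, but couldn't. I'm not sure what I'm doing wrong; any ideas on how to fix it?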
const puppeteer = require('puppeteer');
function wait (ms) {
return new Promise(resolve => setTimeout(() => resolve(), ms));
}
var run = async () => {
const browser = await puppeteer.launch({
headless: false,
args: ['--start-fullscreen']
});
page = await browser.newPage();
page.on('error', err=> {
console.log('err: '+err);
});
page.on('pageerror', pageerr=> {
console.log('pageerr: '+pageerr);
});
page.on('requestfailed', err => console.log('requestfailed: '+err));
collectResponse = [];
await page.on('requestfailed', rf => { …
I get the following error when trying to push to the GitLab registry using gitlab-runner:
unauthorized: authentication required
ERROR: Build failed: exit status 1
Although:
$ docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN git.COMPANY.com
Login Succeeded
Building and pushing from my local environment works fine, which suggests the problem is related to the host the runner runs on (gitlab-ci3), or perhaps to the user being used:
$ echo $USER
gitlab-runner
which is in the groups:
docker:x:999:gitlab-runner
gitlab-runner:x:998:
I have already tried "docker unauthorized: authentication required - upon push with successful login", without success.
Perhaps the cause is that gitlab-runner has no permission to read root's config.json?
$ cat /root/.docker/config.json
cat: /root/.docker/config.json: Permission denied
Besides solving this problem, it would be very helpful if you could show me how to better debug this kind of error in the future.
I'm using GitLab Enterprise Edition 8.13.1-ee, Docker 1.12.3, gitlab-ci-multi-runner 1.7.1
GitLab output:
Running with gitlab-ci-multi-runner 1.7.1 (f896af7)
Using Shell executor...
Running on gitlab-ci3...
Fetching changes...
HEAD is …
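For the debugging part of the question, one thing that can be checked is which registries the current user actually has stored logins for. `docker login` writes to the client configuration of the user who ran it (by default `~/.docker/config.json`, or the directory named by `DOCKER_CONFIG`). The sketch below is a hypothetical helper, not a GitLab or Docker tool: the names `docker_config_path` and `registries_with_auth` are made up, and it only lists the keys of the `auths` section.

```python
import json
import os
from pathlib import Path


def docker_config_path(env=None):
    """Return the Docker client config path for the current user."""
    env = os.environ if env is None else env
    base = env.get("DOCKER_CONFIG") or os.path.join(os.path.expanduser("~"), ".docker")
    return Path(base) / "config.json"


def registries_with_auth(config_path):
    """List the registries that have an entry under "auths" in the given config file."""
    try:
        data = json.loads(Path(config_path).read_text())
    except (OSError, ValueError):
        # Missing file, unreadable file (e.g. permission denied) or invalid JSON.
        return []
    return sorted(data.get("auths", {}))


if __name__ == "__main__":
    path = docker_config_path()
    print(path, registries_with_auth(path))
```

Run as the gitlab-runner user on gitlab-ci3, this would show whether the registry host from the `docker login` step appears in that user's own configuration.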