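I need to scrape a website that requires a login. I am trying to create a session and log in, because I have to scrape several different pages after logging in. But for some reason it does not work.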
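import requests
from bs4 import BeautifulSoup

# Placeholders: the real URLs are not given in the question.
login_url = "https://example.com/login"
url = "https://example.com/page-after-login"

# Form fields sent with the login request
login_data = {
    "log": "login",
    "login": "my email",
    "password": "my password",
}

# Reuse one session so cookies set at login are sent with later requests
session = requests.Session()
session.post(login_url, data=login_data)

# Fetch a page that requires login and print its title
response = session.get(url)
html = response.text
soup = BeautifulSoup(html, "html.parser")
print(soup.title.get_text())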
The title that gets printed shows that the login did not work.
Here is the site's form.
<form method="post" id="signin-form" class="form-horizontal">
<input type="hidden" name="referer" value="" />
<div class="form-group">
<label for="email_text" class="col-sm-4 control-label">Your login (email):</label>
<div class="col-sm-8">
<input type="email" class="form-control" id="email_text" value="" name="login" autofocus data-validation='{"parent":".form-group","events":["keyup","blur"],"rules":[{"name":"notblank"},{"name":"email"}]}' />
</div>
</div>
<div class="form-group">
<label for="password_text" class="col-sm-4 control-label">Password:</label>
<div class="col-sm-8">
<input type="password" class="form-control" id="password_text" name="password" data-validation='{"parent":".form-group","rules":[{"name":"min","min":5}]}' …
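One way to narrow this down is to compare the field names the form actually contains with the keys sent in `login_data`. The following is only a minimal sketch under stated assumptions: `FormFieldCollector`, `form_fields`, and `FORM_HTML` are hypothetical names introduced for illustration, `FORM_HTML` is an abbreviated copy of the visible part of the form, and the standard-library `html.parser` is used instead of BeautifulSoup purely to keep the example dependency-free.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: collects name/value pairs of named <input> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here, because
        # the default handle_startendtag delegates to handle_starttag.
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            # A missing or valueless "value" attribute is treated as "".
            self.fields[name] = attributes.get("value") or ""


def form_fields(html):
    """Return a dict mapping input names to their default values."""
    collector = FormFieldCollector()
    collector.feed(html)
    return collector.fields


# Assumption: abbreviated copy of the form, limited to the inputs that are
# visible in the question (the original markup is cut off after the password).
FORM_HTML = """
<form method="post" id="signin-form" class="form-horizontal">
<input type="hidden" name="referer" value="" />
<input type="email" id="email_text" value="" name="login" />
<input type="password" id="password_text" name="password" />
</form>
"""

login_data = {"log": "login", "login": "my email", "password": "my password"}

fields = form_fields(FORM_HTML)
missing = sorted(set(fields) - set(login_data))  # in form, not in payload
extra = sorted(set(login_data) - set(fields))    # in payload, not in form

print("form fields:", sorted(fields))
print("in form but not in payload:", missing)
print("in payload but not in form:", extra)
```

With the abbreviated form above, this reports `referer` as present in the form but absent from the payload, and `log` as sent without a visible matching input. Because the form shown in the question is cut off, `log` may still correspond to a field in the missing part, so the result is a hint for what to check rather than a diagnosis.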