I can't log in to Instagram with Requests

1 python instagram python-requests

I have been trying to log in to Instagram using the Requests library, but I can't get it to work. The connection is always refused.

import requests

#Creating URL, usr/pass and user agent variables

BASE_URL = 'https://www.instagram.com/'
LOGIN_URL = BASE_URL + 'accounts/login/ajax/'
USERNAME = '******'
PASSWD = '******'
USER_AGENT = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)\
 Chrome/59.0.3071.115 Safari/537.36'

#Setting the User-Agent and Referer headers
session = requests.Session()
session.headers = {'user-agent': USER_AGENT}
session.headers.update({'Referer': BASE_URL})


try:
    #Requesting the base url. Grabbing and inserting the csrftoken

    req = session.get(BASE_URL)
    session.headers.update({'X-CSRFToken': req.cookies['csrftoken']})
    login_data = {'username': USERNAME, 'password': PASSWD}

    #Finally logging in
    login = session.post(LOGIN_URL, data=login_data, allow_redirects=True)
    session.headers.update({'X-CSRFToken': login.cookies['csrftoken']})

    cookies = login.cookies

    #Print the html results after I've logged in
    print(login.text)

#In case of refused connection
except requests.exceptions.ConnectionError:
    print("Connection refused")

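I don't know what I'm doing wrong. I would appreciate it if anyone could post a solution. Please don't recommend the API or Selenium (they are not an option for me right now).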

小智 9

Because Requests does not execute JavaScript, there is no csrftoken in your cookies.
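One way to confirm this without triggering a KeyError is to look the cookie up with `.get()` instead of indexing. This is only a minimal sketch: `csrf_from_cookies` is a hypothetical helper name, and it takes any mapping, so the `cookies` attribute of a Requests response should work as well as a plain dict.

```python
from typing import Mapping, Optional


def csrf_from_cookies(cookies: Mapping[str, str]) -> Optional[str]:
    """Return the 'csrftoken' cookie value, or None if the cookie is absent."""
    # .get() avoids the KeyError that cookies['csrftoken'] raises when missing
    return cookies.get("csrftoken")


# Plain dicts standing in for a cookie jar (illustrative values only)
print(csrf_from_cookies({"mid": "xyz"}))
print(csrf_from_cookies({"csrftoken": "abc123"}))
```

If this returns None for the response of the first GET, the token has to come from somewhere else, such as the page HTML.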

If you look at the page content, you can find csrf_token in the HTML.

Using bs4 and json you can extract it and use it in your POST request.

from bs4 import BeautifulSoup
import json, random, re, requests

BASE_URL = 'https://www.instagram.com/accounts/login/'
LOGIN_URL = BASE_URL + 'ajax/'

headers_list = [
        "Mozilla/5.0 (Windows NT 5.1; rv:41.0) Gecko/20100101"\
        " Firefox/41.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2)"\
        " AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2"\
        " Safari/601.3.9",
        "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0)"\
        " Gecko/20100101 Firefox/15.0.1",
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"\
        " (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36"\
        " Edge/12.246"
        ]


USERNAME = '****'
PASSWD = '*****'
USER_AGENT = random.choice(headers_list)

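# Start a session with a browser-like User-Agent and a Referer header
session = requests.Session()
session.headers = {'user-agent': USER_AGENT}
session.headers.update({'Referer': BASE_URL})

# Fetch the login page; the CSRF token is embedded in an inline script
req = session.get(BASE_URL)
soup = BeautifulSoup(req.content, 'html.parser')
body = soup.find('body')

# Locate the <script> tag that assigns window._sharedData
pattern = re.compile('window._sharedData')
script = body.find("script", string=pattern)

# Drop the JavaScript assignment and the trailing ';' to leave plain JSON
script = script.get_text().replace('window._sharedData = ', '')[:-1]
data = json.loads(script)

# Send the token in the X-CSRFToken header along with the credentials
csrf = data['config'].get('csrf_token')
login_data = {'username': USERNAME, 'password': PASSWD}
session.headers.update({'X-CSRFToken': csrf})
login = session.post(LOGIN_URL, data=login_data, allow_redirects=True)
print(login.content)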
# b'{"authenticated": true, "user": true, "userId": "*******", "oneTapPrompt": false, "status": "ok"}'
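If you would rather not depend on bs4 for this one step, the same extraction can be sketched with only the standard library. This assumes the page still embeds the token as `window._sharedData = {...};` inside a script tag, as described above; `extract_csrf_token` and `sample_html` are hypothetical names used for illustration, and the sample markup is made up rather than copied from Instagram.

```python
import json
import re
from typing import Optional

# Captures the JSON object assigned to window._sharedData in an inline script.
# The page layout is an assumption; adjust the pattern if the markup differs.
_SHARED_DATA_RE = re.compile(
    r"window\._sharedData\s*=\s*(\{.*?\})\s*;\s*</script>", re.DOTALL
)


def extract_csrf_token(html: str) -> Optional[str]:
    """Return config.csrf_token from the embedded JSON, or None if not found."""
    match = _SHARED_DATA_RE.search(html)
    if match is None:
        return None
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    config = data.get("config")
    if not isinstance(config, dict):
        return None
    return config.get("csrf_token")


# Made-up markup that mimics the structure described in the answer
sample_html = (
    '<html><body><script type="text/javascript">'
    'window._sharedData = {"config": {"csrf_token": "abc123"}};'
    "</script></body></html>"
)
print(extract_csrf_token(sample_html))
```

Returning None instead of raising lets the caller decide what to do when the token is missing, for example stopping before the POST is sent.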

Keep in mind that most of the data on Instagram is loaded with JavaScript, so you may run into more trouble later on.

You can refer to this post on how to retrieve the data: https://stackoverflow.com/a/49831347

Or you can use a different library, such as dryscrape or spynner.