How do I get the redirect link of an anchor tag that is built with JavaScript?

Hoo*_*ini 5 python scrapy python-requests

I have a link that redirects to an external website, and I want to know the final URL it redirects to. I tried:

requests.get("link.which.redirects.and.has.dynamic.js.code.com")

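But I can't get the final redirected URL this way, because it is built dynamically. I'm not sure exactly what happens, but loading the page involves some JavaScript code, and the end result is a redirect to an external page.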

So instead, I tried Selenium with ChromeDriverManager:

import time

import scrapy
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager

class MySpider(scrapy.Spider):
    name = 'my_spider'

    def __init__(self):
        self.driver = webdriver.Chrome(ChromeDriverManager().install())

    def parse(self, response):
        link = "link.which.redirects.and.has.dynamic.js.code.com"
        self.driver.get(link)
        time.sleep(1) # without this wait, driver.current_url is not the final redirect
        url = self.driver.current_url

The code above loads the entire page just to get the redirect URL. Is there a more efficient way to get it?

Abh*_*wal 0

To find the final URL, you can use the following code:

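import requests

# requests follows HTTP (3xx) redirects by default, but it does not execute
# JavaScript, so a redirect performed by a script in the page is not followed.
website = requests.get("link.which.redirects.and.has.dynamic.js.code.com")

print(website.url)  # final URL after any HTTP redirects
print(website.history)  # list of the intermediate redirect responses
print(website.is_redirect)  # whether this final response is itself a redirect
print(website.is_permanent_redirect)  # whether it is a permanent (301/308) redirect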
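If the redirect is done client-side, `website.url` will still point at the intermediate page. One lighter alternative to a full browser, sketched here under a strong assumption, is to look for the target in the fetched HTML. This only works when the target URL appears literally in the page source (a meta refresh tag or a simple `location` assignment); if the script computes the URL at runtime, a real browser is still needed. The helper name `find_client_side_redirect` and the two patterns are illustrative, not part of `requests` or Scrapy, and the patterns are not exhaustive.

```python
import re
from typing import Optional
from urllib.parse import urljoin

# Two common client-side redirect styles (illustrative, not exhaustive).
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\'][^"\']*?url=([^"\'>\s]+)',
    re.IGNORECASE,
)
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*=\s*["\']([^"\']+)["\']'
    r'|location\.(?:replace|assign)\(\s*["\']([^"\']+)["\']\s*\)',
    re.IGNORECASE,
)


def find_client_side_redirect(html: str, base_url: str) -> Optional[str]:
    """Return the redirect target found in the HTML, or None if none is found.

    Hypothetical helper: it only sees URLs written as string literals.
    """
    match = META_REFRESH.search(html)
    if match:
        # Resolve relative targets against the page's own URL.
        return urljoin(base_url, match.group(1))
    match = JS_LOCATION.search(html)
    if match:
        # Group 1 is the assignment form, group 2 the replace()/assign() form.
        target = match.group(1) or match.group(2)
        return urljoin(base_url, target)
    return None
```

With the response from the snippet above, this could be called as `find_client_side_redirect(website.text, website.url)`; a `None` result means the page has to be rendered in a browser after all.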