如何在链接到事件点击时从网站上抓取数据?

kee*_*aar 8 python extract scrapy web-scraping

我想从Tripadvisor.com网页上搜索/提取公司/酒店的网站.当我检查页面时,我没有看到网站的网址.关于如何使用python提取网站URL的任何想法?提前道歉,因为我最近刚刚开始"在Python中进行网络抓取".谢谢.

例如,请参见图中的两个红色箭头.当我选择网站链接时,它会将我带到' http://www.i-love-my-india.com/ ' - 这是我想用Python提取的内容.

Tripadvisor网址 在此输入图像描述

小智 7

使用Selenium尝试这个:

import time
from selenium import webdriver

browser = webdriver.Firefox(executable_path="C:\\Users\\Vader\\geckodriver.exe")
# Must install geckodriver (handles your browser)- see instructions on
# http://selenium-python.readthedocs.io/installation.html.
# Change the path to where your geckodriver file is.

browser.get('https://www.tripadvisor.co.uk/Attraction_Review-g304551-d4590508-Reviews-Ashok_s_Taxi_Tours-New_Delhi_National_Capital_Territory_of_Delhi.html')
browser.find_element_by_css_selector('.blEntry.website').click()

#browser.window_handles # Results is 2 tabs opened. 

browser.switch_to.window(browser.window_handles[1]) # changes the browser to 
                                                    # the second one

time.sleep(1) # When I went directly I was getting a 'blank' result, so I put
              # a little delay and it worked (I really do not know why).

res = browser.current_url # the URL

print(res)

browser.quit() # Closes the browser
Run Code Online (Sandbox Code Playgroud)