kee*_*aar 8 python extract scrapy web-scraping
我想从Tripadvisor.com网页上搜索/提取公司/酒店的网站.当我检查页面时,我没有看到网站的网址.关于如何使用python提取网站URL的任何想法?提前道歉,因为我最近刚刚开始"在Python中进行网络抓取".谢谢.
例如,请参见图中的两个红色箭头.当我选择网站链接时,它会将我带到' http://www.i-love-my-india.com/ ' - 这是我想用Python提取的内容.
小智 7
使用Selenium尝试这个:
import time
from selenium import webdriver
browser = webdriver.Firefox(executable_path="C:\\Users\\Vader\\geckodriver.exe")
# Must install geckodriver (handles your browser)- see instructions on
# http://selenium-python.readthedocs.io/installation.html.
# Change the path to where your geckodriver file is.
browser.get('https://www.tripadvisor.co.uk/Attraction_Review-g304551-d4590508-Reviews-Ashok_s_Taxi_Tours-New_Delhi_National_Capital_Territory_of_Delhi.html')
browser.find_element_by_css_selector('.blEntry.website').click()
#browser.window_handles # Results is 2 tabs opened.
browser.switch_to.window(browser.window_handles[1]) # changes the browser to
# the second one
time.sleep(1) # When I went directly I was getting a 'blank' result, so I put
# a little delay and it worked (I really do not know why).
res = browser.current_url # the URL
print(res)
browser.quit() # Closes the browser
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
2414 次 |
| 最近记录: |