Download all files from a website

Question

Bha*_*ath 9 python r download webclient-download

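I need to download all the files under links like the one below, where only the suburb name changes from one link to the next.

Just for reference: https://www.data.vic.gov.au/data/dataset/2014-town-and-community-profile-for-thornbury-suburb

All the files are listed under this search link: https://www.data.vic.gov.au/data/dataset?q=2014+town+and+community+profile

Is there any way to do this?

Thanks :)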

Answer 1

nar*_*ren 14

You can download a file like this:

# Python 3: urllib2 was merged into urllib.request
from urllib.request import urlopen

response = urlopen('http://www.example.com/file_to_download')
html = response.read()  # raw bytes of the response body

To get all the links in a page:

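import requests
from bs4 import BeautifulSoup

r = requests.get("http://site-to.crawl")
r.raise_for_status()  # stop early on an HTTP error
# Name the parser explicitly; otherwise bs4 guesses one and emits a warning
soup = BeautifulSoup(r.text, "html.parser")

# Print the href of every <a> tag on the page
for link in soup.find_all('a'):
    print(link.get('href'))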
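
Putting the two steps together for the question's case, here is a minimal sketch that collects the links on a page, keeps only those matching a pattern, and saves each one to disk. It uses only the standard library (`html.parser` and `urllib`) in place of BeautifulSoup and requests, so it runs without extra installs. The names `LinkCollector`, `extract_links`, `download`, and `download_all` are hypothetical helpers, not part of any library, and the `must_contain` filter assumes the wanted links share a common URL fragment, as the suburb pages in the question appear to. The real page structure of data.vic.gov.au is not verified here.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links(html, base_url, must_contain=""):
    """Return absolute, de-duplicated links whose URL contains must_contain."""
    parser = LinkCollector()
    parser.feed(html)
    seen, links = set(), []
    for href in parser.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative hrefs
        if must_contain in absolute and absolute not in seen:
            seen.add(absolute)
            links.append(absolute)
    return links


def download(url, dest_dir="downloads"):
    """Save the resource at url into dest_dir and return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    # Assumption: the last path segment is a usable file name.
    name = os.path.basename(urlparse(url).path) or "index.html"
    path = os.path.join(dest_dir, name)
    with urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
    return path


def download_all(page_url, must_contain, dest_dir="downloads"):
    """Fetch page_url, then download every matching link it contains."""
    with urlopen(page_url) as response:
        page = response.read().decode("utf-8", errors="replace")
    links = extract_links(page, page_url, must_contain)
    return [download(link, dest_dir) for link in links]
```

For example, `extract_links(page, search_url, "2014-town-and-community-profile")` would keep only the per-suburb links from the search page. If those links lead to dataset pages rather than to the files themselves, the same function can be applied again to each dataset page with a filter that matches the file links. Search results spread over several pages would need one pass per page, and files that share a name will overwrite each other in `dest_dir`.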
