Posted by Bry*_*Bry

Get all the links of a website

Hi, I want to build a mini crawler without using Scrapy.

I came up with something like this:

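import requests
from bs4 import BeautifulSoup

# url holds the homepage address (defined elsewhere)
response = requests.get(url)
homepage_link_list = []
soup = BeautifulSoup(response.content, 'lxml')
# Collect every href found on the homepage
for link in soup.find_all("a"):
    if link.get("href"):
        homepage_link_list.append(link.get("href"))


# Visit each homepage link and collect the hrefs found on those pages
link_list = []
for item in homepage_link_list:
    response = requests.get(item)
    soup = BeautifulSoup(response.content, 'lxml')
    for link in soup.find_all("a"):
        if link.get("href"):
            link_list.append(link.get("href"))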


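The problem is that this only collects the links found on the pages linked from the homepage, so it goes just one level deep. How can I make it follow every link on the site and collect all of the links it finds?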
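To make the goal concrete, here is a rough sketch of the kind of behaviour I am after: a breadth-first crawl that keeps a queue of pages to visit and a set of URLs already seen, so each page is fetched once and the loop continues until no new same-site links turn up. Everything named here (`crawl`, `fetch_html`, `LinkParser`, `max_pages`) is hypothetical and not part of my code above. The sketch also swaps BeautifulSoup for the standard library's `HTMLParser` purely to stay dependency-free; `soup.find_all("a")` should work just as well for the extraction step. Treat it as an illustration under those assumptions, not a finished crawler.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Default fetcher (assumption: the requests package is installed)."""
    import requests
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def crawl(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl; returns the set of same-site URLs discovered."""
    site = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    fetched = 0
    # max_pages is a hypothetical safety cap on the number of fetch attempts
    while queue and fetched < max_pages:
        page = queue.popleft()
        fetched += 1
        try:
            html = fetch(page)
        except Exception:
            continue  # skip pages that fail to load
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links against the current page, drop #fragments
            link, _ = urldefrag(urljoin(page, href))
            # Only follow links on the same host that have not been seen yet
            if urlparse(link).netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen
```

Calling `crawl("https://example.com/")` (a placeholder address) would return a set of absolute URLs on that host. The `seen` set is what prevents the crawl from looping forever when pages link back to each other, and the host check keeps it from wandering off to external sites.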

python beautifulsoup web-scraping python-requests
