
How do I use Wget to download all images from a URL into a single folder?

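I use wget to download all the images from a website. It works, but it recreates the site's original hierarchy with all of its subfolders, so the images end up scattered across them. Is there a way to make it download all the images into a single folder? The syntax I currently use is: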

wget -r -A jpeg,jpg,bmp,gif,png http://www.somedomain.com
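One way to flatten the output, offered as a sketch and not as a confirmed answer from this thread: wget's `-nd` (`--no-directories`) option stops it from recreating the remote directory hierarchy, and `-P` (`--directory-prefix`) sets the folder that files are saved into. The helper below only assembles the argument list. The function name `build_wget_command` and the folder `images` are hypothetical.

```python
import subprocess


def build_wget_command(url, dest_dir, extensions):
    """Build a wget argument list that saves matching files into one flat folder."""
    return [
        "wget",
        "-r",                        # recurse through the site, as in the original command
        "-nd",                       # --no-directories: do not recreate the remote hierarchy
        "-P", dest_dir,              # --directory-prefix: folder every file is saved into
        "-A", ",".join(extensions),  # accept list of file suffixes
        url,
    ]


if __name__ == "__main__":
    cmd = build_wget_command(
        "http://www.somedomain.com",
        "images",  # hypothetical destination folder
        ["jpeg", "jpg", "bmp", "gif", "png"],
    )
    print(" ".join(cmd))
    # Uncomment to actually run the download (requires wget on PATH):
    # subprocess.run(cmd, check=True)
```

The equivalent shell command would be `wget -r -nd -P images -A jpeg,jpg,bmp,gif,png http://www.somedomain.com`. With `-nd`, files that share a name are saved with a numeric suffix instead of overwriting each other.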


wget

geo*_*310

2018 03-04

129
votes

5
answers

240k
views

Python script that uses BeautifulSoup to download all images from a website into a specified folder

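I found this post and wanted to modify the script slightly so that it downloads the images into a specific folder. My edited file looks like this: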

import re
import requests
from bs4 import BeautifulSoup
import os

site = 'http://pixabay.com'
directory = "pixabay/" #Relative to script location

response = requests.get(site)

soup = BeautifulSoup(response.text, 'html.parser')
img_tags = soup.find_all('img')

urls = [img['src'] for img in img_tags]

for url in urls:
    #print(url)
    filename = re.search(r'/([\w_-]+[.](jpg|gif|png))$', url)

    with open(os.path.join(directory, filename.group(1)), 'wb') as f:
        if 'http' not in url:
            url = '{}{}'.format(site, url)
        response = requests.get(url)
        f.write(response.content)
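The script above has a few fragile spots. `re.search` returns `None` for any `src` that does not end in `.jpg`, `.gif`, or `.png` (for example one with a query string), so `filename.group(1)` raises `AttributeError`. Relative `src` values are joined to the site by plain string formatting, and `open` fails if the `pixabay/` directory does not exist. Below is a minimal sketch of two helpers that address the first two points using only the standard library. The names `resolve_image_url` and `image_filename` are hypothetical and not part of the original script.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

# The original regex accepted jpg, gif, and png; .jpeg is added here as an assumption.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png"}


def resolve_image_url(page_url, src):
    """Turn a relative, protocol-relative, or absolute src into an absolute URL."""
    return urljoin(page_url, src)


def image_filename(url):
    """Return the file name for an image URL, or None if it is not a known image type."""
    path = urlsplit(url).path             # drops any ?query or #fragment
    name = posixpath.basename(path)       # last path segment
    ext = posixpath.splitext(name)[1].lower()
    return name if ext in IMAGE_EXTENSIONS else None


if __name__ == "__main__":
    url = resolve_image_url("http://pixabay.com", "/static/img/logo.png")
    print(url, image_filename(url))
```

In the download loop, one could call `os.makedirs(directory, exist_ok=True)` once before iterating, skip any URL for which `image_filename` returns `None`, and open the output file only after the request has succeeded.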


This seems to work fine for pixabay, but it does not seem to work when I try a different site such as imgur or heroimages. If I use

site = 'http://heroimages.com/portfolio'


nothing is downloaded. The print statement (when uncommented) prints nothing, so I guess it isn't finding any image tags? I'm not sure.

On the other hand, if I use …
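One way to test the guess that no image tags are found is to count the `<img>` tags in the HTML that `requests` actually received and check which of them carry a `src` attribute. A possible explanation, which is an assumption and not confirmed by the source, is that the page inserts its images with JavaScript or keeps the address in an attribute such as `data-src`. In either case, parsing the raw HTML yields no usable `src`. The sketch below uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra installs. `ImgTagCounter`, `summarize_images`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class ImgTagCounter(HTMLParser):
    """Collect the attributes of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.with_src = []     # src values that can be downloaded directly
        self.without_src = []  # attribute dicts of <img> tags that lack src

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("src"):
            self.with_src.append(attributes["src"])
        else:
            self.without_src.append(attributes)


def summarize_images(html):
    """Return (src values found, attributes of img tags without src)."""
    counter = ImgTagCounter()
    counter.feed(html)
    return counter.with_src, counter.without_src


if __name__ == "__main__":
    # Hypothetical markup: one plain image, one lazy-loaded image, one empty container.
    sample = '<img src="/a.png"><img data-src="/lazy.jpg"><div id="gallery"></div>'
    with_src, without_src = summarize_images(sample)
    print(len(with_src), "img tags with src")
    print(len(without_src), "img tags without src:", without_src)
```

Against a real page, the input would be `requests.get(site).text`. If both lists come back empty, the images are probably not present in the static HTML at all, and checking `response.status_code` would be a sensible next step.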

python image beautifulsoup request

Toj*_*j19

2018 06-28

3
votes

1
answer

8533
views
