Download files from a list if they have not already been downloaded

Question

I can do this in C#, but the code is quite long.

It would be cool if someone could show me how to do this in Python.

The pseudocode is:

url: www.example.com/somefolder/filename1.pdf

1. load file into an array (file contains a url on each line)
2. if file e.g. filename1.pdf doesn't exist, download file


The script could use the following layout:

/python-downloader/
/python-downloader/dl.py
/python-downloader/urls.txt
/python-downloader/downloaded/filename1.pdf


Answer 1

Wol*_*lph 15

This should do the trick, although I am assuming that the urls.txt file contains only the URLs, without the url: prefix.

import os
import urllib  # Python 2 module layout; Python 3 moved urlretrieve to urllib.request

DOWNLOADS_DIR = '/python-downloader/downloaded'

# For every line in the file
with open('urls.txt') as urls:
    for url in urls:
        # Drop the trailing newline so it does not end up in the URL or file name
        url = url.strip()
        if not url:
            continue

        # Split on the rightmost / and take everything on the right side of that
        name = url.rsplit('/', 1)[-1]

        # Combine the name and the downloads directory to get the local filename
        filename = os.path.join(DOWNLOADS_DIR, name)

        # Download the file if it does not exist
        if not os.path.isfile(filename):
            urllib.urlretrieve(url, filename)


Instead of splitting on '/', use os.path.basename(url). (2 upvotes)
For Python 3, replace the last line with: urllib.request.urlretrieve(url, filename) (this also needs import urllib.request). (2 upvotes)
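
The basename suggestion in the comments could be sketched as follows for Python 3. The helper name local_name is hypothetical and not from the thread. Parsing the URL first is an added assumption: it drops any query string or fragment before taking the last path component, which plain os.path.basename(url) would not do.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def local_name(url):
    """Return the last path component of a URL, ignoring query and fragment.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # urlsplit separates the path from any '?query' or '#fragment' part
    path = urlsplit(url.strip()).path
    # Decode percent-escapes first, then keep only the part after the last '/'
    return posixpath.basename(unquote(path))
```

With a helper like this, the name = url.rsplit('/', 1)[-1] line could be replaced by name = local_name(url). A URL that ends in '/' yields an empty name, so a caller would probably want to skip that case.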

Answer 2

Bra*_*mos 8

Here is a slightly modified version of WoLpH's script for Python 3.3.

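#!/usr/bin/python3.3
import os.path
import urllib.request

# Read one URL per line; the with block closes the file afterwards.
# The 'downloads' directory is expected to exist already.
with open('links.txt', 'r') as links:
    for link in links:
        link = link.strip()
        if not link:
            continue  # skip blank lines
        name = link.rsplit('/', 1)[-1]
        filename = os.path.join('downloads', name)

        # Only fetch files that are not on disk yet
        if not os.path.isfile(filename):
            print('Downloading: ' + filename)
            try:
                urllib.request.urlretrieve(link, filename)
            except Exception as inst:
                print(inst)
                print('  Encountered unknown error. Continuing.')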

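
As a rough sketch, the same loop could be wrapped in a function so that it can be reused and tried out without network access. The function name download_missing and its fetch parameter are assumptions made for illustration, not part of either answer. By default fetch is urllib.request.urlretrieve, as in the script above.

```python
import os
import urllib.request


def download_missing(urls, dest_dir, fetch=urllib.request.urlretrieve):
    """Download each URL into dest_dir unless that file already exists.

    Hypothetical helper for illustration. Returns the list of local paths
    that were newly downloaded.
    """
    # Create the target directory if needed (exist_ok requires Python 3.2+)
    os.makedirs(dest_dir, exist_ok=True)
    downloaded = []
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines
        name = url.rsplit('/', 1)[-1]
        if not name:
            continue  # URL ends in '/', so there is no file name to use
        filename = os.path.join(dest_dir, name)
        if os.path.isfile(filename):
            continue  # already downloaded
        try:
            fetch(url, filename)
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            print('Failed to download {}: {}'.format(url, exc))
        else:
            downloaded.append(filename)
    return downloaded
```

It could be called as download_missing(open('links.txt'), 'downloads'), since iterating over a file object yields one line at a time. Passing a different fetch callable is only a convenience for testing; the download logic itself is unchanged from the answer.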
