Downloading the contents of an online directory with Python urllib

dav*_*upt 7 python directory urllib urllib2 python-3.x

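I am trying to write a program that opens a directory, uses a regular expression to get the names of the PowerPoint files, then creates the files locally and copies their contents. When I run it, it seems to work, but when I actually try to open the files, they keep saying the version is wrong.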

from urllib.request import urlopen
import re

urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
string = urlpath.read().decode('utf-8')

pattern = re.compile('ch[0-9]*.ppt') #the pattern actually creates duplicates in the list

filelist = pattern.findall(string)
print(filelist)

for filename in filelist:
    remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
    localfile = open(filename,'wb')
    localfile.write(remotefile.read())
    localfile.close()
    remotefile.close()

app*_*e16 9

This code works for me. I only modified it slightly, because each of your ppt files was being listed twice.

from urllib2 import urlopen  # Python 2; on Python 3 use: from urllib.request import urlopen
import re

urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
string = urlpath.read().decode('utf-8')

# The trailing quote matches only the href="..." occurrence of each name,
# so the link text no longer produces a duplicate entry. The dot is escaped
# so it matches a literal '.' rather than any character.
pattern = re.compile(r'ch[0-9]*\.ppt"')

filelist = pattern.findall(string)
print(filelist)

for filename in filelist:
    filename = filename[:-1]  # strip the trailing quote captured by the pattern
    remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
    localfile = open(filename, 'wb')
    localfile.write(remotefile.read())
    localfile.close()
    remotefile.close()
Run Code Online (Sandbox Code Playgroud)
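To see why the original pattern lists every file twice, here is a minimal offline sketch. It assumes the directory page is a typical auto-generated listing in which each file name appears once in the href attribute and once as the link text; SAMPLE_HTML and find_ppt_links are hypothetical names made up for illustration, and the real page markup may differ.

```python
import re

# Hypothetical excerpt of a directory listing (an assumption, not the real page).
SAMPLE_HTML = (
    '<a href="ch1.ppt">ch1.ppt</a>\n'
    '<a href="ch2.ppt">ch2.ppt</a>\n'
    '<a href="ch10.ppt">ch10.ppt</a>\n'
)


def find_ppt_links(html):
    """Return each href target of the form chN.ppt once, in page order."""
    # Capture only the href value and escape the dot, so the link text
    # that repeats the file name is not matched a second time.
    pattern = re.compile(r'href="(ch[0-9]+\.ppt)"')
    unique = []
    for name in pattern.findall(html):
        if name not in unique:
            unique.append(name)
    return unique


# The pattern from the question matches both the href and the link text.
naive = re.findall('ch[0-9]*.ppt', SAMPLE_HTML)
print(naive)
print(find_ppt_links(SAMPLE_HTML))
```

Using a capture group means there is no trailing quote to strip afterwards, which is an alternative to the `filename[:-1]` step above rather than a correction of it.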