I am writing a simple script that downloads TEDTalks as .mp4 files, given a list of TEDTalk website links:
# Run through a list of TEDTalk website links and download each
# TEDTalk in high quality MP4
import urllib.request

#List of website links
l = [
    "http://www.ted.com/index.php/talks/view/id/28",
    "http://www.ted.com/index.php/talks/view/id/29",
]

# Function which takes the location of the string "-480p.mp4",
# d = 1 less that location, and a string and returns the
# full movie download link
def findFullURL(d, e, s):
    a = s[d]
    if a != "/":
        #Subtract from d to move back another letter
        d = d - 1
        findFullURL(d, e, s)
    else:
        fullURL = "http://download.ted.com/talks/" + s[(d+1):e] + "-480p.mp4"
        #print(fullURL)
        return fullURL

#Iterate through a list of links to download each movie
def iterateList(l):
    for x in l:
        #get the HTML
        f = urllib.request.urlopen(x)
        #Convert the HTML file into a string
        s = str(f.read(), "utf-8")
        f.close()
        #Find the location in the string where the interesting bit ends
        e = s.find("-480p.mp4")
        d = e - 1
        #The problem is with this variable url:
        url = findFullURL(d, e, s)
        print("Downloading " + url)
        #TODO: Download the file
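I am sure the function findFullURL works. If you uncomment the print(fullURL) line at the end of findFullURL, you will see that it outputs the download link exactly as I need it.
However, in iterateList, where I try to capture that string with url = findFullURL(d, e, s), the variable url seems to take on the value None. I don't understand this at all. It should be as simple as the following example, which works when I try it in the interpreter: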
def hello():
    return "Hello"

url = hello()
print(url)
Mar*_*ers 10
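"I am sure the function findFullURL works."
Being sure that a piece of code works is the best way to waste hours of debugging time looking in the wrong place.
In fact, the function does not work. You are missing a return in the recursive branch, so every call that recurses falls off the end of the function and returns None: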
def findFullURL(d, e, s):
    a = s[d]
    if a != "/":
        #Subtract from d to move back another letter
        d = d - 1
        return findFullURL(d, e, s) # <<<<<<< here
    else:
        fullURL = "http://download.ted.com/talks/" + s[(d+1):e] + "-480p.mp4"
        #print(fullURL)
        return fullURL
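
To see the effect in isolation, here is a minimal sketch that has nothing to do with TED. The two function names are made up for illustration. Both functions recurse down to the same base case, but only the second one passes the base case's value back up the call chain:

```python
def countdown_missing_return(n):
    # Hypothetical example: the recursive call's result is discarded,
    # so this branch implicitly returns None.
    if n > 0:
        countdown_missing_return(n - 1)
    else:
        return "done"


def countdown_with_return(n):
    # Same logic, but each level returns what the deeper call returned.
    if n > 0:
        return countdown_with_return(n - 1)
    else:
        return "done"


print(countdown_missing_return(0))  # base case hit directly: done
print(countdown_missing_return(3))  # None
print(countdown_with_return(3))     # done
```

This also explains why the print inside your function looked fine: the innermost call really does build the right string, it just never reaches the outermost caller.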
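Also, you shouldn't use recursion for this task. You can use rfind instead, which searches backwards for the last "/" before the given position: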
def findFullURL(d, e, s):
    d = s.rfind('/', 0, d + 1)
    # You probably want to handle the condition where '/' is not found here.
    return "http://download.ted.com/talks/" + s[(d+1):e] + "-480p.mp4"
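
If you want to handle the not-found cases, one possible sketch looks like the following. The name find_full_url, the choice to raise ValueError, and the sample HTML string are my assumptions, not something taken from the TED pages. Both str.find and str.rfind return -1 when nothing matches, so that is the value to check for:

```python
def find_full_url(s, suffix="-480p.mp4"):
    """Build the download link from page HTML (illustrative sketch)."""
    e = s.find(suffix)
    if e == -1:
        raise ValueError("no %r link found in page" % suffix)
    # Last "/" that occurs before the suffix; same search as rfind('/', 0, d + 1)
    # with d = e - 1.
    d = s.rfind("/", 0, e)
    if d == -1:
        raise ValueError("no '/' found before %r" % suffix)
    return "http://download.ted.com/talks/" + s[d + 1:e] + suffix


# Made-up HTML fragment, only to exercise the function.
sample = '<a href="/talks/download/video/SomeTalk_2006-480p.mp4">Download</a>'
print(find_full_url(sample))
```

Passing only the string keeps the caller from having to compute d and e itself, which removes one more place where the indices can go wrong.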