I installed pytube to extract captions from some YouTube videos. Both of the snippets below give me XML captions.
from pytube import YouTube
yt = YouTube('https://www.youtube.com/watch?v=4ZQQofkz9eE')
caption = yt.captions['a.en']
print(caption.xml_captions)
And, as mentioned in the documentation:
yt = YouTube('http://youtube.com/watch?v=2lAe1cqCOXo')
caption = yt.captions.get_by_language_code('en')
caption.xml_captions
But in both cases I get XML output, and when I use
print(caption.generate_srt_captions())
I get the error below. Could you help me work out how to extract the captions in SRT format?
KeyError
~/anaconda3/envs/myenv/lib/python3.6/site-packages/pytube/captions.py in
generate_srt_captions(self)
49 recompiles them into the "SubRip Subtitle" format.
50 """
51 return self.xml_caption_to_srt(self.xml_captions)
52
53 @staticmethod
~/anaconda3/envs/myenv/lib/python3.6/site-packages/pytube/captions.py in
xml_caption_to_srt(self, xml_captions)
81 except KeyError:
82 duration = 0.0
83 start = float(child.attrib["start"])
84 end = start + duration
85 sequence_number = i + 1 # convert …
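
To make the result I am after concrete, here is a minimal sketch of the XML-to-SRT conversion, written independently of pytube. It assumes the caption XML is made of <text> elements carrying a "start" attribute (which the traceback shows being read) and an optional "dur" attribute (my assumption for where the duration comes from). The names format_timestamp and xml_to_srt are hypothetical and not part of pytube, and XML that uses a different layout would need different element and attribute names.

```python
import html
import xml.etree.ElementTree as ET


def format_timestamp(seconds):
    """Format a number of seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def xml_to_srt(xml_captions):
    """Convert caption XML to SRT text.

    Assumes <text start="..." dur="..."> elements; a missing "dur"
    is treated as a duration of 0.0 seconds.
    """
    root = ET.fromstring(xml_captions)
    segments = []
    for number, child in enumerate(root.iter("text"), start=1):
        start = float(child.attrib["start"])
        duration = float(child.attrib.get("dur", 0.0))
        # Collapse line breaks and decode any leftover HTML entities.
        text = html.unescape((child.text or "").replace("\n", " ")).strip()
        segments.append(
            f"{number}\n"
            f"{format_timestamp(start)} --> {format_timestamp(start + duration)}\n"
            f"{text}\n"
        )
    return "\n".join(segments).strip()


if __name__ == "__main__":
    # Hand-written sample in the assumed layout, not real YouTube output.
    sample = (
        '<transcript>'
        '<text start="0.5" dur="1.25">Hello</text>'
        '<text start="2">Bye</text>'
        '</transcript>'
    )
    print(xml_to_srt(sample))
```

With pytube, the input would be the string from caption.xml_captions, provided it follows the layout assumed above.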