Scrape YouTube videos from a specific channel and search them?

Kaz*_*z25 4 python youtube beautifulsoup web-scraping

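I'm using this code to get the video URLs of a YouTube channel, and it works fine, but I'd like to add an option to search the channel for a video with a specific title, and get the URL of the first video that matches the search phrase.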

from bs4 import BeautifulSoup
import requests

url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")

for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])


Pey*_*idi 8

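My previous answer gets you all the video titles in a given YouTube channel, which is what you asked for. But in our comment exchange you told me you want to run the script via a cron job, which takes a bit more work, so I've added another answer.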

from bs4 import BeautifulSoup
from lxml import etree
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request
import requests
import sys

def fetch_titles(url):
    video_titles = []
    html = requests.get(url)
    soup = BeautifulSoup(html.text, "lxml")
    for entry in soup.find_all("entry"):
        for link in entry.find_all("link"):
            youtube = etree.HTML(urllib.request.urlopen(link["href"]).read()) 
            video_title = youtube.xpath("//span[@id='eow-title']/@title") 
            if len(video_title)>0:
                video_titles.append({"title":video_title[0], "url":link.attrs["href"]})
    return video_titles

def main():
    if len(sys.argv) == 1:
        print("Error: you should specify a keyword")
        print("eg: python3 ./main.py KEYWORD")
        return

    url="https://www.youtube.com/feeds/videos.xml?user=LinusTechTips"
    keyword = sys.argv[1]

    video_titles = fetch_titles(url)
    for video in video_titles:
        if keyword in video["title"]:
            print(video["url"])
            break  # stop after the first match; remove this line to print every match


if __name__ == "__main__":
    main()

When you invoke the script from the terminal, you should specify the keyword, like this:

$ python3 ./main.py Mac

Here, Mac is the keyword and main.py is the name of the Python script file.

Output:

https://www.youtube.com/watch?v=l_IHSRPVqwQ
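
As a side note, the script above downloads every video page just to read its title. Since Atom entries carry a title element of their own, the search could probably be done on the feed alone. The following is only a minimal sketch under that assumption, using the standard library's xml.etree.ElementTree instead of BeautifulSoup; find_first_video, SAMPLE_FEED, and the EXAMPLE_ID_n URLs are hypothetical names and data made up for illustration, and the match is case-insensitive, which differs from the script above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom format, which the videos.xml feed uses.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def find_first_video(feed_xml, keyword):
    """Return the link of the first entry whose title contains keyword, else None.

    Hypothetical helper: assumes each <entry> has <title> and <link> children.
    """
    root = ET.fromstring(feed_xml)
    needle = keyword.casefold()  # case-insensitive comparison
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        link = entry.find("atom:link", ATOM)
        if link is not None and needle in title.casefold():
            return link.get("href")
    return None


# Made-up sample feed so the sketch runs without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Building a gaming PC</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_1"/>
  </entry>
  <entry>
    <title>Reviewing the new Mac</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_2"/>
  </entry>
  <entry>
    <title>Another Mac video</title>
    <link rel="alternate" href="https://www.youtube.com/watch?v=EXAMPLE_ID_3"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(find_first_video(SAMPLE_FEED, "mac"))
```

To try it against the real feed, you could pass requests.get(url).content in place of SAMPLE_FEED. Keep in mind the feed only lists a channel's most recent uploads, so older videos would not be found this way.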