How can I extract the upload date, title, URL, and duration of the YouTube videos in a playlist using youtube-dl?

Lod*_*Lod 6 windows youtube json python-3.x youtube-dl

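I am trying to extract the upload dates, titles, URLs, and durations of all YouTube videos in a specific playlist with youtube-dl. I don't need the videos themselves, only the data above.

So far I have tested the following two methods, suggested here by Alen Paul Varghese:

youtube-dl's GitHub documentation, used as a reference

Playlist URL used for testing

Method #1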

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.json

Method #2

youtube-dl --get-upload_date https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD > example.txt

Method #1 outputs the entire JSON dump, roughly 3000 lines per video, which is very inconvenient for large playlists, although it does contain the four required fields.

Method #2 returns the following error:

youtube-dl: error: no such option: --get-upload_date

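I would prefer method #2, which limits the output to the required data (upload dates, titles, URLs, durations). It follows Alen Paul Varghese's second suggestion, and I had checked in youtube-dl's GitHub documentation, used as a reference, that upload_date is a valid youtube-dl field.

Why is the upload_date option not recognized?

What alternatives are there for limiting the data?

I would greatly appreciate any helpful suggestions.

Here is the JSON dump file: example.json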

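EDIT (thanks to @pierpy's great guidance; the whole process is documented here, free of charge, in case it helps others):

I successfully installed Chocolatey NuGet from an Admin CMD, as required by the Windows section of the Download jq page, in order to install jq 1.5 with chocolatey install jq.

My Chocolatey NuGet installation output:
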
    Microsoft Windows [Version 10.0.19042.867]
    (c) 2020 Microsoft Corporation. All rights reserved.
    C:\WINDOWS\system32>@"%SystemRoot%\System32\WindowsPowerShell\v1.0\powershell.exe" -NoProfile -InputFormat None -ExecutionPolicy Bypass -Command "iex ((New-Object System.Net.WebClient).DownloadString('https://community.chocolatey.org/install.ps1'))" && SET "PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin"
    Forcing web requests to allow TLS v1.2 (Required for requests to Chocolatey.org)
    Getting latest version of the Chocolatey package for download.
    Not using proxy.
    Getting Chocolatey from https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15.
    Downloading https://community.chocolatey.org/api/v2/package/chocolatey/0.10.15 to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip
    Not using proxy.
    Extracting C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall\chocolatey.zip to C:\Users\###\AppData\Local\Temp\chocolatey\chocoInstall
    Installing Chocolatey on the local machine
    Creating ChocolateyInstall as an environment variable (targeting 'Machine')
      Setting ChocolateyInstall to 'C:\ProgramData\chocolatey'
    WARNING: It's very likely you will need to close and reopen your shell
      before you can use choco.
    Restricting write permissions to Administrators
    We are setting up the Chocolatey package repository.
    The packages themselves go to 'C:\ProgramData\chocolatey\lib'
      (i.e. C:\ProgramData\chocolatey\lib\yourPackageName).
    A shim file for the command line goes to 'C:\ProgramData\chocolatey\bin'
      and points to an executable in 'C:\ProgramData\chocolatey\lib\yourPackageName'.

    Creating Chocolatey folders if they do not already exist.

    WARNING: You can safely ignore errors related to missing log files when
      upgrading from a version of Chocolatey less than 0.9.9.
      'Batch file could not be found' is also safe to ignore.
      'The system cannot find the file specified' - also safe.
    chocolatey.nupkg file not installed in lib.
     Attempting to locate it from bootstrapper.
    PATH environment variable does not have C:\ProgramData\chocolatey\bin in it. Adding...
    WARNING: Not setting tab completion: Profile file does not exist at 'C:\Users\###\Documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1'.
    Chocolatey (choco.exe) is now ready.
    You can call choco from anywhere, command line or powershell by typing choco.
    Run choco /? for a list of functions.
    You may need to shut down and restart powershell and/or consoles
     first prior to using choco.
    Ensuring Chocolatey commands are on the path
    Ensuring chocolatey.nupkg is in the lib folder

    C:\WINDOWS\system32>

I then ran chocolatey install jq, and it installed successfully:

My jq installation output:

    C:\WINDOWS\system32>chocolatey install jq
    Chocolatey v0.10.15
    Installing the following packages:
    jq
    By installing you accept licenses for the packages.
    Progress: Downloading jq 1.6... 100%

    jq v1.6 [Approved]
    jq package files install completed. Performing other installation steps.
    The package jq wants to run 'chocolateyinstall.ps1'.
    Note: If you don't run this script, the installation will fail.
    Note: To confirm automatically next time, use '-y' or consider:
    choco feature enable -n allowGlobalConfirmation
    Do you want to run the script?([Y]es/[A]ll - yes to all/[N]o/[P]rint): Y

    Downloading jq 64 bit
      from 'https://github.com/stedolan/jq/releases/download/jq-1.6/jq-win64.exe'
    Progress: 100% - Completed download of C:\ProgramData\chocolatey\lib\jq\tools\jq.exe (3.36 MB).
    Download of jq.exe (3.36 MB) completed.
    Hashes match.
    C:\ProgramData\chocolatey\lib\jq\tools\jq.exe
     ShimGen has successfully created a shim for jq.exe
     The install of jq was successful.
      Software install location not explicitly set, could be in package or
      default install location if installer.

    Chocolatey installed 1/1 packages.
     See the log for details (C:\ProgramData\chocolatey\logs\chocolatey.log).

Then I ran your youtube-dl command, @pierpy:

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'

and got a syntax error, with the following output:

    Microsoft Windows [Version 10.0.19042.867]
    (c) 2020 Microsoft Corporation. All rights reserved.

    C:\Users\###>cd documents

    C:\Users\###\Documents>cd youtube-dl

    C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
    jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?) at <top-level>, line 1:
    '{date:
    jq: 1 compile error
    Traceback (most recent call last):
      File "__main__.py", line 19, in <module>
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 475, in main
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\__init__.py", line 465, in _real_main
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 2060, in download
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 799, in extract_info
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 838, in __extract_info
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 924, in process_ie_result
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1058, in __process_playlist
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1068, in __process_iterable_entry
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 910, in process_ie_result
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 872, in process_ie_result
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1683, in process_video_result
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1793, in process_info
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 1765, in __forced_printings
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 520, in to_stdout
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\YoutubeDL.py", line 509, in _write_string
      File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmpwt56m8wg\build\youtube_dl\utils.py", line 3180, in write_string
    OSError: [Errno 22] Invalid argument

    C:\Users\###\Documents\youtube-dl>

I then googled the error

jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?)

and found an insight in this suggestion:

It's all about quoting

Accordingly, I changed the single quotes in your youtube-dl command, @pierpy, to double quotes:

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"

Now it outputs the upload dates, titles, URLs, and durations as desired.
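Since the failure above was a shell quoting problem, one way to sidestep it altogether is to start both programs from Python with argument lists, so that no shell ever interprets the jq filter. The sketch below assumes youtube-dl and jq are both on PATH; build_commands and run_pipeline are hypothetical helper names introduced only for this illustration.

```python
import subprocess

PLAYLIST_URL = "https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD"
# The jq filter is an ordinary Python string, so its inner double quotes
# never pass through cmd or any other shell.
JQ_FILTER = '{"date": .upload_date, "title": .title, "URL": .url, "duration": .duration}'


def build_commands(playlist_url, jq_filter=JQ_FILTER):
    """Return the argument lists for youtube-dl and jq (hypothetical helper)."""
    ytdl_cmd = ["youtube-dl", "--skip-download", "--print-json", playlist_url]
    jq_cmd = ["jq", jq_filter]
    return ytdl_cmd, jq_cmd


def run_pipeline(playlist_url):
    """Pipe youtube-dl into jq without a shell; returns jq's exit code."""
    ytdl_cmd, jq_cmd = build_commands(playlist_url)
    ytdl = subprocess.Popen(ytdl_cmd, stdout=subprocess.PIPE)
    jq = subprocess.Popen(jq_cmd, stdin=ytdl.stdout)
    ytdl.stdout.close()  # so youtube-dl sees a closed pipe if jq exits early
    jq.wait()
    ytdl.wait()
    return jq.returncode
```

Calling run_pipeline(PLAYLIST_URL) should print the filtered objects to the console, provided both tools are installed.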

Final output:

    C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}"
    {
      "date": "20150717",
      "title": "3.1: Flow (setup and draw) - Processing Tutorial",
      "URL": "https://r1---sn-n0ogpnx-b85s.googlevideo.com/videoplayback?expire=1617730292&ei=lEZsYKDoEZmAp-oP3ayk8AI&ip=188.154.162.181&id=o-AHFxnOR5c5xqmgtu1JG4FbL6lJW0gz1pJQN77cr2-27T&itag=22&source=youtube&requiressl=yes&mh=m6&mm=31%2C29&mn=sn-n0ogpnx-b85s%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=1&pl=23&initcwndbps=1578750&vprv=1&mime=video%2Fmp4&ns=r3pR-nwt6FkDQa33iQQu-qgF&ratebypass=yes&dur=944.007&lmt=1607684088067796&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432434&n=3P6HQoLfY8ktFLG5&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAMiNOv8QDjfsn7yxicEOtSjcEYjZlX3CfrI8D-HGBd63AiEA4E6rKv_kYti6rAeieJzPAdTYjoh05Az_11Kcxt-0jBg%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRAIgD43F71OxMExfQyN9FeNWfZX_aiGAD3SKlKOLNR14NT8CICEuD_Ry0oymKZmFfHuP4F6v9MKCrmRI0x27sLG8fvyG",
      "duration": 944
    }
    {
      "date": "20150717",
      "title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
      "URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lEZsYMO2OczSWaPiueAC&ip=188.154.162.181&id=o-ANuT73vsKQLvQqynOeh00stVP-zqbq3x-iUrdDiYwg8E&itag=22&source=youtube&requiressl=yes&mh=kE&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=tPtC_l82gq-yi-rk_oQXatAF&cnr=14&ratebypass=yes&dur=814.207&lmt=1551720899437893&mt=1617708538&fvip=5&fexp=24001373%2C24007246&beids=9466585&c=WEB&txp=5432432&n=LhJHXWU8TGNOrD9u&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRAIgSHTlBPN0j49hoB02SYDeF3-9fe1iSz1KRiv9iFy8nj0CIHEafdAOBefsos8kO5FGhDljsKpOV7ZQ9dY1BEzQQ0n0&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAJkd-9posqapJekca_35YNG0g3nLgxTfW06EqRM-a3wDAiEApSrsS5wPlMPXjlI_bvOh53cjxlrHfNSKD4XbhyDyZ6w%3D",
      "duration": 815
    }
    {
      "date": "20150717",
      "title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
      "URL": "https://r4---sn-n0ogpnx-b85l.googlevideo.com/videoplayback?expire=1617730293&ei=lUZsYK6WJ4TeWaeflbgF&ip=188.154.162.181&id=o-AD1WgS46WiFogy00v3aHRp6aZXkd_ACN-_m76lPoQvA8&itag=22&source=youtube&requiressl=yes&mh=it&mm=31%2C29&mn=sn-n0ogpnx-b85l%2Csn-1gieen7e&ms=au%2Crdu&mv=m&mvi=4&pl=23&initcwndbps=1617500&vprv=1&mime=video%2Fmp4&ns=AlyS4uv2BH5ENfp_nP53I-sF&cnr=14&ratebypass=yes&dur=441.225&lmt=1472343659978757&mt=1617708538&fvip=4&fexp=24001373%2C24007246&beids=9466585&c=WEB&n=np6rmmeSKhYEvG1K&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Cmime%2Cns%2Ccnr%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRgIhAIRmvxmY-VidN3LPhnzCNQ2TLsUB_7i1yU0QOMBVUS6AAiEAm9DE-Kk6cCNb8FC0we4c2O8299n2_2jGnQfzYzz0igo%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRQIgZzrGEwMcb0Vrj9FleanW2apPMu_55OdH2SRdw66DQ1QCIQCDsAz7X5RxczKtWzokBhyUNcyXLXeZF-ENufpjA0BP2Q%3D%3D",
      "duration": 442
    }

    C:\Users\###\Documents\youtube-dl>

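Last issue:

The retrieved URLs are not the standard video URLs. Why not?

youtube-dl's GitHub documentation, used as a reference, states:

url (string): Video URL

How can I retrieve the standard YouTube video URL?

Answer to the last issue:

I just looked at the example.json file generated yesterday and found that the standard YouTube video URL is stored under webpage_url rather than url.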

Final youtube-dl output:

    C:\Users\###\Documents\youtube-dl>youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}"
    {
      "date": "20150717",
      "title": "3.1: Flow (setup and draw) - Processing Tutorial",
      "URL": "https://www.youtube.com/watch?v=o8dffrZ86gs",
      "duration": 944
    }
    {
      "date": "20150717",
      "title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
      "URL": "https://www.youtube.com/watch?v=ibW4oA7-n8I",
      "duration": 815
    }
    {
      "date": "20150717",
      "title": "3.3: Events (mousePressed, keyPressed) - Processing Tutorial",
      "URL": "https://www.youtube.com/watch?v=UvSjtiW-RH8",
      "duration": 442
    }

    C:\Users\###\Documents\youtube-dl>
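In this output the date is a compact YYYYMMDD string and the duration is a number of seconds. If a more readable form is wanted, a small post-processing step along these lines could be applied to each object; format_entry is a hypothetical helper, and the sketch assumes every entry has both fields filled in.

```python
from datetime import datetime, timedelta


def format_entry(entry):
    """Return a copy of entry with an ISO date and an H:MM:SS duration."""
    # "20150717" -> "2015-07-17"
    iso_date = datetime.strptime(entry["date"], "%Y%m%d").date().isoformat()
    # 944 (seconds) -> "0:15:44"
    readable = str(timedelta(seconds=entry["duration"]))
    return {**entry, "date": iso_date, "duration": readable}


entry = {
    "date": "20150717",
    "title": "3.1: Flow (setup and draw) - Processing Tutorial",
    "URL": "https://www.youtube.com/watch?v=o8dffrZ86gs",
    "duration": 944,
}
print(format_entry(entry))
```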

To save the final output to a JSON file:

youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq "{"date": .upload_date,"title": .title,"URL": .webpage_url,"duration": .duration}" > example.json
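One thing to keep in mind: jq prints one pretty-printed object after another, so the redirected example.json is a sequence of JSON objects rather than a single JSON document, and a plain json.load on it would fail. A minimal sketch of reading such a file and turning it into CSV is shown below; iter_json_objects and to_csv are hypothetical helper names, and the field names are the ones used in the jq filter above.

```python
import csv
import io
import json

FIELDS = ["date", "title", "URL", "duration"]


def iter_json_objects(text):
    """Yield each JSON value from a string of concatenated JSON values."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def to_csv(text):
    """Convert concatenated JSON objects into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for obj in iter_json_objects(text):
        writer.writerow(obj)
    return out.getvalue()


sample = """{
  "date": "20150717",
  "title": "3.1: Flow (setup and draw) - Processing Tutorial",
  "URL": "https://www.youtube.com/watch?v=o8dffrZ86gs",
  "duration": 944
}
{
  "date": "20150717",
  "title": "3.2: Built-in Variables (mouseX, mouseY) - Processing Tutorial",
  "URL": "https://www.youtube.com/watch?v=ibW4oA7-n8I",
  "duration": 815
}
"""
print(to_csv(sample))
```

With a real file, the text would come from reading example.json instead of the inline sample.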

pie*_*rpy 5

You need to filter the output with a handy tool such as jq.
Paste this command line:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq '{"date": .upload_date,"title": .title,"URL": .url,"duration": .duration}'
You can get jq from https://stedolan.github.io/jq/download/
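If installing jq is not an option, the same filtering can be sketched in a few lines of Python, since --print-json emits one JSON object per line for each video. This is only an illustration: pick_fields is a hypothetical helper, and the sample line is a shortened stand-in for real youtube-dl output with a placeholder URL.

```python
import json


def pick_fields(lines):
    """Yield a small dict with the four wanted fields per JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        info = json.loads(line)
        yield {
            "date": info.get("upload_date"),
            "title": info.get("title"),
            "URL": info.get("url"),
            "duration": info.get("duration"),
        }


# Shortened stand-in for one line of youtube-dl --print-json output.
sample_lines = [
    '{"id": "o8dffrZ86gs", "upload_date": "20150717", '
    '"title": "3.1: Flow (setup and draw) - Processing Tutorial", '
    '"url": "https://example.invalid/videoplayback", "duration": 944}',
]
for row in pick_fields(sample_lines):
    print(json.dumps(row, indent=2))
```

In real use the function would be fed sys.stdin, with youtube-dl piped into the script in place of jq.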

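Update

The key "webpage_url" holds the standard YouTube URL, if that is what you need. To get the full list of available keys, run:
youtube-dl --skip-download --print-json https://www.youtube.com/playlist?list=PLRqwX-V7Uu6by61pbhdvyEpIeymlmnXzD | jq keys
This prints the full key names found in the original JSON.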
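For comparison, a rough Python counterpart of jq keys is sketched below. list_keys is a hypothetical helper, and the sample line is a shortened stand-in for one line of real --print-json output, which contains many more keys.

```python
import json


def list_keys(json_line):
    """Return the sorted top-level key names of one JSON object."""
    return sorted(json.loads(json_line).keys())


# Shortened stand-in for one line of youtube-dl --print-json output.
sample_line = (
    '{"upload_date": "20150717", "title": "3.1: Flow (setup and draw) - '
    'Processing Tutorial", "webpage_url": '
    '"https://www.youtube.com/watch?v=o8dffrZ86gs", "duration": 944}'
)
for key in list_keys(sample_line):
    print(key)
```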