How to extract all YouTube comments using the YouTube API? (Python)

Question

Sub*_*ham 7 python youtube list youtube-api

Suppose I have a video_id whose video has 8487 comments.
This code returns only 4309 comments.

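from googleapiclient.discovery import build

def get_comments(youtube, video_id, comments=None, token=''):
    # Use None instead of a mutable default so repeated calls start with an empty list.
    if comments is None:
        comments = []

    # Fetch one page of top-level comment threads.
    video_response = youtube.commentThreads().list(part='snippet',
                                                   videoId=video_id,
                                                   pageToken=token).execute()
    for item in video_response['items']:
        comment = item['snippet']['topLevelComment']
        text = comment['snippet']['textDisplay']
        comments.append(text)

    # Recurse while the API reports another page.
    if "nextPageToken" in video_response:
        return get_comments(youtube, video_id, comments, video_response['nextPageToken'])
    else:
        return comments

# api_key and video_id are assumed to be defined elsewhere.
youtube = build('youtube', 'v3', developerKey=api_key)
comment_threads = get_comments(youtube, video_id)
print(len(comment_threads))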

> 4309

How can I extract all 8487 comments?

Answer 1

Mar*_*yes 5

According to the answer on commentThreads, you have to add replies to the part parameter in order to retrieve the replies the comments might have.

So your request should look like this:

video_response=youtube.commentThreads().list(part='id,snippet,replies',
                                               videoId=video_id,
                                               pageToken=token).execute()

Then, modify your code accordingly to read the replies of the comments.

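In this example, which I made using the try-it feature available in the documentation, you can check that the response contains the top-level comments and their replies.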
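
To make "read the replies" concrete, here is a minimal sketch of pulling both the top-level text and any embedded replies out of one response page. It assumes the request used part='id,snippet,replies'; the function name texts_from_thread_page and the sample dictionary are made up for illustration and are not real API output. As far as I know, the documentation says the embedded replies list can be only a subset of a thread's replies, so this alone may still undercount.

```python
def texts_from_thread_page(response):
    """Collect top-level comment texts plus any replies embedded in one page."""
    texts = []
    for item in response.get("items", []):
        top = item["snippet"]["topLevelComment"]
        texts.append(top["snippet"]["textDisplay"])
        # "replies" is only present when the thread has replies and the
        # request asked for the replies part.
        for reply in item.get("replies", {}).get("comments", []):
            texts.append(reply["snippet"]["textDisplay"])
    return texts


# Hand-written sample shaped like a commentThreads.list page (not real data).
sample_page = {
    "items": [
        {
            "snippet": {
                "totalReplyCount": 1,
                "topLevelComment": {"id": "c1", "snippet": {"textDisplay": "Great video"}},
            },
            "replies": {"comments": [{"id": "c1.r1", "snippet": {"textDisplay": "Thanks!"}}]},
        },
        {
            "snippet": {
                "totalReplyCount": 0,
                "topLevelComment": {"id": "c2", "snippet": {"textDisplay": "Nice"}},
            },
        },
    ]
}

print(texts_from_thread_page(sample_page))  # ['Great video', 'Thanks!', 'Nice']
```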

Edit (08/04/2022):

Create a new variable that holds the totalReplyCount a topLevelComment might have.

Something like:

def get_comments(youtube, video_id, comments=None, token=''):
    # Use None instead of a mutable default so repeated calls start with an empty list.
    if comments is None:
        comments = []

    # Stores the total reply count a top-level comment has.
    totalReplyCount = 0

    # Replies the top-level comment might have.
    replies = []

    video_response = youtube.commentThreads().list(part='snippet',
                                                   videoId=video_id,
                                                   pageToken=token).execute()
    for item in video_response['items']:
        comment = item['snippet']['topLevelComment']
        text = comment['snippet']['textDisplay']
        comments.append(text)

        # Get the total reply count:
        totalReplyCount = item['snippet']['totalReplyCount']

        # Check if the total reply count is greater than zero;
        # if so, call the new function "getAllTopLevelCommentReplies(topCommentId, replies, token)"
        # and extend the "comments" returned list.
        if totalReplyCount > 0:
            comments.extend(getAllTopLevelCommentReplies(comment['id'], replies, None))

            # Reset the list so this thread's replies are not added again for the next thread.
            replies = []

    if "nextPageToken" in video_response:
        return get_comments(youtube, video_id, comments, video_response['nextPageToken'])
    else:
        return comments

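Then, if the value of totalReplyCount is greater than zero, make another call using comments.list to bring the replies of the top-level comment. For this new call, you have to pass the ID of the top-level comment.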

Example (untested):

# Returns all replies the top-level comment has:
# topCommentId = the id of the top-level comment whose replies you want to retrieve.
# replies = list of replies returned by this function.
# token = a comment may have more than 100 replies; if so, use the nextPageToken to retrieve the next batch of results.
# Note: this relies on the module-level "youtube" client built earlier.
def getAllTopLevelCommentReplies(topCommentId, replies, token):
    replies_response = youtube.comments().list(part='snippet',
                                               maxResults=100,
                                               parentId=topCommentId,
                                               pageToken=token).execute()

    for item in replies_response['items']:
        # Append the reply's text to the list.
        replies.append(item['snippet']['textDisplay'])

    if "nextPageToken" in replies_response:
        return getAllTopLevelCommentReplies(topCommentId, replies, replies_response['nextPageToken'])
    else:
        return replies
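
The two recursive functions above can also be written as plain loops, which avoids deep recursion on videos with many pages. This is only a sketch under assumptions: fetch_all_items and get_all_comment_texts are names I made up, and youtube is assumed to be the client returned by build('youtube', 'v3', developerKey=api_key). The only API calls used are commentThreads().list and comments().list, with the same parameters as above plus maxResults=100.

```python
def fetch_all_items(list_method, **params):
    """Call a paginated list method until no nextPageToken is returned."""
    items = []
    token = None
    while True:
        if token:
            params["pageToken"] = token
        response = list_method(**params).execute()
        items.extend(response.get("items", []))
        token = response.get("nextPageToken")
        if not token:
            return items


def get_all_comment_texts(youtube, video_id):
    """Return the text of every top-level comment and every reply the API yields."""
    texts = []
    threads = fetch_all_items(
        youtube.commentThreads().list,
        part="snippet", videoId=video_id, maxResults=100,
    )
    for thread in threads:
        top = thread["snippet"]["topLevelComment"]
        texts.append(top["snippet"]["textDisplay"])
        # Only spend an extra request on threads that actually have replies.
        if thread["snippet"]["totalReplyCount"] > 0:
            replies = fetch_all_items(
                youtube.comments().list,
                part="snippet", parentId=top["id"], maxResults=100,
            )
            texts.extend(r["snippet"]["textDisplay"] for r in replies)
    return texts
```

With a real client you would call get_all_comment_texts(youtube, video_id) and compare len() of the result with the count you expect. Note that every thread with replies costs at least one extra request, so quota usage grows with the number of threads.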

Edit (11/04/2022):

I've added a Google Colab example modified from your code. It works with my sample video (ouf0ozwnU84), where it returned the video's 130 comments; however, with your sample video (BaGgScV4NN8), I got 3300 of 3359 comments.

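This could be because some comments are pending approval/moderation, because of something else I'm missing, because some comments are too old and need additional filters, or because the API has bugs. See here for some other questions about problems people have faced with the API's pagination. I suggest you check this tutorial, which shows the code and which you can modify.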
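
One way to narrow down where the missing comments go is to add up what the threads themselves claim: one top-level comment plus totalReplyCount for each thread. If that sum already falls short of the number shown on the video page, the gap is on the API side rather than in the reply-fetching code. This is a rough sketch; expected_total is a made-up name, and thread_items is assumed to be the list of all 'items' collected from the commentThreads.list pages.

```python
def expected_total(thread_items):
    """Sum one top-level comment plus its reported reply count for every thread."""
    return sum(
        1 + item["snippet"].get("totalReplyCount", 0)
        for item in thread_items
    )


# Hand-written sample (not real data): one thread with 2 replies, one with none.
sample_threads = [
    {"snippet": {"totalReplyCount": 2}},
    {"snippet": {"totalReplyCount": 0}},
]

print(expected_total(sample_threads))  # 4
```

Comparing this number with len() of the list you actually collected shows whether replies were dropped during fetching or were never reported by the API in the first place.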
