Extracting the number of results from a Google search

Question

Ros*_*e A 0 python beautifulsoup web-scraping

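I'm writing a web scraper to extract the number of search results that Google shows in the top-left corner of the results page. I wrote the code below, but I don't understand why phrase_extract is None. I want to extract the phrase "About 12,010,000,000 results". Where did I go wrong? Could the HTML be parsed incorrectly?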

import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):   
    address='http://www.google.com/#q='
    newword=address+word
    #webbrowser.open(newword)
    page=requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    phrase_extract=soup.find(id="resultStats")
    print(phrase_extract)

pyGoogleSearch('world')


Answer 1

wpe*_*rcy 5

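You're using the wrong URL to query Google's search engine. In http://www.google.com/#q=<query>, everything after the # is a URL fragment, which is never sent to the server, so the request only fetches the Google home page and the search never runs. You should use http://www.google.com/search?q=<query> instead.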
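
To see the difference between the two URLs without making any network request, here is a small sketch using the standard library's urllib.parse. The helper name describe_url is made up for this illustration and is not part of the question's code.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Return the path, query, and fragment of a URL (hypothetical helper)."""
    parts = urlsplit(url)
    return {"path": parts.path, "query": parts.query, "fragment": parts.fragment}

# URL from the question: the search term sits in the fragment (after '#'),
# which HTTP clients keep to themselves and do not send to the server.
original = describe_url("http://www.google.com/#q=world")

# Corrected URL: the search term is in the query string (after '?').
corrected = describe_url("http://www.google.com/search?q=world")

print(original)
print(corrected)
```

With the first URL the query string is empty and q=world only appears as the fragment; with the second, q=world is part of the query string that actually reaches the server.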

So the function would look like this:

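import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):
    # Use the /search endpoint; the query goes in the ?q= parameter
    address = 'http://www.google.com/search?q='
    newword = address + word
    page = requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    # find() returns None if no element with this id is present
    phrase_extract = soup.find(id="resultStats")
    print(phrase_extract)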

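Concatenating the search term directly onto the URL works for a single word like 'world', but a term containing spaces or characters such as & would need to be percent-encoded first. A minimal sketch of one way to do that with the standard library follows; build_search_url is a hypothetical name, not something from the original answer.

```python
from urllib.parse import urlencode

def build_search_url(word):
    """Build the search URL with the query term percent-encoded (hypothetical helper)."""
    base = "http://www.google.com/search"
    # urlencode escapes spaces, '&', '#', and other reserved characters
    return base + "?" + urlencode({"q": word})

print(build_search_url("world"))
print(build_search_url("hello world & more"))
```

An alternative that should behave equivalently is to let requests do the encoding by passing the term separately, as in requests.get('http://www.google.com/search', params={'q': word}).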

You probably also want just the text of the element rather than the element itself, so you could do something like:

phrase_text = phrase_extract.text


Or, to get the actual value as an integer:

val = int(phrase_extract.text.split(' ')[1].replace(',',''))

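The one-liner above relies on the number being the second space-separated word and on the element having been found. If either assumption might not hold, a slightly more defensive variant could look like the sketch below. The function name parse_result_count is made up here, and it assumes the count uses commas as thousands separators, as in "About 12,010,000,000 results".

```python
import re

def parse_result_count(text):
    """Return the first number in text as an int, or None if there is none.

    Hypothetical helper; assumes commas are the thousands separator.
    """
    if not text:
        return None
    # Match a digit followed by any run of digits and commas
    match = re.search(r"\d[\d,]*", text)
    if match is None:
        return None
    return int(match.group().replace(",", ""))

print(parse_result_count("About 12,010,000,000 results"))
```

It could be called as parse_result_count(phrase_extract.text) once phrase_extract has been checked for None.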

Archived:	7 years ago
Views:	5625
Last recorded:	3 years, 7 months ago