7 python beautifulsoup python-requests
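I'm trying to write a script that scrapes the first link from a Google search and returns only that link, so that I can run a search from the terminal and see the link next to the search term. I'm having trouble getting just the first result. This is the closest I've gotten so far.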
import requests
from bs4 import BeautifulSoup

research_later = "hiya"
goog_search = "https://www.google.co.uk/search?sclient=psy-ab&client=ubuntu&hs=k5b&channel=fs&biw=1366&bih=648&noj=1&q=" + research_later

r = requests.get(goog_search)
soup = BeautifulSoup(r.text, "html.parser")

# Prints every link on the page, not just the first result
for link in soup.find_all('a'):
    # get('href') returns None for anchors without an href attribute
    print(research_later, ":", link.get('href'))
Kev*_*uan 10
It looks like Google uses the cite tag to hold the link, so we can use soup.find('cite').text like this:
import requests
from bs4 import BeautifulSoup

research_later = "hiya"
goog_search = "https://www.google.co.uk/search?sclient=psy-ab&client=ubuntu&hs=k5b&channel=fs&biw=1366&bih=648&noj=1&q=" + research_later

r = requests.get(goog_search)
soup = BeautifulSoup(r.text, "html.parser")

# find() returns only the first matching <cite> element
print(soup.find('cite').text)
The output is:
www.urbandictionary.com/define.php?term=hiya
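Note that find returns None when the page contains no cite element, in which case .text raises an AttributeError. Below is a minimal sketch of a guarded version, run on a made-up HTML string rather than a live Google response. The helper name first_cite_text and the sample markup are hypothetical, not part of the original answer.

```python
from bs4 import BeautifulSoup


def first_cite_text(html):
    """Return the text of the first <cite> element, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    cite = soup.find("cite")  # find() returns None when nothing matches
    return cite.get_text(strip=True) if cite is not None else None


# Made-up markup for illustration only; real search pages look different.
sample = "<div><cite>www.example.com/a</cite><cite>www.example.com/b</cite></div>"
print(first_cite_text(sample))
print(first_cite_text("<p>no results</p>"))
```

With a live request, r.text would be passed in place of sample.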
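Appending the search term directly to the URL only works for terms without spaces or reserved characters. A hedged sketch of encoding the term first, using only the standard library, might look like this. The helper name build_search_url is hypothetical, and the extra query parameters from the original URL are omitted for brevity.

```python
from urllib.parse import urlencode


def build_search_url(term, base="https://www.google.co.uk/search"):
    """Build a search URL with the term safely encoded in the q parameter."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'
    return base + "?" + urlencode({"q": term})


print(build_search_url("hiya there"))
```

requests.get also accepts a params dictionary (for example requests.get(base, params={"q": term})) and performs the same encoding itself.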