相关疑难解决方法(0)

从Google搜索结果中抓取数据是否可以？

我想使用curl从Google获取结果,以检测潜在的重复内容.是否存在被Google禁止的高风险？

web-scraping

ML_*_*ML_

lucky-day

60
推荐指数

3
解决办法

9万
查看次数

在shell脚本中获取第一个Google搜索结果的网址

使用脚本语言解析AJAX API的输出相对容易:

#!/usr/bin/env python

import urllib
import json

base = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'
query = urllib.urlencode({'q' : "something"})
response = urllib.urlopen(base + query).read()
data = json.loads(response)
print data['responseData']['results'][0]['url']

Run Code Online (Sandbox Code Playgroud)

但有没有更好的方法来做类似的基本shell脚本？如果你只是卷曲了API页面,你应该如何编码URL参数或解析JSON？

bash

Lri*_*Lri

2012 06-29

9
推荐指数

2
解决办法

2万
查看次数