Related troubleshooting solutions (0)

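Adding headers to the Python requests module

Previously I used the httplib module to add a header to a request. Now I am trying to do the same thing with the requests module.

This is the Python requests module I am using: http://pypi.python.org/pypi/requests

How can I add a header to request.post and request.get? Say I have to add a foobar key to the headers of every request.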
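
A minimal sketch of one way to do this, assuming the header is named `foobar` as in the question and using a placeholder value and URL (both hypothetical): `requests` accepts a `headers=` dict per call, and a `Session` can carry the same header on every request it sends. The requests are only prepared here, not sent, so the headers can be inspected without network access.

```python
import requests

# Hypothetical header value and URL, used only for illustration.
EXTRA_HEADERS = {"foobar": "some-value"}
URL = "https://example.com/"

# Per request: pass a dict through headers=. The same keyword works for
# requests.get(URL, headers=...) and requests.post(URL, headers=...).
prepared = requests.Request("GET", URL, headers=EXTRA_HEADERS).prepare()

# Every request: set the header once on a Session; it is merged into
# each request the session prepares or sends.
session = requests.Session()
session.headers.update(EXTRA_HEADERS)
session_prepared = session.prepare_request(
    requests.Request("POST", URL, data={"key": "value"})
)

print(prepared.headers["foobar"])
print(session_prepared.headers["foobar"])
```

With a session, `session.get(URL)` and `session.post(URL, data=...)` would send the header without repeating it at each call site.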

python http-headers python-requests

83
Votes
2
Answers
130k
Views

Remote end closed connection without response

I am trying to get the HTML source of a web page using the following code:

import requests
url = "https://dictionary.cambridge.org/us/dictionary/english-arabic/hi"
r = requests.get(url)

However, I get the following error:

Traceback (most recent call last):
  File "/home/username/ak_env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/username/ak_env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/home/username/ak_env/lib/python3.8/site-packages/urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.8/http/client.py", line 1347, in getresponse
    response.begin()
  File "/usr/lib/python3.8/http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.8/http/client.py", line 276, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: …
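
One possible cause, stated here as an assumption and not a confirmed diagnosis for this site, is that the server drops connections from clients whose `User-Agent` identifies a script. The sketch below sends a browser-like `User-Agent` and catches the connection error; the header value and the `fetch_html` helper are hypothetical.

```python
import requests

# Assumed browser-like User-Agent string; the exact value is illustrative.
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"}


def fetch_html(url, timeout=10):
    """Return the page HTML, or None if the connection fails.

    HTTP error statuses (4xx/5xx) are not caught and propagate as
    requests.exceptions.HTTPError.
    """
    try:
        response = requests.get(url, headers=BROWSER_HEADERS, timeout=timeout)
        response.raise_for_status()
        return response.text
    except requests.exceptions.ConnectionError as exc:
        # http.client.RemoteDisconnected reaches the caller wrapped in
        # requests.exceptions.ConnectionError.
        print(f"Connection failed: {exc}")
        return None
```

Calling `fetch_html(url)` with the URL from the question would show whether the header changes the outcome; if the error persists, the cause lies elsewhere.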

python python-requests

11
Votes
1
Answers
50k
Views

How to add headers to a request

Is there any other elegant way to add a header to a request? This:

import requests

requests.get(url,headers={'Authorization', 'GoogleLogin auth=%s' % authorization_token}) 

does not work, whereas urllib2 works:

import urllib2

request = urllib2.Request('http://maps.google.com/maps/feeds/maps/default/full')
request.add_header('Authorization', 'GoogleLogin auth=%s' % authorization_token)
urllib2.urlopen(request).read()
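
A likely reason the `requests` call fails: `{'Authorization', '...'}` with a comma is a set literal, while `headers=` expects a mapping of header names to values. The sketch below uses a colon to build a dict; the token value is a hypothetical placeholder, and the request is only prepared, not sent.

```python
import requests

authorization_token = "TOKEN"  # hypothetical placeholder, not a real token
url = "http://maps.google.com/maps/feeds/maps/default/full"

# A colon between key and value makes this a dict; a comma would make a set.
headers = {"Authorization": "GoogleLogin auth=%s" % authorization_token}

# Prepare the request so the resulting headers can be inspected offline.
# requests.get(url, headers=headers) would send the same header.
prepared = requests.Request("GET", url, headers=headers).prepare()
print(prepared.headers["Authorization"])
```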

python python-requests

6
Votes
2
Answers
20k
Views

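Extracting href from <a> with Beautiful Soup

I am trying to extract links from Google search results. Inspect Element tells me that the part I am interested in has "class = r". The first result looks like this: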

<h3 class="r" original_target="https://en.wikipedia.org/wiki/chocolate" style="display: inline-block;">
    <a href="https://en.wikipedia.org/wiki/Chocolate" 
       ping="/url?sa=t&amp;source=web&amp;rct=j&amp;url=https://en.wikipedia.org/wiki/Chocolate&amp;ved=0ahUKEwjW6tTC8LXZAhXDjpQKHSXSClIQFgheMAM" 
       saprocessedanchor="true">
        Chocolate - Wikipedia
    </a>
</h3>

To extract the "href" I do:

import bs4, requests
res = requests.get('https://www.google.com/search?q=chocolate')
googleSoup = bs4.BeautifulSoup(res.text, "html.parser")
elements= googleSoup.select(".r a")
elements[0].get("href")

But unexpectedly I get:

'/url?q=https://en.wikipedia.org/wiki/Chocolate&sa=U&ved=0ahUKEwjHjrmc_7XZAhUME5QKHSOCAW8QFggWMAA&usg=AOvVaw03f1l4EU9fYd'

whereas I want:

"https://en.wikipedia.org/wiki/Chocolate"

The "ping" attribute seems to be causing the confusion. Any ideas?
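
The returned value looks like a redirect link of the form `/url?q=<target>&...`, which suggests the HTML fetched by `requests` differs from what the browser inspector shows; the `ping` attribute is probably unrelated. Under that assumption, the target can be recovered from the `q` query parameter with the standard library. The `unwrap_google_href` helper is hypothetical, and the sample value is the one from the question.

```python
from urllib.parse import parse_qs, urlparse


def unwrap_google_href(href):
    """Return the target of a '/url?q=...' redirect link, else href unchanged."""
    parsed = urlparse(href)
    if parsed.path == "/url":
        targets = parse_qs(parsed.query).get("q")
        if targets:
            return targets[0]
    return href


href = (
    "/url?q=https://en.wikipedia.org/wiki/Chocolate&sa=U"
    "&ved=0ahUKEwjHjrmc_7XZAhUME5QKHSOCAW8QFggWMAA&usg=AOvVaw03f1l4EU9fYd"
)
print(unwrap_google_href(href))  # → https://en.wikipedia.org/wiki/Chocolate
```

In the question's code this would be applied as `unwrap_google_href(elements[0].get("href"))`.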

python beautifulsoup

4
Votes
1
Answers
4129
Views