Tag: httplib

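When I check a website with Python requests, will I know if the site redirects me to another page?

I mean, if I go to "www.yahoo.com/thispage" and Yahoo has set up a filter that redirects /thispage to /thatpage, then anyone who visits /thispage lands on /thatpage.

If I use httplib/requests/urllib, will it know there was a redirect? What about error pages? Some sites redirect users to /errorpage whenever a page is not found.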
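
One way to check this, sketched with Python 3's standard library rather than the Python 2 httplib named in the tags: `urllib.request` follows redirects automatically, and the response's `geturl()` reports the URL that was finally fetched, so comparing it with the requested URL reveals a redirect. The local test server, its /thispage and /thatpage paths, and the helper names are assumptions made for illustration only. With the third-party requests library, `Response.history` plays a similar role.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /thispage redirects to /thatpage."""

    def do_GET(self):
        if self.path == "/thispage":
            self.send_response(302)
            self.send_header("Location", "/thatpage")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"landed on " + self.path.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_with_redirect_info(url):
    """Return (final_url, was_redirected, status) for a GET request."""
    # An empty ProxyHandler bypasses any proxy configured in the environment,
    # so the local demo server is reached directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        final_url = response.geturl()
        return final_url, final_url != url, response.status


def demo():
    """Start the test server on a free port and request /thispage."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_address[1]
        return fetch_with_redirect_info(base + "/thispage")
    finally:
        server.shutdown()
        server.server_close()
```

A redirect to an error page would show up the same way (the final URL differs from the requested one), whereas a plain 404 response makes `urllib.request` raise `urllib.error.HTTPError`.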

python httplib python-requests

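13
votes

2
answers

9956
views

Debugging a connection with urllib2 + httplib.debuglevel sometimes shows no debug info

While trying to get a login script working, I keep getting the same login page back, so I turned on debugging of the HTTP stream (because of HTTPS, I cannot use Wireshark and the like).

I got nothing, so I copied this example, and it works. Any query to google.com works, but my target page shows no debug output. What is the difference? If it were a redirect, I would expect to see the first GET/redirect headers, as with the http://google redirect.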

import urllib
import urllib2
import pdb

h=urllib2.HTTPHandler(debuglevel=1)
opener = urllib2.build_opener(h)
urllib2.install_opener(opener)
print '================================'
data = urllib2.urlopen('http://google.com').read()
print '================================'
data = urllib2.urlopen('https://google.com').read()
print '================================'
data = urllib2.urlopen('https://members.poolplayers.com/default.aspx').read()
print '================================'
data = urllib2.urlopen('https://google.com').read()

When I run it, I get this:

$ python ex.py 
================================
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: google.com\r\nConnection: close\r\nUser-Agent: Python-urllib/2.7\r\n\r\n'
reply: 'HTTP/1.1 301 Moved Permanently\r\n'
header: Location: http://www.google.com/
header: Content-Type: text/html; charset=UTF-8
header: Date: Sat, 02 Jul 2011 16:20:11 GMT
header: Expires: Mon, 01 Aug 2011 16:20:11 GMT
header: Cache-Control: public, …
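
A possible explanation, offered as an assumption because the output above is truncated: the script installs only an `HTTPHandler` with `debuglevel=1`, while https URLs are served by a separate `HTTPSHandler` that keeps its own debug level. The sketch below uses Python 3's `urllib.request` (the successor of urllib2; Python 2 has `urllib2.HTTPSHandler` as well) and sets the level on both handlers. `build_debug_opener` is a hypothetical helper name.

```python
import urllib.request


def build_debug_opener(level=1):
    """Return an opener that prints the HTTP exchange for http and https URLs."""
    http_handler = urllib.request.HTTPHandler(debuglevel=level)
    https_handler = urllib.request.HTTPSHandler(debuglevel=level)
    # Handlers passed here replace the default handlers of the same type.
    return urllib.request.build_opener(http_handler, https_handler)


opener = build_debug_opener()
# urllib.request.install_opener(opener) would make urlopen() use it globally.
```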

python urllib2 httplib

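10
votes

1
answers

5439
views

Selenium headless browser webdriver: [Errno 104] Connection reset by peer

I am trying to scrape data from the URLs below. However, driver.get(url) sometimes fails with [Errno 104] Connection reset by peer and sometimes with [Errno 111] Connection refused. On rare occasions it works fine, and on my Mac with a real browser the same spider works every time, so the problem does not seem to be in my spider.

I have tried many solutions, such as waiting for a selector on the page, implicit waits, using selenium-requests, and passing the proper request headers, but nothing seems to work.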

http://www.snapdeal.com/offers/deal-of-the-day
https://paytm.com/shop/g/paytm-home/exclusive-discount-deals

I am using python, selenium, and a headless Firefox webdriver to do this. The OS is CentOS 6.5.

Note: many AJAX-heavy pages are scraped successfully; some of them are listed below.

http://www.infibeam.com/deal-of-the-day.html, http://www.amazon.in/gp/goldbox/ref=nav_topnav_deals

I have already spent many days trying to debug this with no luck. Any help would be appreciated.

python selenium httplib selenium-webdriver centos6.5

10
votes

3
answers

6762
views

python httplib/urllib: get the filename

Is it possible to get the filename

e.g. xyz.com/blafoo/showall.html

when you use urllib or httplib?

That way I could save the file under the filename it has on the server.

If you go to a site like

xyz.com/blafoo/

you cannot see the filename.

Thanks
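
A minimal sketch of one approach, in Python 3: take the last path segment of the URL that was finally fetched, prefer a filename from the Content-Disposition header when the server sends one, and fall back to a default when the URL ends in a slash (in that case the name is chosen by the server and does not appear in the URL). The function name and the "index.html" default are assumptions, not something the server guarantees.

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit


def filename_from_response(final_url, content_disposition=None, default="index.html"):
    """Pick a local filename for a downloaded resource."""
    if content_disposition:
        # Let the stdlib email machinery parse the header parameters.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # basename() drops any directory part a server might include.
            return posixpath.basename(name)
    path = urlsplit(final_url).path
    name = posixpath.basename(unquote(path))
    return name or default
```

With `urllib.request`, the two inputs would come from `response.geturl()` and `response.headers.get("Content-Disposition")`.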

python urllib httplib

9
votes

1
answers

9654
views

How do I POST unicode characters using httplib?

I am trying to post unicode data with the httplib.request function as follows:

s = u"?????"
data = """
<spellrequest textalreadyclipped="0" ignoredups="1" ignoredigits="1" ignoreallcaps="0">
<text>%s</text>
</spellrequest>
""" % s

con = httplib.HTTPSConnection("www.google.com")
con.request("POST", "/tbproxy/spell?lang=he", data)
response = con.getresponse().read()

But this is the error I get:

Traceback (most recent call last):
  File "C:\Scripts\iQuality\test.py", line 47, in <module>
    print spellFix(u"?á???¿??????")
  File "C:\Scripts\iQuality\test.py", line 26, in spellFix
    con.request("POST", "/tbproxy/spell?lang=%s" % lang, data)
  File "C:\Python27\lib\httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "C:\Python27\lib\httplib.py", line 989, in _send_request
    self.endheaders(body)
  File "C:\Python27\lib\httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "C:\Python27\lib\httplib.py", line …
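
A common way around this kind of failure, assuming it comes from sending a unicode body without an explicit encoding (the traceback above is truncated, so this is not confirmed): encode the body to bytes yourself and declare the charset in the headers. The sketch is Python 3; the sample text, the helper name `build_spell_request`, and the Content-Type value are assumptions.

```python
from xml.sax.saxutils import escape


def build_spell_request(text, encoding="utf-8"):
    """Return (body_bytes, headers) for the spell-check POST."""
    xml = (
        '<spellrequest textalreadyclipped="0" ignoredups="1" '
        'ignoredigits="1" ignoreallcaps="0">'
        "<text>%s</text>"
        "</spellrequest>" % escape(text)  # escape &, < and > inside the XML
    )
    headers = {"Content-Type": "text/xml; charset=%s" % encoding}
    return xml.encode(encoding), headers


# Hypothetical sample input (a Hebrew word written with escapes).
body, headers = build_spell_request("\u05e9\u05dc\u05d5\u05dd")
# With http.client: con.request("POST", "/tbproxy/spell?lang=he", body, headers)
```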

python unicode httplib

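8
votes

1
answers

5895
views

python: httplib.CannotSendRequest when nesting threaded SimpleXMLRPCServers

I intermittently get an httplib.CannotSendRequest exception when using a chain of SimpleXMLRPCServers that use SocketServer.ThreadingMixIn.

By 'chain' I mean the following:

I have a client script that uses xmlrpclib to call a function on a SimpleXMLRPCServer. That server, in turn, calls another SimpleXMLRPCServer. I realize how convoluted that sounds, but there are good reasons this architecture was chosen, and I see no reason why it should not be possible.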

(testclient)client_script ---calls--> 
    (middleserver)SimpleXMLRPCServer ---calls---> 
        (finalserver)SimpleXMLRPCServer --- does something

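If I do not use SocketServer.ThreadingMixIn, this problem does not occur (but I need the requests to be handled in multiple threads, so that does not help).
If I only have a single level of services (i.e. just the client script calling the final server directly), this does not happen.

I have been able to reproduce the problem with the simple test code below. There are three snippets: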

finalserver:

import SocketServer
import time
from SimpleXMLRPCServer import SimpleXMLRPCServer
from SimpleXMLRPCServer import SimpleXMLRPCRequestHandler

class AsyncXMLRPCServer(SocketServer.ThreadingMixIn,SimpleXMLRPCServer): pass

# Create server
server = AsyncXMLRPCServer(('', 9999), SimpleXMLRPCRequestHandler)
server.register_introspection_functions()

def waste_time():
    time.sleep(10)
    return True

server.register_function(waste_time, 'waste_time')
server.serve_forever()

middleserver:

import SocketServer
from SimpleXMLRPCServer import SimpleXMLRPCServer
from SimpleXMLRPCServer import SimpleXMLRPCRequestHandler
import xmlrpclib

class AsyncXMLRPCServer(SocketServer.ThreadingMixIn,SimpleXMLRPCServer): pass

# Create server
server = AsyncXMLRPCServer(('', 8888), SimpleXMLRPCRequestHandler)
server.register_introspection_functions()

s = xmlrpclib.ServerProxy('http://localhost:9999')
def call_waste():
    s.waste_time()
    return True …
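
A likely cause, stated as an assumption since the rest of the code is cut off: the middle server shares one module-level ServerProxy `s` between all of its request threads, and a proxy drives a single underlying HTTP connection, so two overlapping calls can leave that connection in a state where a new request cannot be sent. The sketch below (Python 3 module names, ports picked by the OS, a shortened sleep) creates a proxy per call instead. The `start` and `run_clients` helpers are hypothetical.

```python
import socketserver
import threading
import time
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class AsyncXMLRPCServer(socketserver.ThreadingMixIn, SimpleXMLRPCServer):
    daemon_threads = True  # do not block interpreter exit on open requests


def start(server):
    """Serve in a background thread and return the server's URL."""
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return "http://127.0.0.1:%d" % server.server_address[1]


def waste_time():
    time.sleep(0.2)  # shortened from the 10 seconds in the question
    return True


final_server = AsyncXMLRPCServer(("127.0.0.1", 0), logRequests=False)
final_server.register_function(waste_time, "waste_time")
final_url = start(final_server)


def call_waste():
    # A fresh proxy per call: no connection is shared between threads.
    proxy = xmlrpc.client.ServerProxy(final_url)
    return proxy.waste_time()


middle_server = AsyncXMLRPCServer(("127.0.0.1", 0), logRequests=False)
middle_server.register_function(call_waste, "call_waste")
middle_url = start(middle_server)


def run_clients(count=4):
    """Call the middle server from several threads at once."""
    results = []

    def worker():
        results.append(xmlrpc.client.ServerProxy(middle_url).call_waste())

    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Keeping one proxy per thread (for example in `threading.local()`) is an alternative when creating a proxy for every call is too costly.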

python simplexmlrpcserver xmlrpclib httplib python-2.7

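8
votes

1
answers

5452
views

python httplib Name or service not known

I am trying to use httplib to send credit card information to authorize.net. When I try to post the request, I get the following traceback: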

  File "./lib/cgi_app.py", line 139, in run
    res = method()
  File "/var/www/html/index.py", line 113, in ProcessRegistration
    conn.request("POST", "/gateway/transact.dll", mystring, headers)
  File "/usr/local/lib/python2.7/httplib.py", line 946, in request
    self._send_request(method, url, body, headers)
  File "/usr/local/lib/python2.7/httplib.py", line 987, in _send_request
    self.endheaders(body)
  File "/usr/local/lib/python2.7/httplib.py", line 940, in endheaders
    self._send_output(message_body)
  File "/usr/local/lib/python2.7/httplib.py", line 803, in _send_output
    self.send(msg)
  File "/usr/local/lib/python2.7/httplib.py", line 755, in send
    self.connect()
  File "/usr/local/lib/python2.7/httplib.py", line 1152, in connect
    self.timeout, self.source_address)
  File "/usr/local/lib/python2.7/socket.py", line 567, in create_connection
    raise error, msg
gaierror: [Errno -2] Name or …
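
A gaierror is raised by the host name lookup, before any SSL or HTTP exchange takes place. One frequent cause, given here as an assumption because the code that creates the connection is not shown, is passing a full URL (with scheme or path) where the connection class expects a bare host name. The sketch below, in Python 3, splits a URL into the pieces that `http.client.HTTPSConnection` and `request` expect and checks whether a host resolves. The example URL and both helper names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def split_endpoint(url):
    """Split a full URL into (host, port, path)."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError("no host name in %r" % url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port, parts.path or "/"


def can_resolve(host, port=443):
    """Return True if the name lookup for host succeeds."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


# Hypothetical endpoint used only to show the split.
host, port, path = split_endpoint("https://gateway.example.com/gateway/transact.dll")
# http.client.HTTPSConnection(host, port) takes the bare host; request() takes the path.
```

If `can_resolve(host)` is false for a correctly spelled host, the problem lies in the machine's DNS configuration rather than in the Python code.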

python ssl httplib

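7
votes

3
answers

40k
views

"Missing redirect_uri parameter" response from Facebook with Python/Django

This is probably a really dumb question, but I have been staring at this for hours and cannot find what I am doing wrong.

I am trying to authenticate through the Facebook API using Python, but I am having trouble requesting the user access token. After receiving the code, I make a request to https://graph.facebook.com/oauth/access_token as follows: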

conn = httplib.HTTPSConnection("graph.facebook.com")
params = urllib.urlencode({'redirect_uri':request.build_absolute_uri(reverse('some_app.views.home')),
                           'client_id':apis.Facebook.app_id,
                           'client_secret':apis.Facebook.app_secret,
                           'code':code})
conn.request("GET", "/oauth/access_token", params)
response = conn.getresponse()
response_body = response.read()

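In response, I receive

{"error":{"message":"Missing redirect_uri parameter.","type":"OAuthException","code":191}}

Any ideas what could be going wrong? I have verified that the redirect_uri being passed is on the app domain, but could it be a problem that this is hosted locally and that the domain is just redirected to localhost by my hosts file?

Thanks for your help!

Edit:

I got this working using the requests library: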

params = {'redirect_uri':request.build_absolute_uri(reverse('profiles.views.fb_signup')),
                           'client_id':apis.Facebook.app_id,
                           'client_secret':apis.Facebook.app_secret,
                           'code':code}

r = requests.get("https://graph.facebook.com/oauth/access_token",params=params)

However, I would still prefer not to depend on a library for something that should be supported natively without much difficulty. Maybe that is asking too much...
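
In the httplib snippet, the url-encoded parameters are passed as the third argument of `conn.request`, which is the request body; a GET endpoint normally reads its parameters from the query string, which would explain why redirect_uri is reported as missing. A minimal sketch of the standard-library version in Python 3 follows. The helper name and all parameter values are placeholders.

```python
from urllib.parse import urlencode


def build_token_path(redirect_uri, client_id, client_secret, code):
    """Return the request path with the parameters in the query string."""
    params = urlencode({
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,
    })
    return "/oauth/access_token?" + params


# Placeholder values for illustration only.
path = build_token_path("http://example.com/home/", "APP_ID", "APP_SECRET", "CODE")
# conn = http.client.HTTPSConnection("graph.facebook.com")
# conn.request("GET", path)  # no body argument for a GET
```

This mirrors what `requests.get(..., params=params)` does in the edit above: the parameters end up in the URL, not in the body.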

python ssl facebook http httplib

7
votes

1
answers

10k
views

Handling IncompleteRead, URLError

It is a web-mining script.

def printer(q,missing):
    while 1:
        tmpurl=q.get()
        try:
            image=urllib2.urlopen(tmpurl).read()
        except httplib.HTTPException:
            missing.put(tmpurl)
            continue
        wf=open(tmpurl[-35:]+".jpg","wb")
        wf.write(image)
        wf.close()

q is a Queue() of URLs, and missing is an empty queue that collects the URLs that raised errors.

It is run in parallel by 10 threads.

Every time I run this, I get this:

  File "C:\Python27\lib\socket.py", line 351, in read
    data = self._sock.recv(rbufsize)
  File "C:\Python27\lib\httplib.py", line 541, in read
    return self._read_chunked(amt)
  File "C:\Python27\lib\httplib.py", line 592, in _read_chunked
    value.append(self._safe_read(amt))
  File "C:\Python27\lib\httplib.py", line 649, in _safe_read
    raise IncompleteRead(''.join(s), amt)
IncompleteRead: IncompleteRead(5274 bytes read, 2918 more expected)

But I did use except... I also tried other things, such as

httplib.IncompleteRead
urllib2.URLError

and even

image=urllib2.urlopen(tmpurl,timeout=999999).read()

But none of this works.

How can I catch IncompleteRead and URLError?
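
One way to catch both exceptions in a single clause is to list them in a tuple. This is a sketch of that syntax in Python 3 (`http.client` and `urllib.error` replace httplib and urllib2), not a confirmed diagnosis of why the handler above did not fire. The `drain` helper and its injectable `open_url` parameter are hypothetical and exist only so the error path can be exercised without a network.

```python
import http.client
import queue
import urllib.error
import urllib.request

# Several exception classes in one except clause must be given as a tuple.
DOWNLOAD_ERRORS = (http.client.HTTPException, urllib.error.URLError, OSError)


def drain(q, missing, open_url=urllib.request.urlopen):
    """Fetch every URL in q; return {url: bytes} and queue failures in missing."""
    fetched = {}
    while True:
        try:
            url = q.get_nowait()
        except queue.Empty:
            return fetched
        try:
            # read() sits inside the try block, so a read that is cut short
            # (IncompleteRead, a subclass of HTTPException) is caught as well.
            with open_url(url) as response:
                fetched[url] = response.read()
        except DOWNLOAD_ERRORS:
            missing.put(url)
```

Each worker thread can call `drain(q, missing)` on the shared queues and write the returned bytes to disk afterwards.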

python error-handling urllib2 httplib

7
votes

1
answers

7012
views

How to POST chunked-encoded data in Python

I am trying to POST chunked-encoded data to httpbin.org/post. I tried two options: Requests and httplib.

Using Requests

#!/usr/bin/env python

import requests

def gen():
        l = range(130)
        for i in l:
                yield '%d' % i

if __name__ == "__main__":
        url = 'http://httpbin.org/post'
        headers = {
                        'Transfer-encoding':'chunked',
                        'Cache-Control': 'no-cache',
                        'Connection': 'Keep-Alive',
                        #'User-Agent': 'ExpressionEncoder'
                }
        r = requests.post(url, headers = headers, data = gen())
        print r

Using httplib

#!/usr/bin/env python

import httplib
import os.path

if __name__ == "__main__":
        conn = httplib.HTTPConnection('httpbin.org')
        conn.connect()
        conn.putrequest('POST', '/post')
        conn.putheader('Transfer-Encoding', 'chunked')
        conn.putheader('Connection', 'Keep-Alive')
        conn.putheader('Cache-Control', 'no-cache')
        conn.endheaders()
        for i in range(130): …
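
Because the httplib loop above is cut off, here is a minimal sketch, in Python 3, of the framing that the chunked transfer coding requires when the chunks are written by hand: each chunk is its length in hexadecimal, CRLF, the payload, CRLF, and the body ends with a zero-length chunk. The helper names are assumptions, and `gen` mirrors the generator from the Requests snippet but yields bytes.

```python
def encode_chunk(data):
    """Frame one chunk: hex length, CRLF, payload, CRLF."""
    return b"%x\r\n%s\r\n" % (len(data), data)


def iter_chunked(parts):
    """Yield framed chunks for every non-empty part, then the terminator."""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            yield encode_chunk(part)
    yield b"0\r\n\r\n"


def gen():
    for i in range(130):
        yield b"%d" % i


# With http.client, after putrequest/putheader/endheaders as above:
# for frame in iter_chunked(gen()):
#     conn.send(frame)
```

With Requests, passing a generator as `data` is documented to produce a chunked request on its own, so setting the Transfer-Encoding header by hand should not be needed there.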

python http httplib chunked-encoding python-requests

7
votes

1
answers

10k
views

Tag statistics

http ×2

python-requests ×2

ssl ×2

chunked-encoding ×1

error-handling ×1

selenium-webdriver ×1

simplexmlrpcserver ×1
