Jo *_* Ko 5 python multithreading http anaconda splinter
Using splinter and Python, I have two threads running, each visiting the same main URL but a different route. For example, thread one hits mainurl.com/threadone and thread two hits mainurl.com/threadtwo, using:
from splinter import Browser
browser = Browser('chrome')
But I ran into the following error:
Traceback (most recent call last):
  File "multi_thread_practice.py", line 299, in <module>
    main()
  File "multi_thread_practice.py", line 290, in main
    first_method(r)
  File "multi_thread_practice.py", line 195, in parser
    second_method(title, name)
  File "multi_thread_practice.py", line 208, in confirm_product
    third_method(current_url)
  File "multi_thread_practice.py", line 214, in buy_product
    browser.visit(url)
  File "/Users/joshua/anaconda/lib/python2.7/site-packages/splinter/driver/webdriver/__init__.py", line 184, in visit
    self.driver.get(url)
  File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 261, in get
    self.execute(Command.GET, {'url': url})
  File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 247, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 464, in execute
    return self._request(command_info[0], url, body=data)
  File "/Users/joshua/anaconda/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 488, in _request
    resp = self._conn.getresponse()
  File "/Users/joshua/anaconda/lib/python2.7/httplib.py", line 1108, in getresponse
    raise ResponseNotReady()
httplib.ResponseNotReady
What is this error, and how should I handle it?
Thanks in advance; I will definitely upvote/accept an answer.
Added code:
import time
import threading

from splinter import Browser

# A single browser instance, shared by both threads.
browser = Browser('chrome')
start_time = time.time()

urlOne = 'http://www.practiceurl.com/one'
urlTwo = 'http://www.practiceurl.com/two'
baseUrl = 'http://practiceurl.com'

browser.visit(baseUrl)


def secondThread(url):
    print('STARTING 2ND REQUEST: ' + str(time.time() - start_time))
    browser.visit(url)
    print('END 2ND REQUEST: ' + str(time.time() - start_time))


def mainThread(url):
    print('STARTING 1ST REQUEST: ' + str(time.time() - start_time))
    browser.visit(url)
    print('END 1ST REQUEST: ' + str(time.time() - start_time))


def main():
    threadObj = threading.Thread(target=secondThread, args=[urlTwo])
    threadObj.daemon = True
    threadObj.start()
    mainThread(urlOne)


main()
小智 2
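As far as I can tell, what you are trying to do is not possible with a single browser. Splinter drives a real browser, so sending it several commands at the same time causes problems. It behaves the way a human interacting with the browser would (automated, of course). You can open many browser windows, but you cannot send a request from another thread before the response to the previous request has arrived. Doing so causes a CannotSendRequest error (the ResponseNotReady in your traceback looks like the same kind of failure). So, if you need to use threads, I suggest opening two browsers and having each thread send its requests through its own browser. Otherwise, it cannot be done.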
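Both exception names come from Python's standard HTTP client (`httplib` in Python 2, `http.client` in Python 3), which the traceback shows Selenium calling via `self._conn.getresponse()`. The following is a minimal sketch of how a single connection object rejects out-of-order calls. It only illustrates the state checks in the standard library; that two threads interleaving calls on one shared connection is what happens inside Selenium here is an assumption based on the traceback, not something verified against its source.

```python
try:
    import http.client as httplib  # Python 3
except ImportError:
    import httplib  # Python 2, as in the traceback

# The constructor does not open a socket, so no server is needed here.
conn = httplib.HTTPConnection('localhost')

errors = []

# 1. Asking for a response when no request has been sent on this connection.
try:
    conn.getresponse()
except httplib.ResponseNotReady as exc:
    errors.append(type(exc).__name__)

# 2. Starting a second request while the first one is still in progress.
conn.putrequest('GET', '/one')
try:
    conn.putrequest('GET', '/two')
except httplib.CannotSendRequest as exc:
    errors.append(type(exc).__name__)

print(errors)
```

A connection object expects a strict request-then-response sequence, which is why giving each thread its own browser (and therefore its own connection) avoids the problem.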
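The following thread is about Selenium, but the information carries over: "Selenium multiple tabs at once". Again, it says that what (I think) you want to do is not possible. The author of the green-ticked answer there makes the same suggestion I do.
I hope this doesn't throw you too far off track, and that it helps.
Edit: just to demonstrate: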
import time
import threading

from splinter import Browser

# Two separate browser instances, one per thread.
browser = Browser('firefox')
browser2 = Browser('firefox')
start_time = time.time()

urlOne = 'http://www.practiceurl.com/one'
urlTwo = 'http://www.practiceurl.com/two'
baseUrl = 'http://practiceurl.com'

browser.visit(baseUrl)


def secondThread(url):
    print('STARTING 2ND REQUEST: ' + str(time.time() - start_time))
    browser2.visit(url)
    print('END 2ND REQUEST: ' + str(time.time() - start_time))


def mainThread(url):
    print('STARTING 1ST REQUEST: ' + str(time.time() - start_time))
    browser.visit(url)
    print('END 1ST REQUEST: ' + str(time.time() - start_time))


def main():
    threadObj = threading.Thread(target=secondThread, args=[urlTwo])
    threadObj.daemon = True
    threadObj.start()
    mainThread(urlOne)
    # Wait for the second request, so the daemon thread is not killed at exit.
    threadObj.join()


main()
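Note that I used Firefox because I don't have chromedriver installed.
It may be a good idea to add a wait after the browsers open, just to make sure they are fully ready before the timer starts.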
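One way to combine both suggestions (one browser per thread, and timing only after every browser is ready) is sketched below. This is a sketch under stated assumptions, not the only way to do it: `StandInBrowser`, `worker`, and `run_all` are hypothetical names introduced here, and the stand-in class replaces a real browser so the example runs without splinter or a driver installed.

```python
import threading
import time


class StandInBrowser(object):
    """Hypothetical stand-in for splinter's Browser so the sketch runs anywhere."""

    def visit(self, url):
        time.sleep(0.1)  # pretend the page takes a moment to load

    def quit(self):
        pass


def worker(name, url, make_browser, ready, go, results):
    try:
        browser = make_browser()  # each thread owns its own browser
    finally:
        ready.set()               # signal even on failure, so nobody waits forever
    go.wait()                     # hold until every browser has been opened
    try:
        started = time.time()
        browser.visit(url)
        results[name] = time.time() - started
    finally:
        browser.quit()


def run_all(urls, make_browser):
    go = threading.Event()
    results = {}
    threads = []
    ready_flags = []
    for name, url in urls.items():
        ready = threading.Event()
        thread = threading.Thread(
            target=worker, args=(name, url, make_browser, ready, go, results))
        thread.start()
        threads.append(thread)
        ready_flags.append(ready)
    for ready in ready_flags:
        ready.wait()              # all browsers are open at this point
    go.set()                      # start the timed visits together
    for thread in threads:
        thread.join()             # do not exit before the visits finish
    return results


timings = run_all(
    {'one': 'http://www.practiceurl.com/one',
     'two': 'http://www.practiceurl.com/two'},
    StandInBrowser)
print(sorted(timings))
```

With splinter installed, passing `lambda: Browser('firefox')` as `make_browser` in place of `StandInBrowser` should drive real browsers, since `Browser` objects provide `visit` and `quit`.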