Why does the following code run fine when using threads, but throw an exception when using multiprocessing?
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadsPool
import urllib2

urls = [
    'http://www.python.org',
    'http://www.python.org/about/',
    'http://www.python.org/doc/',
    'http://www.python.org/download/']

def use_threads():
    pool = ThreadsPool(4)
    results = pool.map(urllib2.urlopen, urls)
    pool.close()
    pool.join()
    print [len(x.read()) for x in results]

def use_procs():
    p_pool = Pool(4)
    p_results = p_pool.map(urllib2.urlopen, urls)
    p_pool.close()
    p_pool.join()
    print 'using procs instead of threads'
    print [len(x.read()) for x in p_results]

if __name__ == '__main__':
    use_procs()
Running use_procs() raises an exception.
I know there are differences in how processes and threads communicate with each other. Why does pickling the website content fail? How can I set the encoding to work around the problem?
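As a quick check of what actually goes wrong, one can try to pickle the return value of urllib2.urlopen() directly in a single process; a minimal sketch along those lines (the exact exception depends on the Python build and on whether the URL is served over HTTPS):

import pickle
import urllib2

def try_pickle_response(url):
    # Open the URL here and attempt to pickle the response object,
    # which is what Pool.map has to do to ship a worker's result
    # back to the parent process.
    response = urllib2.urlopen(url)
    try:
        pickle.dumps(response)
        print 'pickling succeeded'
    except Exception as exc:
        print 'pickling failed: %r' % (exc,)

if __name__ == '__main__':
    try_pickle_response('http://www.python.org')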
The problem isn't an encoding error, it's a pickling error: the object returned by urllib2.urlopen() can't be pickled (when I ran your code the error message pointed at an _ssl._SSLSocket, for a slightly different reason than the one shown in yours). To get around it, confine use of the returned object to the worker process itself by reading the data right after the URL is opened, as shown below. This can, however, mean that more data has to be passed between the processes.
# Added.
def get_data(url):
    soc = urllib2.urlopen(url)
    return soc.read()

def use_procs():
    p_pool = Pool(4)
    # p_results = p_pool.map(urllib2.urlopen, urls)
    p_results = p_pool.map(get_data, urls)
    p_pool.close()
    p_pool.join()
    print 'using procs instead of threads'
    # print [len(x.read()) for x in results]
    print [len(x) for x in p_results]
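Since reading in the worker means shipping the whole page body back through the pickle channel, and the parent only needs the lengths, a small variation (a hypothetical get_len helper, not part of the original answer) reduces the pickled payload to one integer per URL:

from multiprocessing import Pool
import urllib2

# Shortened URL list; the original code uses four python.org pages.
urls = ['http://www.python.org', 'http://www.python.org/about/']

def get_len(url):
    # Download and measure inside the worker so only an int is sent back.
    return len(urllib2.urlopen(url).read())

def use_procs_len_only():
    p_pool = Pool(4)
    p_results = p_pool.map(get_len, urls)
    p_pool.close()
    p_pool.join()
    print p_results

if __name__ == '__main__':
    use_procs_len_only()

The thread-pool version never runs into this at all, because multiprocessing.dummy workers share the parent's address space and nothing has to be pickled.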