Running multiple threads simultaneously in Python - is it possible?

Question

YSY*_*YSY 6 python multithreading web-crawler gil

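I'm writing a small crawler that should fetch a URL many times, and I want all of the threads to run at the same time (simultaneously).

I've written a small piece of code that should do that.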

import thread
import time  # needed for the time.sleep() calls in getPAGE
from urllib2 import Request, urlopen, URLError, HTTPError


def getPAGE(FetchAddress):
    attempts = 0
    while attempts < 2:
        req = Request(FetchAddress, None)
        try:
            response = urlopen(req, timeout = 8) #fetching the url
            print "fetched url %s" % FetchAddress
        except HTTPError, e:
            print 'The server didn\'t do the request.'
            print 'Error code: ', str(e.code) + "  address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        except URLError, e:
            print 'Failed to reach the server.'
            print 'Reason: ', str(e.reason) + "  address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        except Exception, e:
            print 'Something bad happened in getPAGE.'
            # a generic exception may not have a .reason attribute, so print the exception itself
            print 'Reason: ', str(e) + "  address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        else:
            try:
                return response.read()
            except Exception:
                print "there was an error with response.read()"
                return None
    return None

url = ("http://www.domain.com",)

for i in range(1,50):
    thread.start_new_thread(getPAGE, url)


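Judging by the Apache logs, the threads do not seem to run simultaneously: there is a small gap between requests, almost undetectable, but enough for me to see that the threads are not truly parallel.

I've read about the GIL. Is there a way to bypass it without calling C/C++ code? I don't really understand how threading is possible with the GIL. Does Python basically interpret the next thread as soon as it finishes with the previous one?

Thanks.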

Answer 1

NPE*_*NPE 6

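As you point out, the GIL often prevents Python threads from running in parallel.

However, that is not always the case. One exception is I/O-bound code. When a thread is waiting for an I/O request to complete, it will typically have released the GIL before entering the wait. This means that other threads can make progress in the meantime.

In general, however, multiprocessing is the safer bet when true parallelism is required.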
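To illustrate how waits can overlap, here is a minimal sketch written for Python 3 rather than the Python 2 used in the question. It makes two assumptions: `time.sleep` stands in for a blocking network call (like socket I/O in CPython, it releases the GIL while waiting), and the names `fake_fetch` and `run_overlapped` are hypothetical helpers, not part of any library.

```python
import threading
import time


def fake_fetch(delay, results, index):
    # Assumption: time.sleep stands in for a blocking urlopen() call.
    # While the thread sleeps it does not hold the GIL.
    time.sleep(delay)
    results[index] = delay


def run_overlapped(n_threads=5, delay=0.2):
    """Start n_threads waiting threads and return (elapsed_seconds, results)."""
    results = [None] * n_threads
    threads = [
        threading.Thread(target=fake_fetch, args=(delay, results, i))
        for i in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish before measuring
    return time.perf_counter() - start, results


if __name__ == "__main__":
    elapsed, _ = run_overlapped()
    print("5 waits of 0.2 s each took %.2f s in total" % elapsed)
```

If the waits were serialized, the total would be close to the sum of the delays. Because each thread gives up the GIL while it waits, the total should be close to a single delay instead. The threads are still started one after another, so small gaps between request start times, like the ones seen in the Apache logs, are to be expected.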
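For CPU-bound work, a rough sketch of the multiprocessing approach might look like the following (Python 3 syntax). The function `count_primes` and the input sizes are made up for illustration; the point is only that each worker process has its own interpreter and its own GIL.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound toy task: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical workload: four independent, equally sized jobs.
    limits = [20000, 20000, 20000, 20000]
    with Pool(processes=4) as pool:
        # Each job runs in a separate process, so the jobs can use
        # separate CPU cores at the same time.
        print(pool.map(count_primes, limits))
```

The `if __name__ == "__main__":` guard matters on platforms that start worker processes by re-importing the main module. For a crawler that spends most of its time waiting on the network, plain threads are usually enough, and separate processes mainly pay off when the work itself is CPU-heavy.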

Archived:	14 years, 4 months ago
Views:	9805
Last recorded:	13 years, 3 months ago