use*_*455 1 python multithreading
我在列表中有一个很大的数据集,需要做一些工作。
我想在任意给定时间启动x数量的线程以在列表上工作,直到弹出该列表中的所有内容为止。
我知道如何在给定的时间(通过使用thread1 .... thread20.start())启动x数量的线程(说20个)
但是当前20个线程之一完成时,如何使它启动一个新线程?因此在任何给定时间,有20个线程在运行,直到列表为空。
我到目前为止所拥有的:
class queryData(threading.Thread):
def __init__(self,threadID):
threading.Thread.__init__(self)
self.threadID = threadID
def run(self):
global lst
#Get trade from list
trade = lst.pop()
tradeId=trade[0][1][:6]
print tradeId
thread1 = queryData(1)
thread1.start()
Run Code Online (Sandbox Code Playgroud)
更新资料
我的代码如下:
for i in range(20):
threads.append(queryData(i))
for thread in threads:
thread.start()
while len(lst)>0:
for iter,thread in enumerate(threads):
thread.join()
lock.acquire()
threads[iter] = queryData(i)
threads[iter].start()
lock.release()
Run Code Online (Sandbox Code Playgroud)
现在它从头开始启动20个线程...然后在一个线程结束时继续启动一个新线程。
但是,它效率不高,因为它等待列表中的第一个完成,然后再等待第二个..依此类推。
有更好的方法吗?
基本上我需要:
-Start 20 threads:
-While list is not empty:
-wait for 1 of the 20 threads to finish
-reuse or start a new thread
Run Code Online (Sandbox Code Playgroud)
As I suggested in a comment, I think using a multiprocessing.pool.ThreadPool would be appropriate — because it would handle much of the thread management you're manually doing in your code automatically. Once all the threads are queued-up for processing via ThreadPool's apply_async() method calls, the only thing that needs to be done is wait until they've all finished execution (unless there's something else your code could be doing, of course).
I've translated the code in my linked answer to another related question so it's more similar to what you appear to be doing to make it easier to understand in the current context.
from multiprocessing.pool import ThreadPool
from random import randint
import threading
import time
MAX_THREADS = 5
print_lock = threading.Lock() # Prevent overlapped printing from threads.
def query_data(trade):
trade_id = trade[0][1][:6]
time.sleep(randint(1, 3)) # Simulate variable working time for testing.
with print_lock:
print(trade_id)
def process_trades(trade_list):
pool = ThreadPool(processes=MAX_THREADS)
results = []
while(trade_list):
trade = trade_list.pop()
results.append(pool.apply_async(query_data, (trade,)))
pool.close() # Done adding tasks.
pool.join() # Wait for all tasks to complete.
def test():
trade_list = [[['abc', ('%06d' % id) + 'defghi']] for id in range(1, 101)]
process_trades(trade_list)
if __name__ == "__main__":
test()
Run Code Online (Sandbox Code Playgroud)
| 归档时间: |
|
| 查看次数: |
1741 次 |
| 最近记录: |