如何在Python中实现优先级队列？

Question

如何在Python中实现优先级队列？

cod*_*ark 23 python heap queue priority-queue

很抱歉这样一个愚蠢的问题,但Python文档令人困惑.

链接1:队列实施 http://docs.python.org/library/queue.html

它说那个队列有一个优先队列的构造.但我找不到如何实现它.

class Queue.PriorityQueue(maxsize=0)

Run Code Online (Sandbox Code Playgroud)

链接2:堆实现 http://docs.python.org/library/heapq.html

在这里,他们说我们可以使用heapq间接实现优先级队列

pq = []                         # list of entries arranged in a heap
entry_finder = {}               # mapping of tasks to entries
REMOVED = '<removed-task>'      # placeholder for a removed task
counter = itertools.count()     # unique sequence count

def add_task(task, priority=0):
    'Add a new task or update the priority of an existing task'
    if task in entry_finder:
        remove_task(task)
    count = next(counter)
    entry = [priority, count, task]
    entry_finder[task] = entry
    heappush(pq, entry)

def remove_task(task):
    'Mark an existing task as REMOVED.  Raise KeyError if not found.'
    entry = entry_finder.pop(task)
    entry[-1] = REMOVED

def pop_task():
    'Remove and return the lowest priority task. Raise KeyError if empty.'
    while pq:
        priority, count, task = heappop(pq)
        if task is not REMOVED:
            del entry_finder[task]
            return task
    raise KeyError('pop from an empty priority queue'

Run Code Online (Sandbox Code Playgroud)

哪个是python中最有效的优先级队列实现？以及如何实现它？

Answer 1

nin*_*cko 29

在任何语言中都没有"最有效的优先级队列实现" .

优先级队列都是权衡取舍.见http://en.wikipedia.org/wiki/Priority_queue

您应该根据计划使用它的方式从这两个中选择一个:

O(log(N))插入时间和O(1)findMin + deleteMin时间,或
O(1)插入时间和O(log(N))findMin + deleteMin时间

在后一种情况下,您可以选择使用Fibonacci堆实现优先级队列:http://en.wikipedia.org/wiki/Heap_(data_structure)#Comparison_of_theoretic_bounds_for_variants(如您所见,heapq基本上是二叉树,必须必须O(log(N))同时插入和findMin + deleteMin)

如果您正在处理具有特殊属性的数据(例如有界数据),那么您可以实现O(1)插入和O(1)findMin + deleteMin时间.您只能对某些类型的数据执行此操作,否则您可能会滥用您的优先级队列以违反O(N log(N))排序限制.

要以任何语言实现任何队列,您只需要定义insert(value)和extractMin() -> value操作.这通常只涉及底层堆的最小包装; 请参阅http://en.wikipedia.org/wiki/Fibonacci_heap以实现您自己的,或使用类似堆的现成库,如配对堆(Google搜索显示http://svn.python.org /projects/sandbox/trunk/collections/pairing_heap.py)

如果您只关心您引用的两个中哪一个更有效(heapq来自http://docs.python.org/library/heapq.html#priority-queue-implementation-notes的基于上面的代码,相比之下Queue.PriorityQueue) , 然后:

在网上似乎没有任何关于Queue.PriorityQueue实际做什么的容易找到的讨论; 您需要深入了解代码,该代码链接到帮助文档:http://hg.python.org/cpython/file/2.7/Lib/Queue.py

   224     def _put(self, item, heappush=heapq.heappush):
   225         heappush(self.queue, item)
   226 
   227     def _get(self, heappop=heapq.heappop):
   228         return heappop(self.queue)

Run Code Online (Sandbox Code Playgroud)

我们可以看到,Queue.PriorityQueue它也被heapq用作底层机制.因此它们同样糟糕(渐近地说).Queue.PriorityQueue可能允许并行查询,所以我打赌它可能有一个非常轻微的常数因素更多的开销.但是因为你知道底层实现(和渐近行为)必须相同,最简单的方法就是在同一个大型数据集上运行它们.

(请注意,Queue.PriorityQueue似乎没有办法删除条目,但heapq确实如此.但这是一把双刃剑:优先级优先级队列实现可能允许您删除O(1)或O(log(N)中的元素)时间,但是如果你使用remove_task你提到的功能,并让那些僵尸任务在你的队列中累积,因为你没有从最小的时候提取它们,那么你会看到渐渐减速,否则你将看不到.当然,你首先不能这样做Queue.PriorityQueue,所以不能在这里进行比较.)

Answer 2

Ray*_*ger 28

Queue模块中的版本是使用heapq模块实现的,因此它们对底层堆操作具有相同的效率.

也就是说,Queue版本较慢,因为它增加了锁,封装和一个很好的面向对象的API.

heapq文档中显示的优先级队列建议旨在说明如何向优先级队列添加其他功能(例如排序稳定性和更改先前排队任务的优先级的能力).如果您不需要这些功能,那么基本的heappush和heappop功能将为您提供最快的性能.

归档时间：	13 年，10 月前
查看次数：	39783 次
最近记录：	7 年，2 月前