Accessing items in a collections.OrderedDict by index

Bil*_*ljk 130 python collections dictionary ordereddictionary python-3.x

Let's say I have the following code:

import collections
d = collections.OrderedDict()
d['foo'] = 'python'
d['bar'] = 'spam'

Is there a way I can access the items in a numbered manner, like:

d(0) #foo's Output
d(1) #bar's Output

Abh*_*jit 162

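If it's an OrderedDict(), you can easily access the elements by index by getting the tuples of (key, value) pairs, as follows (Python 2, where items() returns a list):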

>>> import collections
>>> d = collections.OrderedDict()
>>> d['foo'] = 'python'
>>> d['bar'] = 'spam'
>>> d.items()
[('foo', 'python'), ('bar', 'spam')]
>>> d.items()[0]
('foo', 'python')
>>> d.items()[1]
('bar', 'spam')

Note for Python 3.X

dict.items returns an iterable dict view object rather than a list. We need to wrap the call in list() to make indexing possible:

>>> items = list(d.items())
>>> items
[('foo', 'python'), ('bar', 'spam')]
>>> items[0]
('foo', 'python')
>>> items[1]
('bar', 'spam')

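  • Note that in 3.x the `items` method returns an iterable dictionary view object rather than a list, and it doesn't support slicing or indexing. So you have to turn it into a list first. http://docs.python.org/3.3/library/stdtypes.html#dict-views (18 upvotes)
  • For large dictionaries, copying the items, values or keys into a list can be quite slow. I created a rewrite of OrderedDict() with a different internal data structure for applications that have to do this very often: https://github.com/niklasf/indexed.py (8 upvotes)
  • If you are only accessing one item, you can avoid the memory overhead of `list(d.items())` by using `next(islice(d.items(), 1, None))` to get `('bar', 'spam')`. Note that `islice(d.items(), 1)` would instead stop after the first pair and give `('foo', 'python')`. (7 upvotes)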
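
The lazy lookup from the last comment can be wrapped in a small helper. This is only a minimal sketch for Python 3 and is not part of the original answer; the name `nth_item` is hypothetical.

```python
from collections import OrderedDict
from itertools import islice


def nth_item(d, n):
    """Return the n-th (key, value) pair of d without building a full list.

    nth_item is a hypothetical helper name. It still walks n entries,
    so the lookup is O(n) in time but needs no extra list.
    """
    if n < 0:
        raise IndexError("negative indexes are not supported in this sketch")
    try:
        # islice(iterable, n, None) lazily skips the first n entries.
        return next(islice(d.items(), n, None))
    except StopIteration:
        raise IndexError("index out of range") from None


d = OrderedDict()
d['foo'] = 'python'
d['bar'] = 'spam'

first = nth_item(d, 0)   # ('foo', 'python')
second = nth_item(d, 1)  # ('bar', 'spam')
```

If the same dictionary is indexed many times, building `list(d.items())` once and reusing it is likely the better trade-off.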

Gra*_*ntJ 23

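Do you have to use an OrderedDict, or do you specifically want a map-like type that is ordered in some way and has fast positional indexing? If the latter, consider one of Python's many sorted dict types (which order key-value pairs by key sort order). Some implementations also support fast indexing. For example, the sortedcontainers project has a SortedDict type for this purpose.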

>>> from sortedcontainers import SortedDict
>>> sd = SortedDict()
>>> sd['foo'] = 'python'
>>> sd['bar'] = 'spam'
>>> sd.iloc[0]  # Note that 'bar' comes before 'foo' in sort order.
'bar'
>>> # If you want the value, then simply do a key lookup:
>>> sd[sd.iloc[1]]
'python'


Ste*_*ate 18

It's a special case if you want the first entry (or one close to it) in an OrderedDict without creating a list:

>>> from collections import OrderedDict
>>> 
>>> d = OrderedDict()
>>> d["foo"] = "one"
>>> d["bar"] = "two"
>>> d["baz"] = "three"
>>> 
>>> d.iteritems().next()
('foo', 'one')

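(The first time you say "next()", it really means "first".)

In my informal test in Python 2.7, iteritems().next() with a small OrderedDict is only a tiny bit faster than items()[0]. With an OrderedDict of 10,000 entries, iteritems().next() was about 200 times faster than items()[0].

But if you save the items() list once and then use that list many times, that could be faster. Or, if you repeatedly { create an iteritems() iterator and step through it to the position you want }, that could be slower.

  • Python 3 `OrderedDict`s don't have an `iteritems()` method, so you need to do the following to get the first item: `next(iter(d.items()))`. (9 upvotes)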
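
For reference, a Python 3 version of the session above might look like this. It is a minimal sketch; the `last` line is an addition that relies on OrderedDict views supporting `reversed()`, which they do from Python 3.5 onward.

```python
from collections import OrderedDict

d = OrderedDict()
d["foo"] = "one"
d["bar"] = "two"
d["baz"] = "three"

# First entry: create an iterator over the items view and take one step.
first = next(iter(d.items()))      # ('foo', 'one')

# Last entry: walk the items view backwards and take one step.
last = next(reversed(d.items()))   # ('baz', 'three')
```

Neither lookup copies the dictionary, so both stay cheap no matter how many entries it holds.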

刘金国*_*刘金国 14

Using IndexedOrderedDict from the indexed package is dramatically more efficient.

Following Niklas's comment, I benchmarked OrderedDict and IndexedOrderedDict with 1000 entries.

In [1]: from numpy import *
In [2]: from indexed import IndexedOrderedDict
In [3]: id=IndexedOrderedDict(zip(arange(1000),random.random(1000)))
In [4]: timeit id.keys()[56]
1000000 loops, best of 3: 969 ns per loop

In [8]: from collections import OrderedDict
In [9]: od=OrderedDict(zip(arange(1000),random.random(1000)))
In [10]: timeit od.keys()[56]
10000 loops, best of 3: 104 µs per loop

In this specific case, IndexedOrderedDict is about 100 times faster at indexing an element at a specific position.
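
To give a feel for why a different internal layout can make positional access cheap, here is a toy sketch that keeps a list of keys next to a plain dict. It is an illustration only: the class `PositionalDict` and its methods are hypothetical and are not taken from the indexed package, whose actual implementation may differ.

```python
class PositionalDict:
    """Toy mapping that keeps insertion order and supports lookup by position.

    Hypothetical illustration only; not the indexed package's design.
    """

    def __init__(self):
        self._data = {}   # key -> value
        self._keys = []   # keys in insertion order

    def __setitem__(self, key, value):
        if key not in self._data:
            self._keys.append(key)   # a new key goes to the end
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]
        self._keys.remove(key)       # O(n): the price of this simple layout

    def __len__(self):
        return len(self._keys)

    def key_at(self, index):
        # Constant-time positional lookup through the side list.
        return self._keys[index]

    def item_at(self, index):
        key = self._keys[index]
        return key, self._data[key]


pd = PositionalDict()
pd['foo'] = 'python'
pd['bar'] = 'spam'
second = pd.item_at(1)   # ('bar', 'spam')
```

The trade-off in this sketch is that deletions become linear in the number of keys, while lookups by position no longer need to walk or copy the dictionary.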


Qua*_*um7 8

This community wiki attempts to collect the existing answers.

Python 2.7

In Python 2, the keys(), values(), and items() methods of OrderedDict return lists. Using values as an example, the simplest way is:

d.values()[0]  # "python"
d.values()[1]  # "spam"

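For large collections where you only care about a single index, you can avoid creating the full list by using the generator versions iterkeys, itervalues and iteritems: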

import itertools
next(itertools.islice(d.itervalues(), 0, 1))  # "python"
next(itertools.islice(d.itervalues(), 1, 2))  # "spam"

The indexed.py package provides IndexedOrderedDict, which is designed for this use case and will be the fastest option.

from indexed import IndexedOrderedDict
# Pass a list of pairs: a plain dict literal has no guaranteed order in Python 2.
d = IndexedOrderedDict([('foo', 'python'), ('bar', 'spam')])
d.values()[0]  # "python"
d.values()[1]  # "spam"

Using itervalues can be considerably faster for large dictionaries with random access:

$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 1000;   d = OrderedDict({i:i for i in range(size)})'  'i = randint(0, size-1); d.values()[i:i+1]'
1000 loops, best of 3: 259 usec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 10000;  d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
100 loops, best of 3: 2.3 msec per loop
$ python2 -m timeit -s 'from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i:i+1]'
10 loops, best of 3: 24.5 msec per loop

$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 1000;   d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
10000 loops, best of 3: 118 usec per loop
$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 10000;  d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
1000 loops, best of 3: 1.26 msec per loop
$ python2 -m timeit -s 'import itertools; from collections import OrderedDict; from random import randint; size = 100000; d = OrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); next(itertools.islice(d.itervalues(), i, i+1))'
100 loops, best of 3: 10.9 msec per loop

$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 1000;   d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.19 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 10000;  d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.24 usec per loop
$ python2 -m timeit -s 'from indexed import IndexedOrderedDict; from random import randint; size = 100000; d = IndexedOrderedDict({i:i for i in range(size)})' 'i = randint(0, size-1); d.values()[i]'
100000 loops, best of 3: 2.61 usec per loop

+--------+-----------+----------------+--------------+
|  size  | list (ms) | generator (ms) | indexed (ms) |
+--------+-----------+----------------+--------------+
|   1000 | .259      | .118           | .00219       |
|  10000 | 2.3       | 1.26           | .00224       |
| 100000 | 24.5      | 10.9           | .00261       |
+--------+-----------+----------------+--------------+

Python 3.6

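Python 3 has the same two basic options (list vs. generator), but the dict methods now return lazy, iterable view objects rather than lists by default.

List method: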

list(d.values())[0]  # "python"
list(d.values())[1]  # "spam"

Generator method:

import itertools
next(itertools.islice(d.values(), 0, 1))  # "python"
next(itertools.islice(d.values(), 1, 2))  # "spam"

Python 3 dictionaries are about an order of magnitude faster than Python 2 ones, and they show a similar speedup from using generators.

+--------+-----------+----------------+--------------+
|  size  | list (ms) | generator (ms) | indexed (ms) |
+--------+-----------+----------------+--------------+
|   1000 | .0316     | .0165          | .00262       |
|  10000 | .288      | .166           | .00294       |
| 100000 | 3.53      | 1.48           | .00332       |
+--------+-----------+----------------+--------------+
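
To repeat the list vs. generator comparison on your own machine, a script along these lines should work in Python 3. It is a rough sketch using only the standard library; the function names `by_list`, `by_islice` and `bench` are made up for this example, and the timings it prints will differ from the tables above.

```python
import itertools
import timeit
from collections import OrderedDict
from random import randint

size = 10000
d = OrderedDict((i, i) for i in range(size))


def by_list(i):
    # Copies every value into a new list, then indexes it.
    return list(d.values())[i]


def by_islice(i):
    # Walks the values view lazily and stops at position i.
    return next(itertools.islice(d.values(), i, i + 1))


def bench(func, repeats=200):
    # Total seconds for `repeats` lookups at random positions.
    return timeit.timeit(lambda: func(randint(0, size - 1)), number=repeats)


if __name__ == "__main__":
    print("list:   %.6f s" % bench(by_list))
    print("islice: %.6f s" % bench(by_islice))
```

Both functions return the same value for a given position; only the amount of copying differs.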


hig*_*ost 5

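It's a new era, and with Python 3.6.1 dictionaries now retain their order. These semantics aren't explicit because that would require BDFL approval. But Raymond Hettinger is the next best thing (and funnier), and he makes a pretty strong case that dictionaries will be ordered for a very long time.

So now it's easy to create slices of a dictionary: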

test_dict = {
                'first':  1,
                'second': 2,
                'third':  3,
                'fourth': 4
            }

list(test_dict.items())[:2]

Note: Dictionary insertion-order preservation is now official in Python 3.7.
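
If only a small slice of a large dictionary is needed, the intermediate list can be skipped with itertools.islice. A minimal sketch, assuming Python 3.7+ so that a plain dict keeps insertion order:

```python
from itertools import islice

test_dict = {
    'first': 1,
    'second': 2,
    'third': 3,
    'fourth': 4,
}

# Take the first two entries lazily and rebuild a dict from them.
head = dict(islice(test_dict.items(), 2))        # {'first': 1, 'second': 2}

# Entries at positions 1 and 2 (start inclusive, stop exclusive).
middle = dict(islice(test_dict.items(), 1, 3))   # {'second': 2, 'third': 3}
```

Unlike list slicing, islice does not accept negative indexes, so taking entries from the end still needs a list or reversed().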