std*_*ave 2 python performance dictionary
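This isn't premature optimization. My use case does this double check on a dict in the innermost of inner loops, and it runs all the time. It's also intellectually irksome (see the results).
Which of these approaches is faster?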
mydict = { 'hello': 'yes', 'goodbye': 'no' }
key = 'hello'

# (A)
if key in mydict:
    a = mydict[key]
    do_things(a)
else:
    handle_an_error()

# vs (B)
a = mydict.get(key,None)
if a is not None:
    do_things(a)
else:
    handle_an_error()
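One caveat about the comparison itself: (A) and (B) only behave the same when None is never stored as a value, because .get(key, None) cannot tell a missing key from a key mapped to None. A minimal sketch of (B) with a private sentinel, assuming that distinction matters; lookup_in, lookup_get and _MISSING are hypothetical names, not part of the question:

```python
# Hypothetical names for illustration. _MISSING is a unique object that
# callers are assumed never to store in the dict.
_MISSING = object()

def lookup_in(d, key):
    """Approach (A): membership test, then item lookup."""
    if key in d:
        return True, d[key]
    return False, None

def lookup_get(d, key):
    """Approach (B), with a sentinel default instead of None."""
    value = d.get(key, _MISSING)
    if value is not _MISSING:
        return True, value
    return False, None
```

Both helpers return a (found, value) pair, so a key that maps to None is still reported as found.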
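Edit: these are the same speed. Common sense tells me that (B) should be noticeably faster, since it does one dict lookup instead of two, but the results say otherwise. I'm scratching my head.
Benchmark results, averaged over 12 runs, half of which are hits and half misses: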
doing in
switching to get
total time for IN: 0.532250006994
total time for GET: 0.480916659037
times found: 12000000
times not found: 12000000
When a similar run (with 10 times more loops) never finds the key:
doing in
switching to get
total time for IN: 2.35899998744
total time for GET: 4.13858334223
Why!?
(correct) code
# Python 2 (print statement syntax)
import time

smalldict = {}
for i in range(10):
    smalldict[str(i*4)] = str(i*18)
smalldict["8"] = "hello"

bigdict = {}
for i in range(10000):
    bigdict[str(i*100)] = str(i*4123)
bigdict["hello"] = "yes!"

timetotal = 0
totalin = 0
totalget = 0
key = "hello"
found= 0
notfound = 0
ddo = bigdict # change to smalldict for small dict gets

# Even-numbered runs look up a key that exists, odd-numbered runs one that does not.
print 'doing in'
for r in range(12):
    start = time.time()
    a = r % 2
    for i in range(1000000):
        if a == 0:
            # Note: str(key) adds a function call that the get loop below does not make.
            if str(key) in ddo:
                found = found + 1
                foo = ddo[str(key)]
            else:
                notfound = notfound + 1
                foo = "nooo"
        else:
            if 'yo' in ddo:
                found = found + 1
                foo = ddo['yo']
            else:
                notfound = notfound + 1
                foo = "nooo"
    timetotal = timetotal + (time.time() - start)
totalin = timetotal / 12.0

print 'switching to get'
timetotal = 0
for r in range(12):
    start = time.time()
    a = r % 2
    for i in range(1000000):
        if a == 0:
            foo = ddo.get(key,None)
            if foo is not None:
                found = found + 1
            else:
                notfound = notfound + 1
                foo = "nooo"
        else:
            foo = ddo.get('yo',None)
            if foo is not None:
                found = found + 1
            else:
                notfound = notfound + 1
                foo = "oooo"
    timetotal = timetotal + (time.time() - start)
totalget = timetotal / 12

print "total time for IN: ", totalin
print 'total time for GET: ', totalget
print 'times found:', found
print 'times not found:', notfound
(original) code
import time

smalldict = {}
for i in range(10):
    smalldict[str(i*4)] = str(i*18)
smalldict["8"] = "hello"

bigdict = {}
for i in range(10000):
    bigdict[str(i*100)] = str(i*4123)
# "hello" is stored as a value here, not as a key, so key below is never found.
bigdict["8000"] = "hello"

timetotal = 0
totalin = 0
totalget = 0
key = "hello"
found= 0
notfound = 0
ddo = bigdict # change to smalldict for small dict gets

print 'doing in'
for r in range(12):
    start = time.time()
    a = r % 2
    for i in range(10000000):
        if a == 0:
            if key in ddo:
                foo = ddo[key]
            else:
                foo = "nooo"
        else:
            if 'yo' in ddo:
                foo = ddo['yo']
            else:
                foo = "nooo"
    timetotal = timetotal + (time.time() - start)
totalin = timetotal / 12.0

print 'switching to get'
timetotal = 0
for r in range(12):
    start = time.time()
    a = r % 2
    for i in range(10000000):
        if a == 0:
            foo = ddo.get(key,None)
            if foo is not None:
                # yaaay
                pass
            else:
                foo = "nooo"
        else:
            foo = ddo.get('yo',None)
            if foo is not None:
                #yaaay
                pass
            else:
                foo = "oooo"
    timetotal = timetotal + (time.time() - start)
totalget = timetotal / 12

print "total time for IN: ", totalin
print 'total time for GET: ', totalget
We can do some better timings:
import timeit

d = dict.fromkeys(range(10000))

def d_get_has(d):
    return d.get(1)

def d_get_not_has(d):
    return d.get(-1)

def d_in_has(d):
    if 1 in d:
        return d[1]

def d_in_not_has(d):
    if -1 in d:
        return d[-1]

print timeit.timeit('d_get_has(d)', 'from __main__ import d, d_get_has')
print timeit.timeit('d_get_not_has(d)', 'from __main__ import d, d_get_not_has')
print timeit.timeit('d_in_has(d)', 'from __main__ import d, d_in_has')
print timeit.timeit('d_in_not_has(d)', 'from __main__ import d, d_in_not_has')
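On my computer, the "in" variants are faster than the .get variants. This is probably because .get is an attribute lookup on the dict, and an attribute lookup is likely to be about as expensive as a membership test on the dict. Note that "in" and the item lookup dict[x] can be done directly in bytecode, so the normal method lookup can be bypassed...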
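The bytecode difference can be inspected with the standard dis module. The following is a small sketch, assuming CPython 3 (dis.get_instructions is not available in Python 2); with_in, with_get and opnames are names made up for this illustration:

```python
import dis

def with_in(d, key):
    # Membership test plus subscript: both compile directly to bytecode
    # instructions, with no attribute lookup by name.
    if key in d:
        return d[key]
    return None

def with_get(d, key):
    # Attribute lookup of "get" on the dict, followed by a call.
    return d.get(key)

def opnames(func):
    """Return the list of opcode names in func's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]

if __name__ == "__main__":
    # Exact opcode names differ between CPython versions, so no expected
    # output is shown here.
    dis.dis(with_in)
    dis.dis(with_get)
```

In the disassembly, with_get should show an attribute or method load followed by a call instruction, while with_in should show neither.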
It's also worth pointing out that I get a huge speedup if I just use pypy :-):
$ python ~/sandbox/test.py
0.169840812683
0.1732609272
0.122044086456
0.0991759300232
$ pypy ~/sandbox/test.py
0.00974893569946
0.00752687454224
0.00812077522278
0.00686597824097
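One way to probe the attribute-lookup explanation is to time d.get against a copy of the bound method saved in a variable beforehand, so that the lookup of get happens once rather than on every call. A rough sketch, assuming Python 3.5 or later (for the globals argument of timeit.timeit); STATEMENTS, bound_get and time_variants are names invented here, and no timings are shown because they depend on the machine and interpreter:

```python
import timeit

d = dict.fromkeys(range(10000))
bound_get = d.get  # the attribute lookup happens once, here

# Statements to time; each is evaluated with d and bound_get as globals.
STATEMENTS = [
    "d.get(1)",                  # attribute lookup + call on every run
    "bound_get(1)",              # call only
    "d[1] if 1 in d else None",  # membership test + item lookup
]

def time_variants(number=200000):
    """Return {statement: total seconds for `number` executions}."""
    env = {"d": d, "bound_get": bound_get}
    return {
        stmt: timeit.timeit(stmt, globals=env, number=number)
        for stmt in STATEMENTS
    }

if __name__ == "__main__":
    for stmt, seconds in time_variants().items():
        print("%-28s %.4f s" % (stmt, seconds))
```

If bound_get(1) lands close to the "in" variant, that would support the attribute-lookup explanation; if it stays close to d.get(1), the call overhead itself is the more likely cost.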