Tho*_*mas 2 python dictionary key
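While trying to optimize the speed of a program that mimics a tree structure (the "tree" is stored in a dict keyed by Cartesian x,y coordinate pairs), I found that storing each node's unique address in the dictionary as a tuple, rather than as a string, runs considerably faster.
My question is: if Python is optimized for string keys in dictionaries and hashing, why is using tuples so much faster in this example? String keys seem to take about 60% longer to do exactly the same task. Am I overlooking something simple in my example?
I'm basing my question on this thread (along with others that likewise assert strings are faster): Is it always faster to use string as key in a dict?
Below is the code I used to test and time the two approaches: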
import time

def writeTuples():
    k = {}
    for x in range(0,500):
        for y in range(0,x):
            k[(x,y)] = "%s,%s"%(x,y)
    return k

def readTuples(k):
    failures = 0
    for x in range(0,500):
        for y in range(0,x):
            if k.get((x,y)) is not None: pass
            else: failures += 1
    return failures

def writeStrings():
    k = {}
    for x in range(0,500):
        for y in range(0,x):
            k["%s,%s"%(x,y)] = "%s,%s"%(x,y)
    return k

def readStrings(k):
    failures = 0
    for x in range(0,500):
        for y in range(0,x):
            if k.get("%s,%s"%(x,y)) is not None: pass
            else: failures += 1
    return failures

def calcTuples():
    clockTimesWrite = []
    clockTimesRead = []
    failCounter = 0
    trials = 100
    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    st = time.perf_counter()
    for x in range(0,trials):
        startLoop = time.perf_counter()
        k = writeTuples()
        writeTime = time.perf_counter()
        failCounter += readTuples(k)
        readTime = time.perf_counter()
        clockTimesWrite.append(writeTime-startLoop)
        clockTimesRead.append(readTime-writeTime)
    et = time.perf_counter()
    print("The average time to loop with tuple keys is %f, and had %i total failed records"%((et-st)/trials,failCounter))
    print("The average write time is %f, and average read time is %f"%(sum(clockTimesWrite)/trials,sum(clockTimesRead)/trials))
    return None

def calcStrings():
    clockTimesWrite = []
    clockTimesRead = []
    failCounter = 0
    trials = 100
    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    st = time.perf_counter()
    for x in range(0,trials):
        startLoop = time.perf_counter()
        k = writeStrings()
        writeTime = time.perf_counter()
        failCounter += readStrings(k)
        readTime = time.perf_counter()
        clockTimesWrite.append(writeTime-startLoop)
        clockTimesRead.append(readTime-writeTime)
    et = time.perf_counter()
    print("The average time to loop with string keys is %f, and had %i total failed records"%((et-st)/trials,failCounter))
    print("The average write time is %f, and average read time is %f"%(sum(clockTimesWrite)/trials,sum(clockTimesRead)/trials))
    return None

calcTuples()
calcStrings()
Thanks!
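The tests are not weighted fairly, which is where the timing difference comes from. Your writeStrings loop applies the % formatting operator twice per inner iteration (once for the key and once for the value), while writeTuples applies it only once. The gap is wider still in the read loops: readStrings formats a key on every iteration, while readTuples does no formatting at all. For a fairer test, you need to make sure that:
- both write loops make only one call to % per inner iteration;
- readStrings and readTuples both make either one or zero calls to % per inner iteration.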
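One way to meet both conditions is to build every key before timing starts, so that neither key type pays for % formatting inside the measured code. The following is a minimal sketch under that assumption, not the asker's original benchmark: the names write_keys, read_keys and best_time, the constant value True, and the reduced grid size N are all hypothetical choices made for illustration.

```python
import timeit

N = 200  # hypothetical size, smaller than the question's 500 to keep runs short

# Build every key before timing so no "%" formatting runs inside the timed code.
tuple_keys = [(x, y) for x in range(N) for y in range(x)]
string_keys = ["%s,%s" % (x, y) for x in range(N) for y in range(x)]


def write_keys(keys):
    """Insert each prebuilt key with a constant value (no formatting involved)."""
    d = {}
    for key in keys:
        d[key] = True
    return d


def read_keys(d, keys):
    """Look up each prebuilt key and count the misses."""
    failures = 0
    for key in keys:
        if d.get(key) is None:
            failures += 1
    return failures


def best_time(func, *args):
    """Best of 3 runs of 10 calls each, in seconds."""
    return min(timeit.repeat(lambda: func(*args), number=10, repeat=3))


for label, keys in (("tuple", tuple_keys), ("string", string_keys)):
    d = write_keys(keys)
    print("%s keys: write %.4fs, read %.4fs, failures %d" % (
        label, best_time(write_keys, keys), best_time(read_keys, d, keys),
        read_keys(d, keys)))
```

This measures only hashing and dictionary access, not the cost of constructing the keys. In CPython a str object caches its hash after the first use, so the result here can differ noticeably from a benchmark that formats a fresh string on every lookup; which key type comes out ahead may also vary between Python versions.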