Why is there a performance difference depending on the order of nested loops?

Won*_*ket 12 python loops python-2.x

I have a process that loops over two lists, one relatively large and the other considerably smaller.

Example:

larger_list = list(range(15000))
smaller_list = list(range(2500))

for ll in larger_list:
    for sl in smaller_list:
        pass

I scaled down the list sizes to test performance, and I noticed a sizable difference depending on which list is looped over first.

import timeit

larger_list = list(range(150))
smaller_list = list(range(25))


def large_then_small():
    for ll in larger_list:
        for sl in smaller_list:
            pass


def small_then_large():
    for sl in smaller_list:
        for ll in larger_list:
            pass


print('Larger -> Smaller: {}'.format(timeit.timeit(large_then_small)))
print('Smaller -> Larger: {}'.format(timeit.timeit(small_then_large)))

>>> Larger -> Smaller: 114.884992572
>>> Smaller -> Larger: 98.7751009799

At first glance the two functions look identical, yet there is a 16-second difference between them.

Why is that?

Ger*_*rat 13

When you disassemble one of your functions, you get:

>>> import dis
>>> dis.dis(small_then_large)
  2           0 SETUP_LOOP              31 (to 34)
              3 LOAD_GLOBAL              0 (smaller_list)
              6 GET_ITER
        >>    7 FOR_ITER                23 (to 33)
             10 STORE_FAST               0 (sl)

  3          13 SETUP_LOOP              14 (to 30)
             16 LOAD_GLOBAL              1 (larger_list)
             19 GET_ITER
        >>   20 FOR_ITER                 6 (to 29)
             23 STORE_FAST               1 (ll)

  4          26 JUMP_ABSOLUTE           20
        >>   29 POP_BLOCK
        >>   30 JUMP_ABSOLUTE            7
        >>   33 POP_BLOCK
        >>   34 LOAD_CONST               0 (None)
             37 RETURN_VALUE
>>>

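Looking at addresses 29 and 30, it appears that these are executed every time the inner loop ends. The two loops look basically the same, and both orders run the inner loop body the same total number of times, but these two instructions are executed once each time the inner loop exits. The inner loop's setup instructions (addresses 13 to 19) likewise run once per outer iteration. Putting the smaller list on the inner loop means the inner loop starts and exits more often, so this per-loop overhead is paid more times, which increases the total time (compared to having the larger list on the inner loop).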
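To make the counting argument concrete, here is a minimal sketch that tallies how often the inner loop is entered and exited for each order, using the same list sizes as the question. The helper `count_loop_events` is a hypothetical name introduced only for this illustration. It counts events and does not measure time.

```python
def count_loop_events(outer, inner):
    """Return (inner_loop_runs, body_iterations) for a nested loop.

    Hypothetical helper for illustration only: it counts how many times
    the inner loop is set up and torn down, and how many times its body runs.
    """
    inner_loop_runs = 0
    body_iterations = 0
    for _ in outer:
        # The inner loop is set up once here and exits once per outer item.
        inner_loop_runs += 1
        for _ in inner:
            body_iterations += 1
    return inner_loop_runs, body_iterations


larger_list = list(range(150))
smaller_list = list(range(25))

large_then_small = count_loop_events(larger_list, smaller_list)
small_then_large = count_loop_events(smaller_list, larger_list)

print('Larger -> Smaller: {}'.format(large_then_small))  # (150, 3750)
print('Smaller -> Larger: {}'.format(small_then_large))  # (25, 3750)
```

Both orders execute the body 3750 times, but the larger-first order enters and exits the inner loop 150 times instead of 25. Note also that `timeit.timeit` defaults to `number=1000000`, so the totals in the question cover a million calls. The gap of about 16 seconds is therefore roughly 16 microseconds per call. If that gap were attributed entirely to the 125 extra inner-loop runs, it would work out to about 0.13 microseconds each.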