Kel*_*ndy 62 python performance cpython python-internals python-3.11
Called with n = 10**8, the simple loop is consistently much slower for me than the complex one, and I don't see why:
def simple(n):
    while n:
        n -= 1

def complex(n):
    while True:
        if not n:
            break
        n -= 1
Some times in seconds are listed with the benchmark output further down.
Here's the looping part of the bytecode, as shown by dis.dis(simple):
  6     >>    6 LOAD_FAST                0 (n)
              8 LOAD_CONST               1 (1)
             10 BINARY_OP               23 (-=)
             14 STORE_FAST               0 (n)

  5          16 LOAD_FAST                0 (n)
             18 POP_JUMP_BACKWARD_IF_TRUE     7 (to 6)
And for complex:
 10     >>    4 LOAD_FAST                0 (n)
              6 POP_JUMP_FORWARD_IF_TRUE     2 (to 12)

 11           8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

 12     >>   12 LOAD_FAST                0 (n)
             14 LOAD_CONST               2 (1)
             16 BINARY_OP               23 (-=)
             20 STORE_FAST               0 (n)

  9          22 JUMP_BACKWARD           10 (to 4)
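So it looks like the complex one does more work per iteration (two jumps instead of one). Then why is it faster?
This seems to be a Python 3.11 phenomenon; see the comments.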
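The "two jumps instead of one" observation can be checked without reading the listings by hand. The following is a minimal sketch, not part of the original question: the helper name `jump_opnames` is made up for illustration, and it lists every jump instruction in the whole function, not only those inside the loop. Opcode names differ between CPython versions, so the printed names only match the listings above on 3.11.

```python
import dis


def simple(n):
    while n:
        n -= 1


def complex(n):  # shadows the built-in complex, as in the question
    while True:
        if not n:
            break
        n -= 1


def jump_opnames(func):
    """Return the names of all jump instructions in func's bytecode.

    Hypothetical helper for illustration; it simply filters the
    instruction stream on the substring "JUMP".
    """
    return [ins.opname for ins in dis.get_instructions(func) if "JUMP" in ins.opname]


for func in (simple, complex):
    print(func.__name__, jump_opnames(func))
```

On 3.11 the interesting difference is which instruction closes the loop: a conditional backward jump for simple, an unconditional JUMP_BACKWARD for complex.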
Benchmark script (Try it online!), with its output:
simple 4.340795516967773
complex 3.6490490436553955
simple 4.374553918838501
complex 3.639145851135254
simple 4.336690425872803
complex 3.624480724334717
Python: 3.11.4 (main, Sep 9 2023, 15:09:21) [GCC 13.2.1 20230801]
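The benchmark script itself is not reproduced here. As a rough sketch of how timings in this shape could be collected (this is an assumption about the approach, not the author's script; `time_once` and `benchmark` are illustrative names), each function is called in a fresh timed run, alternating between the two:

```python
import sys
from time import perf_counter


def simple(n):
    while n:
        n -= 1


def complex(n):  # shadows the built-in complex, as in the question
    while True:
        if not n:
            break
        n -= 1


def time_once(func, n):
    """Return the wall-clock seconds taken by a single call func(n)."""
    start = perf_counter()
    func(n)
    return perf_counter() - start


def benchmark(n, rounds=3):
    """Time simple and complex alternately; return (name, seconds) pairs."""
    results = []
    for _ in range(rounds):
        for func in (simple, complex):
            results.append((func.__name__, time_once(func, n)))
    return results


def report(n, rounds=3):
    """Print the timings followed by the interpreter version."""
    for name, seconds in benchmark(n, rounds):
        print(name, seconds)
    print("Python:", sys.version)
```

Calling `report(10**8)` would match the workload in the question; a smaller n such as `10**6` finishes quickly but makes the difference harder to see.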
Mec*_*Pig 64
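I checked the CPython source for the bytecode interpreter (Python 3.11.6) and found that, of the instructions in the disassembled bytecode, only JUMP_BACKWARD appears to run a warmup function, which triggers specialization in Python 3.11 once it has been executed enough times: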
PyObject* _Py_HOT_FUNCTION
_PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int throwflag)
{
    /* ... */
        TARGET(JUMP_BACKWARD) {
            _PyCode_Warmup(frame->f_code);
            JUMP_TO_INSTRUCTION(JUMP_BACKWARD_QUICK);
        }
    /* ... */
}
static inline void
_PyCode_Warmup(PyCodeObject *code)
{
    if (code->co_warmup != 0) {
        code->co_warmup++;
        if (code->co_warmup == 0) {
            _PyCode_Quicken(code);
        }
    }
}
Among all bytecodes, only JUMP_BACKWARD and RESUME call _PyCode_Warmup().
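To see why this matters for the two loops, here is a toy Python model of the counter logic in `_PyCode_Warmup`. It is only a sketch: `ToyCode`, `run_simple`, and `run_complex` are invented for illustration, and the starting value of -8 is an assumption about 3.11's initial `co_warmup`, not something shown in the excerpts above.

```python
WARMUP_START = -8  # assumed initial co_warmup value in CPython 3.11


class ToyCode:
    """Toy stand-in for a code object; tracks only the warmup counter."""

    def __init__(self, name):
        self.name = name
        self.co_warmup = WARMUP_START
        self.quickened = False

    def warmup(self):
        # Same shape as _PyCode_Warmup: count up to zero, then quicken once.
        if self.co_warmup != 0:
            self.co_warmup += 1
            if self.co_warmup == 0:
                self.quickened = True


def run_simple(code, n):
    code.warmup()  # RESUME at function entry
    while n:
        n -= 1     # loop closes with POP_JUMP_BACKWARD_IF_TRUE: no warmup


def run_complex(code, n):
    code.warmup()  # RESUME at function entry
    while True:
        if not n:
            break
        n -= 1
        code.warmup()  # JUMP_BACKWARD at the end of every iteration


simple_code = ToyCode("simple")
complex_code = ToyCode("complex")
run_simple(simple_code, 100)
run_complex(complex_code, 100)
print(simple_code.name, simple_code.quickened)
print(complex_code.name, complex_code.quickened)
```

Under this model a single call to simple advances the counter only once (via RESUME), however many iterations it runs, while complex reaches zero after a handful of iterations of its first call. The model also suggests that simple would be quickened after enough separate calls.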
Specialization appears to speed up several commonly used bytecodes, which results in a significant overall speedup:
void
_PyCode_Quicken(PyCodeObject *code)
{
    /* ... */
    switch (opcode) {
        case EXTENDED_ARG:  /* ... */
        case JUMP_BACKWARD: /* ... */
        case RESUME:        /* ... */
        case LOAD_FAST:     /* ... */
        case STORE_FAST:    /* ... */
        case LOAD_CONST:    /* ... */
    }
    /* ... */
}
After a single run, the bytecode of complex has changed, while that of simple has not:
In [_]: %timeit -n 1 -r 1 complex(10 ** 8)
2.7 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [_]: dis(complex, adaptive=True)
  5           0 RESUME_QUICK             0

  6           2 NOP

  7           4 LOAD_FAST                0 (n)
              6 POP_JUMP_FORWARD_IF_TRUE     2 (to 12)

  8           8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

  9     >>   12 LOAD_FAST__LOAD_CONST     0 (n)
             14 LOAD_CONST               2 (1)
             16 BINARY_OP_SUBTRACT_INT    23 (-=)
             20 STORE_FAST               0 (n)

  6          22 JUMP_BACKWARD_QUICK     10 (to 4)
In [_]: %timeit -n 1 -r 1 simple(10 ** 8)
4.78 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)

In [_]: dis(simple, adaptive=True)
  1           0 RESUME                   0

  2           2 LOAD_FAST                0 (n)
              4 POP_JUMP_FORWARD_IF_FALSE     9 (to 24)

  3     >>    6 LOAD_FAST                0 (n)
              8 LOAD_CONST               1 (1)
             10 BINARY_OP               23 (-=)
             14 STORE_FAST               0 (n)

  2          16 LOAD_FAST                0 (n)
             18 POP_JUMP_BACKWARD_IF_TRUE     7 (to 6)
             20 LOAD_CONST               0 (None)
             22 RETURN_VALUE
        >>   24 LOAD_CONST               0 (None)
             26 RETURN_VALUE
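The same before/after comparison can be scripted outside IPython. The sketch below is illustrative only: `adaptive_opnames` and `changed_by_running` are made-up helper names. It relies on the `adaptive` keyword of `dis.get_instructions`, which exists from Python 3.11 onward, and falls back to the plain listing on older versions. What it reports depends on the interpreter version.

```python
import dis
import sys


def simple(n):
    while n:
        n -= 1


def complex(n):  # shadows the built-in complex, as in the question
    while True:
        if not n:
            break
        n -= 1


def adaptive_opnames(func):
    """Return opcode names, specialized forms included where supported."""
    if sys.version_info >= (3, 11):
        instructions = dis.get_instructions(func, adaptive=True)
    else:
        instructions = dis.get_instructions(func)  # no adaptive view before 3.11
    return [ins.opname for ins in instructions]


def changed_by_running(func, n=10_000):
    """Call func(n) once and return (before, after) pairs of opcodes that differ."""
    before = adaptive_opnames(func)
    func(n)
    after = adaptive_opnames(func)
    return [(b, a) for b, a in zip(before, after) if b != a]


for func in (complex, simple):
    print(func.__name__, changed_by_running(func))
```

On 3.11, going by the listings above, one would expect entries such as RESUME becoming RESUME_QUICK for complex and an empty list for simple.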