How can I modify bytecode and then recompile that code so that I can use it as a function in Python? Here is what I have been trying:
    from dis import dis

    a = """
    def fact():
        a = 8
        a = 0
    """
    c = compile(a, '<string>', 'exec')
    w = c.co_consts[0].co_code  # raw bytecode string of fact()
    dis(w)
This disassembles to:
     0 LOAD_CONST          1 (1)
     3 STORE_FAST          1 (1)
     6 LOAD_CONST          2 (2)
     9 STORE_FAST          1 (1)
    12 LOAD_CONST          0 (0)
    15 RETURN_VALUE
Suppose I want to get rid of the instructions at offsets 0 and 3. I call:
    x = c.co_consts[0].co_code[6:16]
    dis(x)
which results in:
     0 LOAD_CONST          2 (2)
     3 STORE_FAST          1 (1)
     6 LOAD_CONST          0 (0)
     9 RETURN_VALUE
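My problem is what to do with x. If I try exec x, I get "expected string without null bytes", and I get the same for exec w. Trying to compile x results in: compile() expected string without null bytes.
I'm not sure what the best way to proceed is. I probably need to create some kind of code object, but I don't know how. I assume it must be possible, given tools such as byteplay and the various Python assemblers.
I'm using Python 2.7.10, but I'd like the solution to be future compatible (e.g. with Python 3) if possible.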
Answer by rocky (score: 11):
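Update: for various reasons I have started writing a cross-Python-version assembler. See https://github.com/rocky/python-xasm. It is still in early beta.
As far as I know, there is no other currently maintained Python assembler. PEAK's BytecodeAssembler was developed for Python 2.6 and later modified to support early Python 2.7.
Judging from its documentation it is pretty cool, but it relies on other PEAK libraries, which might be problematic.
I'll go through the whole example to give you a feel for what you'd have to do. It isn't pretty, but that is to be expected.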
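Basically, after modifying the bytecode you need to create a new types.CodeType object. You need a new one because many of the attributes of a code object cannot be changed, and for good reason: the interpreter may have cached some of their values, for example.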
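To make the "you need a new one" point concrete, here is a minimal sketch, assuming Python 3.8 or later, where code objects have a replace() method that copies the object with selected fields changed. The constants 8000 and 1000 are stand-ins chosen for this illustration rather than the values from the question, because some newer interpreters may encode small integers directly in the instruction stream instead of in co_consts.

```python
import types

SOURCE = """
def fact():
    a = 8000
    a = 1000
    return a
"""

module_code = compile(SOURCE, "<string>", "exec")
# Locate the nested function's code object instead of assuming its index.
fn_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

# Code object attributes are read-only, so an in-place edit is rejected.
try:
    fn_code.co_consts = ()
    assignment_rejected = False
except (AttributeError, TypeError):
    assignment_rejected = True

# Build a modified copy instead: swap the constant 1000 for 10.
new_consts = tuple(10 if c == 1000 else c for c in fn_code.co_consts)
patched_code = fn_code.replace(co_consts=new_consts)  # Python 3.8+ only

original = types.FunctionType(fn_code, {})
patched = types.FunctionType(patched_code, {})
print(assignment_rejected, original(), patched())
```

On Python 2.7 there is no replace(), which is why the full types.CodeType(...) constructor call is needed there.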
Once you have created the code object, you can wrap it in a function, or pass it to exec or eval.
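As a rough illustration of those two routes, the sketch below (written for Python 3) runs a module-level code object with exec and, separately, wraps the function's own code object with types.FunctionType. The variable names are made up for this example.

```python
import types

SOURCE = """
def fact():
    a = 8
    a = 0
    return a
"""
module_code = compile(SOURCE, "<string>", "exec")

# Route 1: exec the module-level code; this defines `fact` in the namespace.
namespace = {}
exec(module_code, namespace)
result_via_exec = namespace["fact"]()

# Route 2: wrap the function's code object directly in a new function.
fn_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))
fact = types.FunctionType(fn_code, {}, "fact")
result_via_function_type = fact()

print(result_via_exec, result_via_function_type)
```

The second route is usually the more convenient one after patching, because it does not require rebuilding the enclosing module code object.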
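Or you can write it to a bytecode file. Alas, the code format changed between Python 2 and Python 3. And, by the way, so did the optimizations and the bytecode itself. In fact, as of Python 3.6 instructions are wordcode rather than bytecode.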
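The wordcode change can be observed directly. This small sketch, assuming Python 3.6 or later, lists the instructions of a function and checks that every instruction starts on an even offset. The exact opcodes and offsets printed vary between interpreter versions, which is also why hard-coded slices of co_code are tied to one version.

```python
import dis


def fact():
    a = 8
    a = 0
    return a


instructions = list(dis.get_instructions(fact))
for ins in instructions:
    print(ins.offset, ins.opname, ins.argrepr)

# With wordcode, each instruction occupies a multiple of two bytes.
all_offsets_even = all(ins.offset % 2 == 0 for ins in instructions)
code_length_even = len(fact.__code__.co_code) % 2 == 0
```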
So here is what you'd have to do for your example:
    # Python 2.7: the byte offsets and the CodeType signature are version-specific.
    import types
    from dis import dis

    a = """
    def fact():
        a = 8
        a = 0
        return a
    """
    c = compile(a, '<string>', 'exec')
    fn_code = c.co_consts[0]  # Pick up the function code from the main code
    dis(fn_code)
    print("=" * 30)

    x = fn_code.co_code[6:16]  # modify bytecode: drop the first two instructions
    opt_fn_code = types.CodeType(fn_code.co_argcount,
                                 # fn_code.co_kwonlyargcount,  Add this in Python3
                                 fn_code.co_nlocals,
                                 fn_code.co_stacksize,
                                 fn_code.co_flags,
                                 x,  # fn_code.co_code: this you changed
                                 fn_code.co_consts,
                                 fn_code.co_names,
                                 fn_code.co_varnames,
                                 fn_code.co_filename,
                                 fn_code.co_name,
                                 fn_code.co_firstlineno,
                                 fn_code.co_lnotab,  # In general, you should adjust this
                                 fn_code.co_freevars,
                                 fn_code.co_cellvars)
    dis(opt_fn_code)
    print("=" * 30)
    print("Result is", eval(opt_fn_code))

    # Now let's change the value of what's returned
    co_consts = list(opt_fn_code.co_consts)
    co_consts[-1] = 10
    opt_fn_code = types.CodeType(fn_code.co_argcount,
                                 # fn_code.co_kwonlyargcount,  Add this in Python3
                                 fn_code.co_nlocals,
                                 fn_code.co_stacksize,
                                 fn_code.co_flags,
                                 x,  # fn_code.co_code: this you changed
                                 tuple(co_consts),  # this is now changed too
                                 fn_code.co_names,
                                 fn_code.co_varnames,
                                 fn_code.co_filename,
                                 fn_code.co_name,
                                 fn_code.co_firstlineno,
                                 fn_code.co_lnotab,  # In general, you should adjust this
                                 fn_code.co_freevars,
                                 fn_code.co_cellvars)
    dis(opt_fn_code)
    print("=" * 30)
    print("Result is now", eval(opt_fn_code))
When I ran this, here is what I got:
      3           0 LOAD_CONST               1 (8)
                  3 STORE_FAST               0 (a)
      4           6 LOAD_CONST               2 (0)
                  9 STORE_FAST               0 (a)
      5          12 LOAD_FAST                0 (a)
                 15 RETURN_VALUE
    ==============================
      3           0 LOAD_CONST               2 (0)
                  3 STORE_FAST               0 (a)
      4           6 LOAD_FAST                0 (a)
                  9 RETURN_VALUE
    ==============================
    ('Result is', 0)
      3           0 LOAD_CONST               2 (10)
                  3 STORE_FAST               0 (a)
      4           6 LOAD_FAST                0 (a)
                  9 RETURN_VALUE
    ==============================
    ('Result is now', 10)
Notice that the line numbers haven't changed even though I removed a couple of instructions. That is because I didn't update fn_code.co_lnotab.
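To see the offset-to-line mapping that would need adjusting, the following sketch (Python 3) uses dis.findlinestarts, which decodes the line table for you. It only inspects the table and does not rebuild it; entries without a line number, which some newer interpreters can report, are skipped.

```python
import dis
import types

SOURCE = """
def fact():
    a = 8
    a = 0
    return a
"""
module_code = compile(SOURCE, "<string>", "exec")
fn_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

# Each pair is (bytecode offset where a source line starts, source line number).
line_starts = [(offset, line)
               for offset, line in dis.findlinestarts(fn_code)
               if line is not None]
for offset, line in line_starts:
    print("offset", offset, "-> line", line)

lines_seen = {line for _, line in line_starts}
```

If instructions are deleted from the front of co_code, every offset in this table shifts, so a stale table attributes the remaining instructions to the wrong source lines.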
If you now want to write a Python bytecode file, here is what you'd do:
    import marshal
    import time
    from struct import pack

    # Rebuild the module code object so that it holds the modified function code.
    co_consts = list(c.co_consts)
    co_consts[0] = opt_fn_code
    c1 = types.CodeType(c.co_argcount,
                        # c.co_kwonlyargcount,  Add this in Python3
                        c.co_nlocals,
                        c.co_stacksize,
                        c.co_flags,
                        c.co_code,
                        tuple(co_consts),
                        c.co_names,
                        c.co_varnames,
                        c.co_filename,
                        c.co_name,
                        c.co_firstlineno,
                        c.co_lnotab,  # In general, you should adjust this
                        c.co_freevars,
                        c.co_cellvars)

    # Open in binary mode: a .pyc file holds raw bytes, not text.
    with open('/tmp/testing.pyc', 'wb') as fp:
        fp.write(pack('Hcc', 62211, '\r', '\n'))  # Python 2.7 magic number
        fp.write(pack('I', int(time.time())))
        # In Python 3 you need to write out the size mod 2**32 here
        fp.write(marshal.dumps(c1))
To simplify writing the boilerplate bytecode above, I've added a routine to xdis called write_python_file().
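For comparison, here is a hedged sketch of the same idea on Python 3.7 or later, where the .pyc header is 16 bytes: magic number, a flags word, the source timestamp, and the source size. write_pyc is a hypothetical helper written for this example, not part of xdis or the standard library, and the file is read back with marshal only to show that the round trip works.

```python
import importlib.util
import marshal
import os
import struct
import tempfile
import time

SOURCE = """
def fact():
    a = 8
    a = 0
    return a
"""
module_code = compile(SOURCE, "<string>", "exec")


def write_pyc(code, path, source_size=0):
    """Hypothetical helper: write a timestamp-based .pyc (Python 3.7+ layout)."""
    with open(path, "wb") as fp:
        fp.write(importlib.util.MAGIC_NUMBER)  # 4 bytes, interpreter-specific
        fp.write(struct.pack("<I", 0))  # flags: 0 means timestamp-based
        fp.write(struct.pack("<I", int(time.time()) & 0xFFFFFFFF))
        fp.write(struct.pack("<I", source_size & 0xFFFFFFFF))  # size mod 2**32
        fp.write(marshal.dumps(code))


pyc_path = os.path.join(tempfile.mkdtemp(), "testing.pyc")
write_pyc(module_code, pyc_path)

# Read the file back: skip the 16-byte header, then unmarshal the code object.
with open(pyc_path, "rb") as fp:
    header = fp.read(16)
    loaded_code = marshal.loads(fp.read())

namespace = {}
exec(loaded_code, namespace)
print(namespace["fact"]())
```

A file written this way is only loadable by the interpreter version whose magic number it carries, since both the header and the marshalled bytecode are version-specific.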
Now to check the results:
    $ uncompyle6 /tmp/testing.pyc
    # uncompyle6 version 2.9.2
    # Python bytecode 2.7 (62211)
    # Disassembled from: Python 2.7.12 (default, Jul 26 2016, 22:53:31)
    # [GCC 5.4.0 20160609]
    # Embedded file name: <string>
    # Compiled at: 2016-10-18 05:52:13
    def fact():
        a = 0
    # okay decompiling /tmp/testing.pyc
    $ pydisasm /tmp/testing.pyc
    # pydisasm version 3.1.0
    # Python bytecode 2.7 (62211) disassembled from Python 2.7
    # Timestamp in code: 2016-10-18 05:52:13
    # Method Name: <module>
    # Filename: <string>
    # Argument count: 0
    # Number of locals: 0
    # Stack size: 1
    # Flags: 0x00000040 (NOFREE)
    # Constants:
    # 0: <code object fact at 0x7f815843e4b0, file "<string>", line 2>
    # 1: None
    # Names:
    # 0: fact
      2           0 LOAD_CONST               0 (<code object fact at 0x7f815843e4b0, file "<string>", line 2>)
                  3 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (fact)
                  9 LOAD_CONST               1 (None)
                 12 RETURN_VALUE
    # Method Name: fact
    # Filename: <string>
    # Argument count: 0
    # Number of locals: 1
    # Stack size: 1
    # Flags: 0x00000043 (NOFREE | NEWLOCALS | OPTIMIZED)
    # Constants:
    # 0: None
    # 1: 8
    # 2: 10
    # Local variables:
    # 0: a
      3           0 LOAD_CONST               2 (10)
                  3 STORE_FAST               0 (a)
      4           6 LOAD_CONST               0 (None)
                  9 RETURN_VALUE
    $
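Another way to optimize is at the abstract syntax tree (AST) level. I don't know how you'd generate a bytecode file from an AST, so I suppose you would write the result back out as Python source, if that is possible.
Note, however, that some kinds of optimization, such as tail-recursion elimination, can leave bytecode in a form that cannot be transformed back into source code in a truly faithful way. See my PyCon 2018 Colombia lightning talk, a video I made on eliminating tail recursion in bytecode, to get an idea of what I'm talking about.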
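For the AST route, one possibility is that the built-in compile() accepts an AST as well as source text, so a transformed tree can be turned into a code object without going through source. The sketch below assumes Python 3.8 or later (where literals parse to ast.Constant). ReplaceConstant is a hypothetical transformer written for this example: it rewrites the integer constant 0 to 10, mirroring the constant change made at the bytecode level earlier.

```python
import ast

SOURCE = """
def fact():
    a = 8
    a = 0
    return a
"""


class ReplaceConstant(ast.NodeTransformer):
    """Hypothetical transformer: rewrite the integer constant 0 to 10."""

    def visit_Constant(self, node):
        # Check the exact type so that False (a bool) is left alone.
        if type(node.value) is int and node.value == 0:
            return ast.copy_location(ast.Constant(value=10), node)
        return node


tree = ast.parse(SOURCE)
tree = ast.fix_missing_locations(ReplaceConstant().visit(tree))

# compile() accepts the transformed tree directly.
module_code = compile(tree, "<ast>", "exec")
namespace = {}
exec(module_code, namespace)
print(namespace["fact"]())
```

The resulting module_code is an ordinary code object, so it can be marshalled into a bytecode file like any other. On Python 3.9 and later, ast.unparse(tree) is another option if Python source is the desired output.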