32.12. dis — Disassembler for Python bytecode

Source code: Lib/dis.py


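The dis module supports the analysis of CPython bytecode by disassembling it. The CPython bytecode which this module takes as input is defined in the file Include/opcode.h and is used by the compiler and the interpreter.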

CPython implementation detail: Bytecode is an implementation detail of the CPython interpreter! No guarantees are made that bytecode will not be added, removed, or changed between versions of Python. Use of this module should not be considered to work across Python VMs or Python releases.

Example: Given the function myfunc():

def myfunc(alist):
    return len(alist)

the following command can be used to get the disassembly of myfunc():

>>> dis.dis(myfunc)
  2           0 LOAD_GLOBAL              0 (len)
              3 LOAD_FAST                0 (alist)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE

(The “2” is a line number).
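For readers who want to reproduce the listing, the following is a minimal sketch. It assumes a CPython 3 interpreter (3.4 or later, for contextlib.redirect_stdout), so the opcode names, arguments and offsets it prints will differ from the 2.7 listing above. The names buffer and listing are illustrative only.

```python
import contextlib
import dis
import io


def myfunc(alist):
    return len(alist)


# Capture the listing instead of letting dis.dis() print it directly,
# so that it can be inspected as a string.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    dis.dis(myfunc)

listing = buffer.getvalue()
print(listing)
```

The exact text depends on the interpreter release, which is the point made by the implementation-detail note above.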

The dis module defines the following functions and constants:

dis.dis([bytesource])

Disassemble the bytesource object. bytesource can denote either a module, a class, a method, a function, or a code object. For a module, it disassembles all functions. For a class, it disassembles all methods. For a single code sequence, it prints one line per bytecode instruction. If no object is provided, it disassembles the last traceback.

dis.distb([tb])

Disassembles the top-of-stack function of a traceback, using the last traceback if none was passed. The instruction causing the exception is indicated.

dis.disassemble(code[, lasti])

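Disassembles a code object, indicating the last instruction if lasti was provided. The output is divided into the following columns:

  1. the line number, for the first instruction of each line,

  2. the current instruction, indicated as -->,

  3. a labelled instruction, indicated with >>,

  4. the address of the instruction,

  5. the operation code name,

  6. operation parameters, and

  7. interpretation of the parameters in parentheses.

The parameter interpretation recognizes local and global variable names, constant values, branch targets, and compare operators.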

dis.disco(code[, lasti])

A synonym for disassemble(). It is more convenient to type, and kept for compatibility with earlier Python releases.

dis.findlinestarts(code)

This generator function uses the co_firstlineno and co_lnotab attributes of the code object code to find the offsets which are starts of lines in the source code. They are generated as (offset, lineno) pairs.
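As an illustration, the sketch below compiles a three-line snippet and lists the (offset, lineno) pairs. The source string and the file name "<example>" are made up for the example. Recent CPython 3 releases may also report extra entries, such as a line 0 or None for bookkeeping instructions, so only the presence of lines 1 to 3 should be relied on.

```python
import dis

source = "a = 1\nb = 2\nc = a + b\n"
code = compile(source, "<example>", "exec")

# Each pair is (bytecode offset, source line number).
line_starts = list(dis.findlinestarts(code))
for offset, lineno in line_starts:
    print(offset, lineno)
```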

dis.findlabels(code)

Detect all offsets in the code object code which are jump targets, and return a list of these offsets.

dis.opname

Sequence of operation names, indexable using the bytecode.

dis.opmap

Dictionary mapping operation names to bytecodes.
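The two tables are inverses of each other for every named instruction. A small sketch follows; the numeric value printed depends on the interpreter release, so only the round trip is meaningful.

```python
import dis

# Name -> number, then number -> name.
number = dis.opmap["RETURN_VALUE"]
name = dis.opname[number]

print(number, name)
```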

dis.cmp_op

Sequence of all compare operation names.

dis.hasconst

Sequence of bytecodes that access a constant.

dis.hasfree

Sequence of bytecodes that access a free variable.

dis.hasname

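Sequence of bytecodes that access an attribute by name.

dis.hasjrel

Sequence of bytecodes that have a relative jump target.

dis.hasjabs

Sequence of bytecodes that have an absolute jump target.

dis.haslocal

Sequence of bytecodes that access a local variable.

dis.hascompare

Sequence of bytecodes of Boolean operations.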

32.12.1. Python Bytecode Instructions

The Python compiler currently generates the following bytecode instructions.

STOP_CODE()

Indicates end-of-code to the compiler, not used by the interpreter.

NOP()

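Do nothing code. Used as a placeholder by the bytecode optimizer.

POP_TOP()

Removes the top-of-stack (TOS) item.

ROT_TWO()

Swaps the two top-most stack items.

ROT_THREE()

Lifts the second and third stack items one position up, and moves the top item down to position three.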

ROT_FOUR()

Lifts the second, third and fourth stack items one position up, and moves the top item down to position four.
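The rotation opcodes can be pictured with a plain Python list whose last element plays the role of TOS. This is only a model of the stack effect, not how the interpreter is implemented; the function names are hypothetical.

```python
def rot_two(stack):
    # Swap the two top-most items.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def rot_three(stack):
    # Old TOS moves down to position three; the two items below move up.
    stack[-3:] = [stack[-1], stack[-3], stack[-2]]


def rot_four(stack):
    # Old TOS moves down to position four; the three items below move up.
    stack[-4:] = [stack[-1], stack[-4], stack[-3], stack[-2]]


two = [1, 2, 3, 4]      # 4 is TOS
rot_two(two)            # two   == [1, 2, 4, 3]

three = [1, 2, 3, 4]
rot_three(three)        # three == [1, 4, 2, 3]

four = [1, 2, 3, 4]
rot_four(four)          # four  == [4, 1, 2, 3]
```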

DUP_TOP()

Duplicates the reference on top of the stack.

Unary Operations take the top of the stack, apply the operation, and push the result back on the stack.

UNARY_POSITIVE()

Implements TOS = +TOS.

UNARY_NEGATIVE()

Implements TOS = -TOS.

UNARY_NOT()

Implements TOS = not TOS.

UNARY_CONVERT()

Implements TOS = `TOS`.

UNARY_INVERT()

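Implements TOS = ~TOS.

GET_ITER()

Implements TOS = iter(TOS).

Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.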

BINARY_POWER()

Implements TOS = TOS1 ** TOS.

BINARY_MULTIPLY()

Implements TOS = TOS1 * TOS.

BINARY_DIVIDE()

Implements TOS = TOS1 / TOS when from __future__ import division is not in effect.

BINARY_FLOOR_DIVIDE()

Implements TOS = TOS1 // TOS.

BINARY_TRUE_DIVIDE()

Implements TOS = TOS1 / TOS when from __future__ import division is in effect.

BINARY_MODULO()

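Implements TOS = TOS1 % TOS.

BINARY_ADD()

Implements TOS = TOS1 + TOS.

BINARY_SUBTRACT()

Implements TOS = TOS1 - TOS.

BINARY_SUBSCR()

Implements TOS = TOS1[TOS].

BINARY_LSHIFT()

Implements TOS = TOS1 << TOS.

BINARY_RSHIFT()

Implements TOS = TOS1 >> TOS.

BINARY_AND()

Implements TOS = TOS1 & TOS.

BINARY_XOR()

Implements TOS = TOS1 ^ TOS.

BINARY_OR()

Implements TOS = TOS1 | TOS.

In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.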
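The remark that the result may or may not be the original TOS1 can be observed from Python code. In this sketch a list supports in-place addition and keeps its identity, while a tuple does not and a new object is produced. The variable names are illustrative.

```python
items = [1, 2]
items_before = items
items += [3]                 # list implements in-place addition
items_same = items is items_before

pair = (1, 2)
pair_before = pair
pair += (3,)                 # tuple falls back to ordinary addition
pair_same = pair is pair_before

print(items_same, pair_same)
```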

INPLACE_POWER()

Implements in-place TOS = TOS1 ** TOS.

INPLACE_MULTIPLY()

Implements in-place TOS = TOS1 * TOS.

INPLACE_DIVIDE()

Implements in-place TOS = TOS1 / TOS when from __future__ import division is not in effect.

INPLACE_FLOOR_DIVIDE()

Implements in-place TOS = TOS1 // TOS.

INPLACE_TRUE_DIVIDE()

Implements in-place TOS = TOS1 / TOS when from __future__ import division is in effect.

INPLACE_MODULO()

Implements in-place TOS = TOS1 % TOS.

INPLACE_ADD()

Implements in-place TOS = TOS1 + TOS.

INPLACE_SUBTRACT()

Implements in-place TOS = TOS1 - TOS.

INPLACE_LSHIFT()

Implements in-place TOS = TOS1 << TOS.

INPLACE_RSHIFT()

Implements in-place TOS = TOS1 >> TOS.

INPLACE_AND()

Implements in-place TOS = TOS1 & TOS.

INPLACE_XOR()

Implements in-place TOS = TOS1 ^ TOS.

INPLACE_OR()

Implements in-place TOS = TOS1 | TOS.

The slice opcodes take up to three parameters.

SLICE+0()

Implements TOS = TOS[:].

SLICE+1()

Implements TOS = TOS1[TOS:].

SLICE+2()

Implements TOS = TOS1[:TOS].

SLICE+3()

Implements TOS = TOS2[TOS1:TOS].

Slice assignment needs one additional parameter, the value to assign. Like any statement, these opcodes put nothing on the stack.

STORE_SLICE+0()

Implements TOS[:] = TOS1.

STORE_SLICE+1()

Implements TOS1[TOS:] = TOS2.

STORE_SLICE+2()

Implements TOS1[:TOS] = TOS2.

STORE_SLICE+3()

Implements TOS2[TOS1:TOS] = TOS3.

DELETE_SLICE+0()

Implements del TOS[:].

DELETE_SLICE+1()

Implements del TOS1[TOS:].

DELETE_SLICE+2()

Implements del TOS1[:TOS].

DELETE_SLICE+3()

Implements del TOS2[TOS1:TOS].

STORE_SUBSCR()

Implements TOS1[TOS] = TOS2.

DELETE_SUBSCR()

Implements del TOS1[TOS].

Miscellaneous opcodes.

PRINT_EXPR()

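Implements the expression statement for the interactive mode. TOS is removed from the stack and printed. In non-interactive mode, an expression statement is terminated with POP_TOP.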

PRINT_ITEM()

Prints TOS to the file-like object bound to sys.stdout. There is one such instruction for each item in the print statement.

PRINT_ITEM_TO()

Like PRINT_ITEM, but prints the item second from TOS to the file-like object at TOS. This is used by the extended print statement.

PRINT_NEWLINE()

Prints a new line on sys.stdout. This is generated as the last operation of a print statement, unless the statement ends with a comma.

PRINT_NEWLINE_TO()

Like PRINT_NEWLINE, but prints the new line on the file-like object on the TOS. This is used by the extended print statement.

BREAK_LOOP()

Terminates a loop due to a break statement.

CONTINUE_LOOP(target)

Continues a loop due to a continue statement. target is the address to jump to (which should be a FOR_ITER instruction).

LIST_APPEND(i)

Calls list.append(TOS[-i], TOS). Used to implement list comprehensions. While the appended value is popped off, the list object remains on the stack so that it is available for further iterations of the loop.

LOAD_LOCALS()

Pushes a reference to the locals of the current scope on the stack. This is used in the code for a class definition: After the class body is evaluated, the locals are passed to the class definition.

RETURN_VALUE()

Returns with TOS to the caller of the function.

YIELD_VALUE()

Pops TOS and yields it from a generator.

IMPORT_STAR()

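Loads all symbols not starting with '_' directly from the module TOS to the local namespace. The module is popped after loading all names. This opcode implements from module import *.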

EXEC_STMT()

Implements exec TOS2,TOS1,TOS. The compiler fills missing optional parameters with None.

POP_BLOCK()

Removes one block from the block stack. Per frame, there is a stack of blocks, denoting nested loops, try statements, and such.

END_FINALLY()

Terminates a finally clause. The interpreter recalls whether the exception has to be re-raised, or whether the function returns, and continues with the outer-next block.

BUILD_CLASS()

Creates a new class object. TOS is the methods dictionary, TOS1 the tuple of the names of the base classes, and TOS2 the class name.

SETUP_WITH(delta)

This opcode performs several operations before a with block starts. First, it loads __exit__() from the context manager and pushes it onto the stack for later use by WITH_CLEANUP. Then, __enter__() is called, and a finally block pointing to delta is pushed. Finally, the result of calling the enter method is pushed onto the stack. The next opcode will either ignore it (POP_TOP), or store it in (a) variable(s) (STORE_FAST, STORE_NAME, or UNPACK_SEQUENCE).

WITH_CLEANUP()

Cleans up the stack when a with statement block exits. On top of the stack are 1–3 values indicating how/why the finally clause was entered:

  • TOP = None

  • (TOP, SECOND) = (WHY_{RETURN,CONTINUE}), retval

  • TOP = WHY_*; no retval below it

  • (TOP, SECOND, THIRD) = exc_info()

Under them is EXIT, the context manager’s __exit__() bound method.

In the last case, EXIT(TOP, SECOND, THIRD) is called, otherwise EXIT(None, None, None).

EXIT is removed from the stack, leaving the values above it in the same order. In addition, if the stack represents an exception, and the function call returns a ‘true’ value, this information is “zapped”, to prevent END_FINALLY from re-raising the exception. (But non-local gotos should still be resumed.)

All of the following opcodes expect arguments. An argument is two bytes, with the more significant byte last.
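The encoding explains the offsets 0, 3, 6, 9 in the myfunc() listing: an opcode with an argument occupies three bytes, one without occupies one. The sketch below walks a hand-built byte sequence in that layout. The opcode numbers are the Python 2.7 values as far as I know them and should be treated as assumptions, since they change between releases; iter_instructions is a hypothetical helper, not part of the dis module.

```python
# Assumed Python 2.7 opcode numbers (they differ in other releases).
HAVE_ARGUMENT = 90
LOAD_GLOBAL, LOAD_FAST, CALL_FUNCTION, RETURN_VALUE = 116, 124, 131, 83


def iter_instructions(raw):
    """Yield (offset, opcode, argument) for bytecode in the layout above."""
    raw = bytearray(raw)
    i = 0
    while i < len(raw):
        op = raw[i]
        if op >= HAVE_ARGUMENT:
            # Two argument bytes, more significant byte last.
            arg = raw[i + 1] | (raw[i + 2] << 8)
            yield i, op, arg
            i += 3
        else:
            yield i, op, None
            i += 1


# Byte layout matching the myfunc() listing near the top of this page.
sample = bytearray([LOAD_GLOBAL, 0, 0,
                    LOAD_FAST, 0, 0,
                    CALL_FUNCTION, 1, 0,
                    RETURN_VALUE])
decoded = list(iter_instructions(sample))
for offset, op, arg in decoded:
    print(offset, op, arg)
```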

STORE_NAME(namei)

Implements name = TOS. namei is the index of name in the attribute co_names of the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if possible.

DELETE_NAME(namei)

Implements del name, where namei is the index into the co_names attribute of the code object.

UNPACK_SEQUENCE(count)

Unpacks TOS into count individual values, which are put onto the stack right-to-left.
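Pushing right-to-left means the leftmost element ends up on top, ready to be stored first. A list-based model of that stack effect (the last list element stands for TOS; unpack_sequence is a hypothetical name):

```python
def unpack_sequence(stack, count):
    values = list(stack.pop())
    if len(values) != count:
        raise ValueError("expected %d values, got %d" % (count, len(values)))
    # Push the rightmost value first so the leftmost one becomes TOS.
    stack.extend(reversed(values))


stack = [(1, 2, 3)]
unpack_sequence(stack, 3)
print(stack)        # the last element is the new TOS
```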

DUP_TOPX(count)

Duplicate count items, keeping them in the same order. Due to implementation limits, count should be between 1 and 5 inclusive.

STORE_ATTR(namei)

Implements TOS.name = TOS1, where namei is the index of name in co_names.

DELETE_ATTR(namei)

Implements del TOS.name, using namei as index into co_names.

STORE_GLOBAL(namei)

Works as STORE_NAME, but stores the name as a global.

DELETE_GLOBAL(namei)

Works as DELETE_NAME, but deletes a global name.

LOAD_CONST(consti)

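Pushes co_consts[consti] onto the stack.

LOAD_NAME(namei)

Pushes the value associated with co_names[namei] onto the stack.

BUILD_TUPLE(count)

Creates a tuple consuming count items from the stack, and pushes the resulting tuple onto the stack.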

BUILD_LIST(count)

Works as BUILD_TUPLE, but creates a list.

BUILD_SET(count)

Works as BUILD_TUPLE, but creates a set.

New in version 2.7.

BUILD_MAP(count)

Pushes a new dictionary object onto the stack. The dictionary is pre-sized to hold count entries.

LOAD_ATTR(namei)

Replaces TOS with getattr(TOS, co_names[namei]).

COMPARE_OP(opname)

Performs a Boolean operation. The operation name can be found in cmp_op[opname].

IMPORT_NAME(namei)

Imports the module co_names[namei]. TOS and TOS1 are popped and provide the fromlist and level arguments of __import__(). The module object is pushed onto the stack. The current namespace is not affected: for a proper import statement, a subsequent STORE_FAST instruction modifies the namespace.

IMPORT_FROM(namei)

Loads the attribute co_names[namei] from the module found in TOS. The resulting object is pushed onto the stack, to be subsequently stored by a STORE_FAST instruction.

JUMP_FORWARD(delta)

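Increments the bytecode counter by delta.

POP_JUMP_IF_TRUE(target)

If TOS is true, sets the bytecode counter to target. TOS is popped.

POP_JUMP_IF_FALSE(target)

If TOS is false, sets the bytecode counter to target. TOS is popped.

JUMP_IF_TRUE_OR_POP(target)

If TOS is true, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is false), TOS is popped.

JUMP_IF_FALSE_OR_POP(target)

If TOS is false, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is true), TOS is popped.

JUMP_ABSOLUTE(target)

Sets the bytecode counter to target.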

FOR_ITER(delta)

TOS is an iterator. Call its next() method. If this yields a new value, push it on the stack (leaving the iterator below it). If the iterator indicates it is exhausted TOS is popped, and the bytecode counter is incremented by delta.
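The stack behaviour described here can be modelled in plain Python. The sketch below is an approximation of the control flow only: a list stands in for the value stack and the loop body is reduced to collecting each value. run_for_loop is a hypothetical name.

```python
def run_for_loop(iterable):
    stack = [iter(iterable)]            # GET_ITER leaves the iterator on the stack
    seen = []
    while True:
        try:
            value = next(stack[-1])     # FOR_ITER asks the iterator for a value
        except StopIteration:
            stack.pop()                 # exhausted: the iterator is popped,
            break                       # and execution jumps past the loop body
        stack.append(value)             # otherwise the value is pushed above it
        seen.append(stack.pop())        # stand-in for the loop body using the value
    return seen, stack


result, final_stack = run_for_loop("abc")
print(result, final_stack)
```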

LOAD_GLOBAL(namei)

Loads the global named co_names[namei] onto the stack.

SETUP_LOOP(delta)

Pushes a block for a loop onto the block stack. The block spans from the current instruction with a size of delta bytes.

SETUP_EXCEPT(delta)

Pushes a try block from a try-except clause onto the block stack. delta points to the first except block.

SETUP_FINALLY(delta)

Pushes a try block from a try-finally clause onto the block stack. delta points to the finally block.

STORE_MAP()

Store a key and value pair in a dictionary. Pops the key and value while leaving the dictionary on the stack.

LOAD_FAST(var_num)

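Pushes a reference to the local co_varnames[var_num] onto the stack.

STORE_FAST(var_num)

Stores TOS into the local co_varnames[var_num].

DELETE_FAST(var_num)

Deletes local co_varnames[var_num].

LOAD_CLOSURE(i)

Pushes a reference to the cell contained in slot i of the cell and free variable storage. The name of the variable is co_cellvars[i] if i is less than the length of co_cellvars. Otherwise it is co_freevars[i - len(co_cellvars)].

LOAD_DEREF(i)

Loads the cell contained in slot i of the cell and free variable storage. Pushes a reference to the object the cell contains on the stack.

STORE_DEREF(i)

Stores TOS into the cell contained in slot i of the cell and free variable storage.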

SET_LINENO(lineno)

This opcode is obsolete.

RAISE_VARARGS(argc)

Raises an exception. argc indicates the number of arguments to the raise statement, ranging from 0 to 3. The handler will find the traceback as TOS2, the parameter as TOS1, and the exception as TOS.

CALL_FUNCTION(argc)

Calls a callable object. The low byte of argc indicates the number of positional arguments, the high byte the number of keyword arguments. The stack contains keyword arguments on top (if any), then the positional arguments below that (if any), then the callable object to call below that. Each keyword argument is represented with two values on the stack: the argument’s name, and its value, with the argument’s value above the name on the stack. The positional arguments are pushed in the order that they are passed in to the callable object, with the right-most positional argument on top. CALL_FUNCTION pops all arguments and the callable object off the stack, calls the callable object with those arguments, and pushes the return value returned by the callable object.
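The packing of argc and the resulting stack consumption can be worked out with a little arithmetic. The helpers below are hypothetical and only restate the description above: the low byte counts positional arguments, the high byte counts keyword arguments, and each keyword argument occupies two stack slots.

```python
def split_argc(argc):
    positional = argc & 0xFF
    keyword = (argc >> 8) & 0xFF
    return positional, keyword


def items_popped(argc):
    positional, keyword = split_argc(argc)
    # Positional values, name/value pairs, plus the callable itself.
    return positional + 2 * keyword + 1


# A call such as f(a, b, key=value): two positional, one keyword argument.
argc = 2 | (1 << 8)
print(argc, split_argc(argc), items_popped(argc))
```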

MAKE_FUNCTION(argc)

Pushes a new function object on the stack. TOS is the code associated with the function. The function object is defined to have argc default parameters, which are found below TOS.

MAKE_CLOSURE(argc)

Creates a new function object, sets its func_closure slot, and pushes it on the stack. TOS is the code associated with the function, TOS1 the tuple containing cells for the closure’s free variables. The function also has argc default parameters, which are found below the cells.

BUILD_SLICE(argc)

Pushes a slice object on the stack. argc must be 2 or 3. If it is 2, slice(TOS1, TOS) is pushed; if it is 3, slice(TOS2, TOS1, TOS) is pushed. See the slice() built-in function for more information.

EXTENDED_ARG(ext)

Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.
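In other words, ext supplies the upper 16 bits and the following opcode's own argument the lower 16 bits. A sketch of the combination, using a hypothetical helper name:

```python
def combine_extended_arg(ext, arg):
    # ext holds the two most-significant bytes of the four-byte argument.
    return (ext << 16) | arg


full = combine_extended_arg(0x0001, 0x0005)
print(full)
```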

CALL_FUNCTION_VAR(argc)

Calls a callable object, similarly to CALL_FUNCTION. argc represents the number of keyword and positional arguments, identically to CALL_FUNCTION. The top of the stack contains an iterable object containing additional positional arguments. Below that are keyword arguments (if any), positional arguments (if any) and a callable object, identically to CALL_FUNCTION. Before the callable object is called, the iterable object is “unpacked” and its contents are appended to the positional arguments passed in. The iterable object is ignored when computing the value of argc.

CALL_FUNCTION_KW(argc)

Calls a callable object, similarly to CALL_FUNCTION. argc represents the number of keyword and positional arguments, identically to CALL_FUNCTION. The top of the stack contains a mapping object containing additional keyword arguments. Below that are keyword arguments (if any), positional arguments (if any) and a callable object, identically to CALL_FUNCTION. Before the callable is called, the mapping object at the top of the stack is “unpacked” and its contents are appended to the keyword arguments passed in. The mapping object at the top of the stack is ignored when computing the value of argc.

CALL_FUNCTION_VAR_KW(argc)

Calls a callable object, similarly to CALL_FUNCTION_VAR and CALL_FUNCTION_KW. argc represents the number of keyword and positional arguments, identically to CALL_FUNCTION. The top of the stack contains a mapping object, as per CALL_FUNCTION_KW. Below that is an iterable object, as per CALL_FUNCTION_VAR. Below that are keyword arguments (if any), positional arguments (if any) and a callable object, identically to CALL_FUNCTION. Before the callable is called, the mapping object and iterable object are each “unpacked” and their contents passed in as keyword and positional arguments respectively, identically to CALL_FUNCTION_VAR and CALL_FUNCTION_KW. The mapping object and iterable object are both ignored when computing the value of argc.

HAVE_ARGUMENT()

This is not really an opcode. It identifies the dividing line between opcodes which don’t take arguments < HAVE_ARGUMENT and those which do >= HAVE_ARGUMENT.
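The dividing line can be checked through the module's own tables. The sketch assumes that dis exports HAVE_ARGUMENT alongside opmap, which holds for the CPython releases I am aware of; the numeric values themselves vary by release, and takes_argument is a hypothetical helper.

```python
import dis


def takes_argument(name):
    # Opcodes numbered at or above HAVE_ARGUMENT carry an argument.
    return dis.opmap[name] >= dis.HAVE_ARGUMENT


print(takes_argument("POP_TOP"), takes_argument("LOAD_CONST"))
```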