bytecode – Python bytecode manipulation
The bytecode module lets you manipulate Python bytecode in a version-independent way. To facilitate this, the module provides functions to disassemble Python bytecode into a high-level representation and assemble it back, along with functions to manipulate those structures.
The version-independent functions take a py_internals parameter, which represents the specifics of bytecode on that particular version of Python. The pwnypack.py_internals module provides these internal specifics for various Python versions.
Examples
Disassemble a very simple function, change an opcode and reassemble it:
>>> from pwny import *
>>> import six
>>> def foo(a):
...     return a - 1
...
>>> print(foo, six.get_function_code(foo).co_code, foo(5))
<function foo at 0x10590ba60> b'| d S' 4
>>> ops = bc.disassemble(foo)
>>> print(ops)
[LOAD_FAST 0, LOAD_CONST 1, BINARY_SUBTRACT, RETURN_VALUE]
>>> ops[2].name = 'BINARY_ADD'
>>> print(ops)
[LOAD_FAST 0, LOAD_CONST 1, BINARY_ADD, RETURN_VALUE]
>>> bar = bc.rebuild_func_from_ops(foo, ops, co_name='bar')
>>> print(bar, six.get_function_code(bar).co_code, bar(5))
<function bar at 0x10590bb70> b'| d S' 6
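For comparison, a similar listing can be produced with the standard library alone. The sketch below is not pwnypack code: it assumes only the stdlib dis module and prints the opcode names of foo as the running interpreter compiled it. The names differ between Python versions, which is the kind of difference the py_internals specifications describe.

```python
import dis

def foo(a):
    return a - 1

# Collect the mnemonic of every instruction in foo's bytecode.
# The exact list depends on the interpreter version.
names = [ins.opname for ins in dis.get_instructions(foo)]
print(names)
```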
class pwnypack.bytecode.AnnotatedOp(code_obj, name, arg)
An annotated opcode description. Instances of this class are generated by CodeObject.disassemble() if you set its annotate argument to True.
It contains more descriptive information about the instruction but cannot be translated back into a bytecode operation at the moment.
This class uses the code object’s reference to the python internals of the python version that it originated from and the properties of the code object to decode as much information as possible.
Parameters:
- code_obj (CodeObject) – The code object this opcode belongs to.
- name (str) – The mnemonic of the opcode.
- arg (int) – The integer argument to the opcode (or None).
    code = None
        The numeric opcode.
    code_obj = None
        A reference to the CodeObject it belongs to.
    has_arg = None
        Whether this opcode has an argument.
    has_compare = None
        Whether this opcode’s argument is a compare operation.
    has_const = None
        Whether this opcode’s argument is a reference to a constant.
    has_free = None
        Whether this opcode’s argument is a reference to a free or cell var (for closures and nested functions).
    has_local = None
        Whether this opcode’s argument is a reference to a local.
    has_name = None
        Whether this opcode’s argument is a reference to the names table.
    name = None
        The name of the operation.
class pwnypack.bytecode.Block(label=None)
A group of python bytecode ops. Produced by blocks_from_ops().
Parameters: label (Label) – The label of this block. Will be None for the first block.
    label = None
        The label the block represents.
    next = None
        A pointer to the next block.
    ops = None
        The opcodes contained within this block.
class pwnypack.bytecode.Op(name, arg=None)
Bases: object
Describes a single bytecode operation.
Parameters:
- name (str) – The name of the opcode.
- arg – The argument of the opcode. Should be None for opcodes without arguments, a Label for opcodes that define a jump, and an int otherwise.
    arg = None
        The opcode’s argument (or None).
    name = None
        The name of the opcode.
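To make the shape of an operation concrete, here is a minimal sketch of an Op-like object. SketchOp is a hypothetical stand-in written for this illustration, not the pwnypack class; its printed form is modelled on the output shown in the example at the top of this page.

```python
class SketchOp:
    """Hypothetical stand-in for Op: a mnemonic plus an optional argument."""

    def __init__(self, name, arg=None):
        self.name = name
        self.arg = arg

    def __repr__(self):
        # Ops without an argument print as the bare mnemonic.
        if self.arg is None:
            return self.name
        return '%s %r' % (self.name, self.arg)


ops = [SketchOp('LOAD_FAST', 0), SketchOp('LOAD_CONST', 1),
       SketchOp('BINARY_SUBTRACT'), SketchOp('RETURN_VALUE')]

# The objects are mutable, so patching an opcode is an attribute write.
ops[2].name = 'BINARY_ADD'
print(ops)  # [LOAD_FAST 0, LOAD_CONST 1, BINARY_ADD, RETURN_VALUE]
```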
pwnypack.bytecode.disassemble(code, origin=None)
Disassemble python bytecode into a series of Op and Label instances.
Parameters:
- code (bytes) – The bytecode (a code object’s co_code property). You can also provide a function.
- origin (dict) – The opcode specification of the python version that generated code. If you provide None, the specs for the currently running python version will be used.
Returns: A list of opcodes and labels.
Return type: list
pwnypack.bytecode.assemble(ops, target=None)
Assemble a set of Op and Label instances back into bytecode.
Parameters:
- ops (list) – A list of opcodes and labels (as returned by disassemble()).
- target – The opcode specification of the targeted python version. If this is None, the specification of the currently running python version will be used.
Returns: The assembled bytecode.
Return type: bytes
pwnypack.bytecode.blocks_from_ops(ops)
Group a list of Op and Label instances by label.
Every time a label is found, a new Block is created. The resulting blocks are returned as a dictionary to easily access the target block of a jump operation. The keys of this dictionary will be the labels, the values will be the Block instances. The initial block can be accessed by getting the None item from the dictionary.
Parameters: ops (list) – The list of Op and Label instances (as returned by disassemble()).
Returns: The resulting dictionary of blocks grouped by label.
Return type: dict
pwnypack.bytecode.calculate_max_stack_depth(ops, target=None)
Calculate the maximum stack depth (and required stack size) from a series of Op and Label instances. This is required when you manipulate the opcodes in such a way that the stack layout might change and you want to re-create a working function from it.
This is a fairly literal re-implementation of python’s stackdepth and stackdepth_walk.
Parameters:
- ops (list) – A list of opcodes and labels (as returned by disassemble()).
- target – The opcode specification of the targeted python version. If this is None, the specification of the currently running python version will be used.
Returns: The calculated maximum stack depth.
Return type: int
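As a rough illustration of what a stack depth calculation involves, the sketch below tracks the running depth over straight-line code only. It is a simplification under stated assumptions: the STACK_EFFECT table is hypothetical and covers just a few opcodes, and jumps are ignored, whereas the routine described above also follows jump targets.

```python
# Hypothetical stack effects for a handful of opcodes. Real values depend on
# the Python version and are not taken from pwnypack.
STACK_EFFECT = {
    'LOAD_FAST': 1,
    'LOAD_CONST': 1,
    'BINARY_ADD': -1,    # pops two operands, pushes one result
    'POP_TOP': -1,
    'RETURN_VALUE': -1,
}


def sketch_max_stack_depth(op_names):
    """Return the highest stack depth reached by a jump-free op sequence."""
    depth = max_depth = 0
    for name in op_names:
        depth += STACK_EFFECT[name]
        max_depth = max(max_depth, depth)
    return max_depth


depth = sketch_max_stack_depth(
    ['LOAD_FAST', 'LOAD_CONST', 'BINARY_ADD', 'RETURN_VALUE'])
print(depth)  # 2
```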
class pwnypack.bytecode.CodeObject(co_argcount, co_kwonlyargcount, co_nlocals, co_stacksize, co_flags, co_code, co_consts, co_names, co_varnames, co_filename, co_name, co_firstlineno, co_lnotab, co_freevars, co_cellvars, origin=None)
Bases: object
Represents a python code object in a cross python version way. It contains all the properties that exist on code objects on Python 3 (even when run on Python 2).
Parameters:
- co_argcount – number of arguments (not including *, ** or keyword only args)
- co_kwonlyargcount – The keyword-only argument count of this code.
- co_nlocals – number of local variables
- co_stacksize – virtual machine stack space required
- co_flags – bitmap: 1=optimized | 2=newlocals | 4=*arg | 8=**arg
- co_code – string of raw compiled bytecode
- co_consts – tuple of constants used in the bytecode
- co_names – tuple of names used by the bytecode other than arguments and local variables (for example globals and attributes)
- co_varnames – tuple of names of arguments and local variables
- co_filename – name of file in which this code object was created
- co_name – name with which this code object was defined
- co_firstlineno – number of first line in Python source code
- co_lnotab – encoded mapping of line numbers to bytecode indices
- co_freevars – tuple of names of closure variables
- co_cellvars – tuple containing the names of local variables that are referenced by nested functions
- origin (dict) – The opcode specification of the python version that generated the code. If you provide None, the specs for the currently running python version will be used.
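Several of these properties can be inspected on a native code object, which may help when deciding what to pass in. The sketch below uses only the interpreter's own code object on Python 3 and does not involve CodeObject; the function f is an arbitrary example.

```python
def f(a, b=2, *args, c, **kw):
    x = a
    return x


code = f.__code__
print(code.co_argcount)          # 2: a and b (not *args, **kw or c)
print(code.co_kwonlyargcount)    # 1: c
print(bool(code.co_flags & 4))   # True: the function accepts *args
print(bool(code.co_flags & 8))   # True: the function accepts **kw
print(code.co_varnames)          # arguments first, then other locals such as x
```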
    annotate_op(op)
        Takes a bytecode operation (Op) and annotates it using the data contained in this code object.
        Parameters: op (Op) – An Op instance.
        Returns: An annotated bytecode operation.
        Return type: AnnotatedOp
    assemble(ops, target=None)
        Assemble a series of operations and labels into bytecode, analyse its stack usage and replace the bytecode and stack size of this code object. Can also (optionally) change the target python version.
        Parameters:
        - ops (list) – The opcodes (and labels) to assemble into bytecode.
        - target – The opcode specification of the targeted python version. If this is None, the specification of the currently running python version will be used.
        Returns: A reference to this CodeObject.
        Return type: CodeObject
    disassemble(annotate=False, blocks=False)
        Disassemble the bytecode of this code object into a series of opcodes and labels. Can also annotate the opcodes and group the opcodes into blocks based on the labels.
        Parameters:
        - annotate (bool) – Whether to annotate the operations.
        - blocks (bool) – Whether to group the operations into blocks.
        Returns: A list of Op (or AnnotatedOp) instances and labels.
        Return type: list
    classmethod from_code(code, co_argcount=BORROW, co_kwonlyargcount=BORROW, co_nlocals=BORROW, co_stacksize=BORROW, co_flags=BORROW, co_code=BORROW, co_consts=BORROW, co_names=BORROW, co_varnames=BORROW, co_filename=BORROW, co_name=BORROW, co_firstlineno=BORROW, co_lnotab=BORROW, co_freevars=BORROW, co_cellvars=BORROW)
        Create a new instance from an existing code object. The originating internals of the instance will be that of the running python version.
        Any properties explicitly specified will be overridden on the new instance.
        Parameters:
        - code (types.CodeType) – The code object to get the properties of.
        - .. – The properties to override.
        Returns: A new CodeObject instance.
        Return type: CodeObject
    classmethod from_function(f, *args, **kwargs)
        Create a new instance from a function. Gets the code object from the function and passes it and any other specified parameters to from_code().
        Parameters: f (function) – The function to get the code object from.
        Returns: A new CodeObject instance.
        Return type: CodeObject
    to_code()
        Convert this instance back into a native python code object. This only works if the internals of the code object are compatible with those of the running python version.
        Returns: The native python code object.
        Return type: types.CodeType
    to_function()
        Convert this CodeObject back into a python function. This only works if the internals of the code object are compatible with those of the running python version.
        Returns: The newly created python function.
        Return type: function
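The round trip from a code object with an overridden property back to a callable can be approximated with the standard library, which may clarify what from_code() and to_function() do conceptually. This is a sketch under stated assumptions, not the pwnypack implementation: it relies on CodeType.replace(), available on Python 3.8 and later, and on types.FunctionType.

```python
import types


def foo(a):
    return a - 1


# Copy foo's code object, overriding a single property (its name).
# CodeType.replace() requires Python 3.8 or later.
new_code = foo.__code__.replace(co_name='bar')

# Wrap the code object in a new function that shares foo's globals.
bar = types.FunctionType(new_code, foo.__globals__, 'bar')
print(bar.__name__, bar(5))  # bar 4
```

As with to_code() and to_function(), this only works when the code object matches the bytecode format of the running interpreter.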