Hi,
I've been reading the Python VM sources over the last few afternoons and
I took some notes, which I thought I'd share (and if anyone familiar
with the VM internals could have a quick look at them, I'd really
appreciate it).
Cheers,
-jakob
Unless otherwise noted, the source file in question is Python/ceval.c.
Control Flow
============
The calling sequence is:
main() (in python.c) -> Py_Main() (main.c) -> PyRun_FooFlags() (pythonrun.c) ->
run_bar() (pythonrun.c) -> PyEval_EvalCode() (ceval.c) -> PyEval_EvalCodeEx()
(ceval.c) -> PyEval_EvalFrameEx() (ceval.c).
EvalCodeEx() does some initialization (creating a new execution frame,
argument processing, and some generator-specific stuff I haven't looked at yet)
before calling EvalFrameEx(), which contains the main interpreter loop.
Threads
=======
PyEval_InitThreads() initializes the GIL (interpreter_lock) and sets
main_thread to the (threading package dependent) ID of the current thread.
Thread switching is done using PyThreadState_Swap(), which sets
_PyThreadState_Current (both defined in pystate.c) and returns the previous
thread state. PyThreadState_GET() (pystate.h) is normally just a macro alias
for _PyThreadState_Current.
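To make the swap concrete, here is a toy model of what I understand PyThreadState_Swap() and PyThreadState_GET() to do. All toy_* names are made up for illustration; the real code in pystate.c works on PyThreadState and does more bookkeeping than this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyThreadState. */
struct toy_tstate { long thread_id; };

/* Plays the role of _PyThreadState_Current. */
static struct toy_tstate *toy_current = NULL;

/* Plays the role of PyThreadState_GET(): a plain read of the global. */
#define TOY_TSTATE_GET() (toy_current)

/* Install a new current thread state and hand back the previous one. */
static struct toy_tstate *toy_tstate_swap(struct toy_tstate *newts)
{
    struct toy_tstate *oldts = toy_current;
    toy_current = newts;
    return oldts;
}

/* Temporarily run "as" another thread state, then restore the old one. */
static long toy_run_as(struct toy_tstate *ts)
{
    struct toy_tstate *saved;
    long id;

    assert(ts != NULL);
    saved = toy_tstate_swap(ts);
    id = TOY_TSTATE_GET()->thread_id;
    toy_tstate_swap(saved);
    return id;
}
```

The save/swap/restore pattern in toy_run_as() is the reason the swap function returns the old state.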
Async Callbacks
===============
Asynchronous callbacks can be registered by adding the function to be called
to pendingcalls[] (see Py_AddPendingCall()). The state of this queue is
communicated to the main loop via things_to_do.
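As a rough illustration of the queue-plus-flag idea, here is a simplified single-threaded model. The toy_* names, the queue size, and the retry-on-failure behaviour are my assumptions for the sketch; the real Py_AddPendingCall() and Py_MakePendingCalls() also have to cope with threads and re-entrancy.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPENDING 32                 /* illustrative queue size */

static struct {
    int (*func)(void *);
    void *arg;
} toy_pending[TOY_NPENDING];

static int toy_first = 0, toy_last = 0; /* ring buffer indices */
static volatile int things_to_do = 0;   /* flag polled by the main loop */

/* Queue a callback; returns -1 if the queue is full. */
static int toy_add_pending_call(int (*func)(void *), void *arg)
{
    int next = (toy_last + 1) % TOY_NPENDING;

    assert(func != NULL);
    if (next == toy_first)
        return -1;
    toy_pending[toy_last].func = func;
    toy_pending[toy_last].arg = arg;
    toy_last = next;
    things_to_do = 1;                   /* tell the main loop there is work */
    return 0;
}

/* Called from the main loop when things_to_do is set. */
static int toy_make_pending_calls(void)
{
    things_to_do = 0;
    while (toy_first != toy_last) {
        int (*func)(void *) = toy_pending[toy_first].func;
        void *arg = toy_pending[toy_first].arg;

        toy_first = (toy_first + 1) % TOY_NPENDING;
        if (func(arg) < 0) {
            things_to_do = 1;           /* leave the rest for a later pass */
            return -1;
        }
    }
    return 0;
}

/* Example callback: increments the int it is given. */
static int toy_bump(void *arg)
{
    ++*(int *)arg;
    return 0;
}
```

The point of the flag is that the main loop only has to test one integer per iteration instead of inspecting the queue itself.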
State
=====
The global state is recorded in a per-interpreter PyInterpreterState struct
(normally one per process, unless sub-interpreters are created) and
a per-thread PyThreadState struct.
Each execution frame's state is contained in that frame's PyFrameObject
(which includes the instruction stream, the environment (globals, locals,
builtins, etc.), the value stack and so forth).
EvalFrameEx()'s local variables are initialized from this frame object.
Instruction Stream
==================
The instruction stream looks as follows (cf. assemble_emit() in compile.c):
A byte stream where each instruction consists of either
  1) a single-byte opcode: OP
  2) a single-byte opcode plus a two-byte immediate argument, low byte
     first: OP LO HI
  3) an EXTENDED_ARG prefix carrying the upper 16 bits of a four-byte
     argument, followed by the real opcode with the lower 16 bits:
     EXTENDED_ARG BYTE3 BYTE4 OP BYTE1 BYTE2 (BYTE1 is least significant)
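Here is a small decoder sketch for this format. It assumes the layout as I read it from assemble_emit() and the EXTENDED_ARG case in ceval.c: the prefix carries the upper 16 bits, the real opcode follows with the lower 16 bits, and each 16-bit half is stored low byte first. The toy_* names are mine, and the numeric value used for EXTENDED_ARG is illustrative (the real numbers live in Include/opcode.h and differ between versions).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_HAVE_ARGUMENT 90    /* opcodes >= this take an argument */
#define TOY_EXTENDED_ARG  143   /* illustrative value, see opcode.h */

struct toy_instr {
    int opcode;
    unsigned long oparg;
    int has_arg;
    size_t size;                /* bytes consumed from the stream */
};

/* Decode the single instruction starting at p. */
static struct toy_instr toy_decode(const unsigned char *p)
{
    struct toy_instr in = { 0, 0, 0, 0 };
    unsigned long high = 0;

    assert(p != NULL);
    if (p[0] == TOY_EXTENDED_ARG) {
        /* Prefix: upper 16 bits of the argument, low byte first. */
        high = p[1] | ((unsigned long)p[2] << 8);
        p += 3;
        in.size = 3;
    }
    in.opcode = p[0];
    in.size += 1;
    if (in.opcode >= TOY_HAVE_ARGUMENT) {
        /* Lower 16 bits follow the real opcode, low byte first. */
        in.has_arg = 1;
        in.oparg = (high << 16) | p[1] | ((unsigned long)p[2] << 8);
        in.size += 2;
    }
    return in;
}
```

For example, the bytes 143 0x78 0x56 100 0x34 0x12 decode to opcode 100 with argument 0x56781234 and a total size of 6.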
Opcode Prediction
=================
One nice trick used to speed up opcode dispatch is the following:
Using the macros PREDICT() and PREDICTED() it is sometimes possible
to jump directly to the code implementing the next instruction
rather than having to go through the whole loop preamble, e.g.
    case FOO:
        // ...
        PREDICT(BAR);
        continue;
    PREDICTED(BAR);
    case BAR:
        // ...
expands to
    case FOO:
        // ...
        if (*next_instr == BAR) goto PRED_BAR;
        continue;
    PRED_BAR: next_instr++;
    case BAR:
        // ...
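To convince myself that the trick works, I wrote a toy interpreter that uses the same two macros and counts how often it goes through the top of the loop. The opcodes (OP_INC, OP_DOUBLE, OP_HALT) and the function name are invented for this sketch; only the macro shapes follow the expansion shown above.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_HALT, OP_INC, OP_DOUBLE };    /* made-up opcodes */

#define PREDICT(op)   if (*next_instr == op) goto PRED_##op
#define PREDICTED(op) PRED_##op: next_instr++

/* Run a program on a single accumulator; *dispatches counts how many
 * times the normal fetch at the top of the loop was executed. */
static int toy_run(const unsigned char *code, int *dispatches)
{
    const unsigned char *next_instr = code;
    int acc = 0;

    assert(code != NULL && dispatches != NULL);
    *dispatches = 0;
    for (;;) {
        int opcode = *next_instr++;     /* the "loop preamble" */
        ++*dispatches;

        switch (opcode) {
        case OP_INC:
            acc += 1;
            PREDICT(OP_DOUBLE);         /* jump straight in if we guess right */
            continue;

        PREDICTED(OP_DOUBLE);           /* label, then skip the opcode byte */
        case OP_DOUBLE:
            acc *= 2;
            continue;

        case OP_HALT:
            return acc;

        default:
            return -1;                  /* unknown opcode */
        }
    }
}
```

For the program INC DOUBLE INC INC DOUBLE HALT this returns 8 after only 4 trips through the preamble, since both DOUBLEs are reached via the predicted jump.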
Main Loop
=========
Variables and macros used in EvalFrameEx()
------------------------------------------
The value stack:
    PyObject **stack_pointer;
The instruction stream:
    unsigned char *next_instr;
NEXTOP(), NEXTARG(), PEEKARG(), JUMPTO(), and JUMPBY() simply read or move
next_instr. Likewise, TOP(), SET_SECOND(), PUSH(), POP(), etc. read or
adjust stack_pointer.
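The following sketch shows simplified versions of these macros, written from my reading of ceval.c, so treat the exact definitions as an approximation. The value stack holds longs here rather than PyObject pointers, and toy_macros_demo() and its byte values are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Instruction stream helpers: all of them only touch next_instr. */
#define NEXTOP()      (*next_instr++)
#define NEXTARG()     (next_instr += 2, (next_instr[-1] << 8) + next_instr[-2])
#define PEEKARG()     ((next_instr[2] << 8) + next_instr[1])
#define JUMPTO(x)     (next_instr = first_instr + (x))
#define JUMPBY(x)     (next_instr += (x))

/* Value stack helpers: all of them only touch stack_pointer. */
#define TOP()         (stack_pointer[-1])
#define SET_SECOND(v) (stack_pointer[-2] = (v))
#define PUSH(v)       (*stack_pointer++ = (v))
#define POP()         (*--stack_pointer)

static long toy_macros_demo(void)
{
    /* Toy opcode 100 with argument 0x0102, then toy opcode 1 (no argument). */
    static const unsigned char first_instr[] = { 100, 0x02, 0x01, 1 };
    const unsigned char *next_instr = first_instr;
    long stack[4], *stack_pointer = stack;
    long top, mid, bot;
    int opcode, oparg;

    opcode = NEXTOP();              /* 100 */
    oparg = NEXTARG();              /* 0x0102, stored low byte first */
    PUSH(opcode);
    PUSH(oparg);
    SET_SECOND(7);                  /* overwrite the entry below the top */
    assert(TOP() == 0x0102);

    JUMPTO(0);                      /* absolute jump to the first opcode */
    assert(PEEKARG() == 0x0102);    /* look at the argument without moving */
    JUMPBY(3);                      /* relative jump past opcode + argument */
    PUSH(NEXTOP());                 /* pushes 1 */

    top = POP();                    /* 1 */
    mid = POP();                    /* 0x0102 == 258 */
    bot = POP();                    /* 7 */
    return top + mid + bot;
}
```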
Current opcode plus argument:
    int opcode;
    int oparg;
Error status:
    enum why_code why; // no, exn, exn re-raised, return, break, continue, yield
    int err;           // non-zero is error
The environment:
    PyObject *names;
    PyObject *consts;
and
    PyObject **fastlocals;
which is accessed via GETLOCAL() and SETLOCAL().
Finally, there are some more PyObject *'s (v, w, u, and so forth, used
as temporary variables) as well as
    PyObject *retval;
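As far as I can tell, GETLOCAL() is a plain array access into fastlocals, while SETLOCAL() stores the new value and then drops the reference to whatever was in the slot before. Below is a toy model of that behaviour; struct toy_obj, toy_xdecref() and toy_store_fast() are hypothetical stand-ins for PyObject, Py_XDECREF() and the body of a store-to-local opcode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject: just a reference count. */
struct toy_obj { int refcnt; };

/* Stand-in for Py_XDECREF(): tolerate NULL, otherwise drop one reference. */
static void toy_xdecref(struct toy_obj *o)
{
    if (o != NULL)
        --o->refcnt;
}

#define GETLOCAL(i)     (fastlocals[i])

/* Store first, release the old value afterwards, so the slot never
 * points at an object whose reference has already been dropped. */
#define SETLOCAL(i, value) do { \
        struct toy_obj *tmp = GETLOCAL(i); \
        GETLOCAL(i) = (value); \
        toy_xdecref(tmp); \
    } while (0)

/* Store v into local slot i; the slot takes over the caller's reference. */
static void toy_store_fast(struct toy_obj **fastlocals, int i,
                           struct toy_obj *v)
{
    assert(fastlocals != NULL && v != NULL);
    SETLOCAL(i, v);
}
```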
Basic structure
---------------
EvalFrameEx() {
    why = WHY_NOT;
    err = 0;
    for (;;) {                      <---+---+
        // do periodic tasks            |   |
                                        |   |
    fast_next_opcode:                   |   |
        opcode = NEXTOP();              |   |
        if (HAS_ARG(opcode))            |   |
            oparg = NEXTARG();          |   |
                                        |   |
    dispatch_opcode:                    |   |
        switch(opcode) {                |   |
                                        |   |
            continue;                ---+   |
                                            |
            break;                   ---+   |
                                        |   |
            // Also, opcode prediction  |   |
            // jumps around inside the  |   |
            // switch statement         |   |
                                        |   |
        }                           <---+   |
                                            |
    on_error:                               |
        // no error: continue            ---+
        // otherwise why == WHY_EXCEPTION after this
    fast_block_end:
        // unwind stacks if there was an error
    }
    // more unwinding
fast_yield:
    // reset current thread's exception info
exit_eval_frame:
    // set thread's execution frame to previous execution frame
    return retval;
}
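The skeleton above can be condensed into a small runnable toy that keeps the same two exits from the switch: continue goes straight back to the top of the loop, while break falls through to the error check. The opcodes, the long-valued stack, and toy_eval_frame() itself are my own inventions for this sketch, and it assumes well-formed programs (no stack overflow or underflow checks).

```c
#include <assert.h>
#include <stddef.h>

enum why_code { WHY_NOT, WHY_EXCEPTION, WHY_RETURN };

/* Made-up opcodes; those >= HAVE_ARGUMENT carry a two-byte argument. */
enum { OP_ADD = 1, OP_DIV = 2, OP_RETURN = 3, HAVE_ARGUMENT = 90, OP_PUSH = 100 };
#define HAS_ARG(op) ((op) >= HAVE_ARGUMENT)

static enum why_code toy_eval_frame(const unsigned char *code, long *retval)
{
    const unsigned char *next_instr = code;
    long stack[16], *stack_pointer = stack;
    enum why_code why = WHY_NOT;
    int err = 0, opcode, oparg = 0;
    long v, w;

    assert(code != NULL && retval != NULL);
    *retval = 0;
    for (;;) {
        /* periodic tasks would go here */
        opcode = *next_instr++;
        if (HAS_ARG(opcode)) {
            oparg = next_instr[0] | (next_instr[1] << 8);
            next_instr += 2;
        }

        switch (opcode) {
        case OP_PUSH:
            *stack_pointer++ = oparg;
            continue;               /* fast path: cannot fail */
        case OP_ADD:
            w = *--stack_pointer;
            v = *--stack_pointer;
            *stack_pointer++ = v + w;
            continue;
        case OP_DIV:
            w = *--stack_pointer;
            v = *--stack_pointer;
            if (w == 0)
                err = -1;           /* the toy version of raising */
            else
                *stack_pointer++ = v / w;
            break;                  /* slow path: go through the error check */
        case OP_RETURN:
            *retval = *--stack_pointer;
            why = WHY_RETURN;
            break;
        default:
            err = -1;               /* unknown opcode */
            break;
        }

        /* on_error: nothing special happened, so keep going */
        if (why == WHY_NOT) {
            if (err == 0)
                continue;
            why = WHY_EXCEPTION;
        }

        /* fast_block_end: "unwind" by discarding the value stack */
        stack_pointer = stack;
        break;
    }
    if (why != WHY_RETURN)
        *retval = 0;
    return why;
}
```

Running PUSH 6, PUSH 3, DIV, RETURN yields WHY_RETURN with a result of 2; replacing the 3 with 0 yields WHY_EXCEPTION.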
Periodic Tasks
--------------
By checking and decrementing _Py_Ticker, the main loop