3.1 CPython's Runtime
- CPython's Run-time Data
- Isolating Interpreter State
- TBD: (Initialization and Finalization)
- TBD: (How the Interpreter Works)
- TBD: (Embedding)
In CPython, we store (persistent) runtime data in several locations:
- shared by all interpreters/threads
  - _PyRuntime: global info, global resources, and the set of interpreters
  - C global variables (almost no state)
  - process-level resources (e.g. file descriptors, env vars, signals)
- PyInterpreterState: a distinct Python execution environment
- PyThreadState: a thread of execution in an interpreter (where the eval loop runs)
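The layering above can be pictured as a runtime that owns a list of interpreters, each of which owns a list of threads. The following is a minimal sketch with hypothetical, heavily simplified structs (`runtime_state`, `interp_state`, `thread_state` and all helper names are illustrative, not CPython's actual definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the real structs. */
typedef struct thread_state {
    struct thread_state *next;       /* next thread in the same interpreter */
    int recursion_depth;             /* example of per-thread execution state */
} thread_state;

typedef struct interp_state {
    struct interp_state *next;       /* next interpreter in the runtime */
    thread_state *threads_head;      /* the set of Python threads */
} interp_state;

typedef struct runtime_state {
    int initialized;                 /* runtime lifecycle state */
    interp_state *interpreters_head; /* the set of interpreters */
} runtime_state;

/* Attach an interpreter to the runtime (prepends to the list). */
void runtime_add_interp(runtime_state *rt, interp_state *interp)
{
    assert(rt != NULL && interp != NULL);
    interp->threads_head = NULL;
    interp->next = rt->interpreters_head;
    rt->interpreters_head = interp;
}

/* Attach a thread to an interpreter (prepends to the list). */
void interp_add_thread(interp_state *interp, thread_state *ts)
{
    assert(interp != NULL && ts != NULL);
    ts->recursion_depth = 0;
    ts->next = interp->threads_head;
    interp->threads_head = ts;
}

/* Count every thread state reachable from the runtime. */
size_t runtime_count_threads(const runtime_state *rt)
{
    size_t n = 0;
    for (const interp_state *i = rt->interpreters_head; i != NULL; i = i->next) {
        for (const thread_state *t = i->threads_head; t != NULL; t = t->next) {
            n++;
        }
    }
    return n;
}
```

Anything stored on `runtime_state` is visible to every interpreter and thread, while state stored further down is only reachable through its owner.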
The _PyRuntimeState Definition
Nearly all of CPython's runtime data is stored, directly or indirectly,
within a single global variable: _PyRuntime (type _PyRuntimeState,
see Include/internal/pycore_runtime.h).
The data was consolidated there from C global variables over several years
(see gh-81057).
This data is shared by all interpreters, though not all of it is used by all interpreters.
The content of _PyRuntime is coupled with the runtime lifecycle
(Py_Initialize()/Py_Finalize()).
Mainly, it holds information and resources shared by the main interpreter
and any subinterpreters, along with data related to the runtime lifecycle.
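As a rough illustration of what "coupled with the runtime lifecycle" means, here is a minimal sketch in which one process-wide struct is reset at the start of each init/fini cycle. The struct, its fields, and the `demo_*` functions are hypothetical and only loosely modeled on the description above:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for _PyRuntimeState; field names are illustrative. */
typedef struct {
    int initialized;    /* lifecycle state */
    int init_count;     /* how many init/fini cycles have started */
    int interp_count;   /* stands in for "the set of interpreters" */
} runtime_state;

/* The one process-wide instance, analogous in spirit to _PyRuntime. */
runtime_state demo_runtime;

/* Analogous in spirit to Py_Initialize(): a no-op while already initialized. */
void demo_initialize(void)
{
    if (demo_runtime.initialized) {
        return;
    }
    int cycles = demo_runtime.init_count;
    memset(&demo_runtime, 0, sizeof(demo_runtime)); /* fresh per-cycle state */
    demo_runtime.init_count = cycles + 1;
    demo_runtime.interp_count = 1;                  /* the main interpreter */
    demo_runtime.initialized = 1;
}

/* Analogous in spirit to Py_Finalize(): tears down per-cycle state. */
void demo_finalize(void)
{
    if (!demo_runtime.initialized) {
        return;
    }
    assert(demo_runtime.interp_count >= 1); /* main interpreter still exists */
    demo_runtime.interp_count = 0;
    demo_runtime.initialized = 0;
}
```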
A detailed break-down:
- the runtime lifecycle state (e.g. initialized)
- hooks tied to the runtime lifecycle (e.g. "atexit" funcs called in Py_Finalize())
- effectively const data (info), tied to each runtime init/fini cycle
- set during init
- set lazily after init
- info from Py_Main()
- state tied to the main thread (and/or main interpreter)
- signal handlers
- tracemalloc
- faulthandler
- the set of interpreters
- (stateful) resources shared by all interpreters
- allocators
- the GIL
- "gilstate"
- imports
- the global import lock
- some extension module metadata
- signal default "handlers"
- other signal state
- perf trampoline (ceval)
- cached objects
- object versions/indices (e.g. dict version)
- effectively const, statically initialized global objects (e.g. small ints, empty tuple)
- diagnostics (e.g. stats, audit hooks)
(Some of that state will be moving to PyInterpreterState, for better
isolation between interpreters. Some will need to be protected
by granular locks, if the GIL is moved.)
Most of the runtime data/state used to be found in C global variables,
spread throughout the codebase. Starting in 2018, a large portion of
them have been moved to _PyRuntimeState. Almost none of the
remaining global variables are tied to the runtime lifecycle. All are
either const or effectively const, with only a few stateful ones.
As with _PyRuntime, the remaining global variables are shared by all
interpreters, though not all variables are used by all interpreters.
- , which are mostly not shared or are effectively const
A detailed break-down:
- effectively const info about the process's execution environment
- set once before/during the first Py_Initialize()
- set once lazily
- effectively const, domain-specific information
- set once lazily
- effectively const Python-specific information (e.g. version string)
- effectively const data set with each runtime init (mostly legacy C-API symbols)
- state tied to Py_Main() or the REPL
- state tied to the main thread (and/or main interpreter)
- one-off temporary state
See Tools/c-analyzer/cpython/ignored.tsv for the full (categorized) list.
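The "set once lazily" category can be illustrated with a minimal sketch of a cached, deterministic probe. All names here (`demo_probe_feature`, `demo_feature_works`, and so on) are hypothetical. The point is only that a value which every caller would compute identically is "effectively const" once set; a strictly conforming multi-threaded version would use an atomic or a lock rather than a plain int:

```c
#include <assert.h>

/* Hypothetical probe: stands in for an expensive but deterministic check of
   the process's execution environment. */
int demo_probe_calls = 0;

int demo_probe_feature(void)
{
    demo_probe_calls++;
    return 1;   /* always the same answer within one process */
}

/* -1 = not yet computed, 0 = unavailable, 1 = available. */
int demo_feature_works = -1;

int demo_feature_available(void)
{
    if (demo_feature_works < 0) {
        /* Two callers could both reach this point, but both would store the
           same value, so the cached result never changes once it is set. */
        demo_feature_works = demo_probe_feature();
    }
    assert(demo_feature_works == 0 || demo_feature_works == 1);
    return demo_feature_works;
}
```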
A detailed break-down:
- file descriptors
- env vars
- signals
- ...
The PyInterpreterState Definition
While _PyRuntimeState focuses on shared resources used at run-time,
PyInterpreterState encapsulates the actual execution environment of
the CPython runtime. Furthermore, the distinction allows for running
multiple execution environments concurrently (and potentially in parallel)
which rely on the shared resources provided by _PyRuntimeState.
At a high level, the interpreter holds most of the "global" state we
associate with a Python program. At the same time, it defers the state
for executing Python code (the eval loop) to PyThreadState, supporting
multiple threads at a time.
A detailed break-down:
- interpreter metadata
- the set of Python threads
- GC state
- runtime hooks (e.g. dict watchers, custom eval loop, atexit, auditing)
- warnings state
- import state (e.g. sys.modules)
- ceval state (e.g. eval breaker, pending calls)
- codecs
- object state and caches
Note that some state will be moving here from _PyRuntimeState.
The thread state is focused on the execution of Python code in an OS thread (i.e. a single running eval loop).
A detailed break-down:
- thread metadata
- recursion state
- tracing/profiling state
- current exception (handled + unhandled)
- an "async" exception to raise
- context state
- the Python code execution stack (incl. current frame)
As noted above, we took some time to consolidate nearly all the runtime
state into _PyRuntimeState. The subsequent step is to move some of
that state (as much as makes sense) down to PyInterpreterState.
The original motivation for moving pieces of the runtime state into the interpreter state was to support subinterpreters more fully. Most of the necessary changes are also beneficial outside of subinterpreter support: they reduce cognitive load, unify how runtime state is handled, and solve some currently open bugs (which are referenced and identified later).
We'll break down all of the pieces of _PyRuntimeState,
along with the decisions as to where each piece should live
(in runtime state, interpreter state, or somewhere in between).
(expand)
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
---|---|---|---|---|---|---|---|
_initialized preinitializing preinitialized core_initialized initialized _finalizing | X | no | (lifecycle) | ||||
exitfuncs nexitfuncs | X | X | no | (lifecycle) race on adding | |||
preconfig | X | no | |||||
orig_argv | X | no | |||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
allocators .standard allocators .debug | ??? | no | |||||
allocators .obj_arena | no | default is thread-safe but PyMem allocator requires GIL | |||||
obmalloc .dump_debug_stats | no | no race since op is idempotent and data is bool | |||||
obmalloc .pools obmalloc .mgmt obmalloc .usage | X | YES | allocator API promises thread-safety, so no races | ||||
pyhash_state | X | no | |||||
time | X | no | |||||
threads | no | ||||||
main_thread | X | no | |||||
signals .handlers | X | X | no | ||||
signals .wakeup | X | no | |||||
signals .is_tripped | no | atomic; only set in main thread | |||||
signals .default_handler signals .ignore_handler | X | X | YES | ||||
signals .unhandled_keyboard_interrupt | X | ||||||
open_code_hook open_code_userdata | X | no | set by embedders | ||||
audit_hook_head | X | no | PySys_AddAuditHook() races | ||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
interpreters | no | ||||||
xidregistry | X | no | |||||
_main_interpreter | X | no | |||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
imports .inittab | X | no | |||||
imports .lock | X | YES | |||||
imports .last_module_index | X | no | for now, will always need a global lock (GIL or granular) | ||||
imports .extensions | X | X | no | no | for now, will always need a global lock (GIL or granular) | ||
imports .pkgcontext | X | no | no | for now, will always need a global lock (GIL or granular) | |||
imports .find_and_load | ??? | X | YES | ||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
ceval .perf | X | no* | no | could be per-interpreter or restricted to main interp but, for now, we'll use a global lock | |||
ceval .signals_pending | X | no | |||||
ceval .gil | X | YES | PEP 684 | ||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
gilstate .check_enabled | X | no | no | maybe can be eliminated | |||
gilstate .tstate_current | X | X | YES | a different solution may be necessary (gil.last_holder ?) | |||
gilstate .autoInterpreterState | X* | no | no | always the main interpreter; a different solution may be necessary | |||
gilstate .autoTSSkey | X | no | no | ||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
parser .memo_statistics | no | only atomic incr; only used for parser development | |||||
getargs | no | ||||||
dtoa | X | X | X | ||||
fileutils .force_ascii | X | no | |||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
faulthandler .fatal_error | X | X | X* | no | could have some per-interpreter state | ||
faulthandler .thread | X | X | X* | no | could have all per-interpreter state | ||
faulthandler .user_signals | X | X | X* | no | could have all per-interpreter state | ||
faulthandler .stack faulthandler .old_stack | X | no | |||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
tracemalloc .config | ??? | ??? | ????? | ||||
tracemalloc .allocators | ??? | ??? | ????? | ||||
tracemalloc .tables_lock | ??? | ??? | ????? | race on creation? | |||
tracemalloc .traced_memory tracemalloc .peak_traced_memory | ??? | ????? | |||||
tracemalloc .filenames | ??? | X | X | ????? | |||
tracemalloc .traceback | ??? | X | X | ????? | |||
tracemalloc .tracebacks | ??? | X | X | ????? | |||
tracemalloc .traces | ??? | X | ????? | ||||
tracemalloc .domains | ??? | X | ????? | ||||
tracemalloc .empty_traceback | ??? | X | X | no | |||
tracemalloc .reentrant_key | ??? | X | no | ||||
member | main | "const" | GIL protects | object | interp oriented | move? | notes |
float_state .float_format float_state .double_format | X | no | |||||
unicode_state .ids | no | ||||||
dict_state .global_version dict_state .next_keys_version | X | ??? | ????? | ||||
func_state .next_version | X | ??? | ????? | ||||
types .next_version_tag | X | ??? | ????? | ||||
cached_objects .str_replace_inf | X | X | X | X | YES | ||
cached_objects .interned_strings | X | X | X | YES | |||
static_objects | X | X | no |
file (func) | variable | main | "const" (lazy) | GIL protects | object | interp oriented | move? | notes |
---|---|---|---|---|---|---|---|---|
Modules/posixmodule.c (os_dup2_impl) | dup3_works | X | X | no | the only race is harmless | |||
Objects/longobject.c (long_from_non_binary_base) | log_base_BASE convwidth_base convmultmax_base | X | ??? | no | ||||
Objects/unicodeobject.c | bloom_linebreak | X | no | |||||
Parser/action_helpers.c (_PyPegen_dummy_name) | cache | X | X | X | no | need to statically define in _PyRuntime | ||
Modules/syslogmodule.c | S_ident_o | X | X | no | ||||
Modules/syslogmodule.c | S_log_open | X | no | |||||
Objects/object.c | _Py_RefTotal | X | X | YES | (more of a copy than a move) | |||
Modules/faulthandler.c (faulthandler_dump_traceback) | reentrant | ??? | no | no race since only called via signal handlers | ||||
Python/pylifecycle.c (_Py_FatalErrorFormat) | reentrant | no | effectively no races | |||||
Python/pylifecycle.c (fatal_error) | reentrant | no | effectively no races |
Immortal:
- ...
Should be immortal:
- PyModuleDef objects
Other needs:
- we need a new global import lock to be used around all relevant import-related C-API
(old notes)
int initialized;
int core_initialized;
PyThreadState *finalizing;
[ref]
Location Decision: Runtime
struct pyinterpreters {
    PyThread_type_lock mutex;
    PyInterpreterState *head;
    PyInterpreterState *main;
    /* _next_interp_id is an auto-numbered sequence of small
       integers. It gets initialized in _PyInterpreterState_Init(),
       which is called in Py_Initialize(), and used in
       PyInterpreterState_New(). A negative interpreter ID
       indicates an error occurred. The main interpreter will
       always have an ID of 0. Overflow results in a RuntimeError.
       If that becomes a problem later then we can adjust, e.g. by
       using a Python int. */
    int64_t next_id;
} interpreters;
[ref]
Location Decision: Runtime State
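The ID scheme described in the struct comment (sequential IDs starting at 0 for the main interpreter, a negative value signalling an error, overflow treated as a failure) could be sketched as follows. The `demo_*` names are hypothetical, and the mutex that guards the real struct is omitted here for brevity:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the `interpreters` member. The real struct also
   holds a mutex guarding these fields; it is left out of this sketch. */
typedef struct {
    int64_t next_id;
} demo_interpreters;

void demo_interpreters_init(demo_interpreters *interps)
{
    assert(interps != NULL);
    interps->next_id = 0;   /* the first (main) interpreter gets ID 0 */
}

/* Return a fresh ID, or -1 if the sequence is exhausted or corrupted. */
int64_t demo_next_interp_id(demo_interpreters *interps)
{
    assert(interps != NULL);
    if (interps->next_id < 0 || interps->next_id == INT64_MAX) {
        return -1;  /* the caller would report this as a RuntimeError */
    }
    return interps->next_id++;
}
```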
struct _xidregistry {
PyThread_type_lock mutex;
struct _xidregitem *head;
} xidregistry;
[ref]
Location Decision: Runtime State
void (*exitfuncs[NEXITFUNCS])(void);
int nexitfuncs;
[ref]
Location Decision: ?
Motivations and Notes
atexit module: executed when the runtime ends, calling all exit functions that have been defined before runtime clean-up occurs. This should probably make use of (or emulate the behavior of) how pending calls are processed.
atexit functions should be called at the end of the process, not the interpreter. They also must be tied to the subinterpreter that owns the objects on which the exit functions must be called.
Related to
atexit should thus be called per interpreter at process finalization.
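A fixed-size table of exit functions like `exitfuncs`/`nexitfuncs` can be sketched as below. This is an illustration under assumptions, not the actual implementation: the capacity, the `demo_*` names, and the last-registered-first call order are choices made for this sketch. Passing the state explicitly shows how one such table could be kept per interpreter rather than once per runtime:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NEXITFUNCS 32   /* capacity chosen for this sketch */

typedef struct {
    void (*exitfuncs[DEMO_NEXITFUNCS])(void);
    int nexitfuncs;
} demo_exit_state;

/* Register a function; returns 0 on success, -1 if the table is full. */
int demo_at_exit(demo_exit_state *st, void (*func)(void))
{
    assert(st != NULL && func != NULL);
    if (st->nexitfuncs >= DEMO_NEXITFUNCS) {
        return -1;
    }
    st->exitfuncs[st->nexitfuncs++] = func;
    return 0;
}

/* Call registered functions, most recently registered first, popping each
   one before it runs so that no function is ever called twice. */
void demo_call_exitfuncs(demo_exit_state *st)
{
    assert(st != NULL);
    while (st->nexitfuncs > 0) {
        void (*func)(void) = st->exitfuncs[--st->nexitfuncs];
        st->exitfuncs[st->nexitfuncs] = NULL;
        func();
    }
}

/* Example callbacks that record the order in which they ran. */
int demo_order[2];
int demo_order_len = 0;
void demo_first(void)  { demo_order[demo_order_len++] = 1; }
void demo_second(void) { demo_order[demo_order_len++] = 2; }
```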
struct _gc_runtime_state gc;
[ref]
Location Decision: Interpreter State
Motivations and Notes If the garbage collector is indeed separate from memory allocation (we need to do some research on this), this can and should be moved to the interpreter state.
_pyruntime holds static variables that reference fields at runtime. Access sites must change to be relative to the interpreter.
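What "relative to the interpreter" means for an access site can be shown with a minimal sketch. The structs and the `demo_gc_track` helper are hypothetical and far simpler than the real GC state; the sketch only contrasts reaching state through a process-wide global with reaching it through the current interpreter:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified GC state (a single counter only). */
typedef struct {
    int enabled;
    int count;      /* objects tracked since the last collection */
} demo_gc_state;

typedef struct {
    demo_gc_state gc;   /* after the move: one GC state per interpreter */
} demo_interp_state;

/* Before the move, an access site would touch a process-wide global, e.g.
       demo_runtime.gc.count++;
   After the move, it goes through the interpreter it is running in. */
void demo_gc_track(demo_interp_state *interp)
{
    assert(interp != NULL);
    if (interp->gc.enabled) {
        interp->gc.count++;
    }
}
```

With this shape, two interpreters accumulate independent counts, which is the isolation the move is meant to provide.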
struct _warnings_runtime_state warnings;
[ref]
Location Decision: Runtime state with local tracking
Motivations and Notes There was discourse on how to approach this, landing in two alternatives: warnings should be specific to an interpreter, or at least tracked on a per-interpreter level, to protect objects.
struct _ceval_runtime_state ceval;
[ref]
struct _ceval_runtime_state {
    int recursion_limit;
    /* Records whether tracing is on for any thread. Counts the number
       of threads for which tstate->c_tracefunc is non-NULL, so if the
       value is 0, we know we don't have to check this thread's
       c_tracefunc. This speeds up the if statement in
       PyEval_EvalFrameEx() after fast_next_opcode. */
    int tracing_possible;
    /* This single variable consolidates all requests to break out of
       the fast path in the eval loop. */
    _Py_atomic_int eval_breaker;
    /* Request for dropping the GIL */
    _Py_atomic_int gil_drop_request;
    struct _pending_calls pending;
    struct _gil_runtime_state gil;
};
[ref]
Location Decision: --
Motivation and Notes GIL must move down to interpreter state.
- Recursion Limit - recursion_limit (L31)
  - Move down to interpreter state to support debugging (e.g. setting a recursion_limit on the subinterpreter), but is this really necessary? Something we can defer to later without much trouble, at least.
  - Location Decision: Runtime state
- Tracing - tracing_possible (L37)
  - Another good tool for debugging -- this could have both a runtime state setting as well as a per-interpreter setting in order to support tracing where needed
  - Location Decision: Runtime and interpreter state
- Eval Breaker - eval_breaker (L40)
  - This is an optimization in the ceval loop. On each iteration of the ceval loop (or possibly something like every nth time, need to double check this), it checks a few different conditions to decide whether to continue or not.
  - Location Decision: Interpreter state
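The eval breaker idea, a single flag that consolidates every request to leave the fast path, can be sketched as follows. This is a simplified illustration with hypothetical `demo_*` names; it checks the flag on every iteration, which may not match how often the real loop checks, and it uses C11 atomics in place of CPython's own atomic wrappers:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the ceval state described above. */
typedef struct {
    atomic_int eval_breaker;      /* 1 if any slow-path request is pending */
    atomic_int gil_drop_request;  /* one possible request */
    atomic_int calls_to_do;       /* another: pending calls are queued */
    int slow_path_entries;        /* bookkeeping for this sketch only */
} demo_ceval_state;

void demo_ceval_init(demo_ceval_state *ceval)
{
    assert(ceval != NULL);
    atomic_init(&ceval->eval_breaker, 0);
    atomic_init(&ceval->gil_drop_request, 0);
    atomic_init(&ceval->calls_to_do, 0);
    ceval->slow_path_entries = 0;
}

/* Recompute the single flag from the individual requests. */
void demo_compute_eval_breaker(demo_ceval_state *ceval)
{
    atomic_store(&ceval->eval_breaker,
                 atomic_load(&ceval->gil_drop_request)
                 || atomic_load(&ceval->calls_to_do));
}

void demo_request_gil_drop(demo_ceval_state *ceval)
{
    atomic_store(&ceval->gil_drop_request, 1);
    demo_compute_eval_breaker(ceval);
}

/* Run n simulated instructions; the fast path tests only one flag. */
void demo_eval_loop(demo_ceval_state *ceval, int n)
{
    assert(ceval != NULL && n >= 0);
    for (int i = 0; i < n; i++) {
        if (atomic_load(&ceval->eval_breaker)) {
            /* Slow path: service every pending request, then clear them. */
            ceval->slow_path_entries++;
            atomic_store(&ceval->gil_drop_request, 0);
            atomic_store(&ceval->calls_to_do, 0);
            demo_compute_eval_breaker(ceval);
        }
        /* ... dispatch one instruction here ... */
    }
}
```

Because the loop only reads `eval_breaker` on the fast path, where that flag lives (runtime or interpreter state) determines which loops are interrupted by a given request.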
struct _gilstate_runtime_state gilstate;
[ref]
Location Decision: Interpreter state, or runtime state with local tracking.
Motivations and Notes
The current gilstate allows for use of existing threads.
gilstate must still allow locking for communication between subinterpreters. See related BPOs.
Related to:
TBD
TBD
TBD