You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Сборка мусора: гибридный подход между счётчиком ссылок и подходом без него.
Идея: вводим
Идея: счётчик ссылок остаётся, но его ненулевое значение в
Hybrid Garbage Collector
This PEP proposes to add the tracing tri-color garbage collector with accounting for nonzero refcount.
Abstract
Currently the CPython runtime manages object lifetime using the reference counter.
The proposal covers CPython internals only. It does not define any public API; it may be done by a separate PEP after:
Tuning of the implementation to achieve maximum performance; internal API may be breken here
Wide public discussion across maintainers of third party modules and alternative Python implementations.
Implementation Summary
This PEP introduces two parallel systems to track a lifetime of :class:PyObject\ s:
the tracing garbage collector with tri-color marking as a primary one
the reference counter for compatibility with existing third-party code
To not allow a false positive destruction of an object, these systems are kept in sync:
when an object is created, the collector clones a strong reference with Py_NewRef and writes the clone into its reachability forest
if another run of the collector states that the object is no longer reachable
... and refcount==1 (the collector is the only holder of the object), the collector calls Py_DECREF thus destructing the object
... and refcount>1 (there is some other C-API holder), the collector reassigns the object to a special Python object thus keeping the object and its referents available for next scans
while the garbage collector system sees the object as reachable, the collector keeps an extra reference in the reference counter.
a verdict of the collector is maintained as an extra "reference" in the reference counter
any other reference in the reference counter is maintained as an ingoing reference from some ephemeral global object
the garbage collector aquires an extra refcount reference when the object becomes reachable in its tree, and releases the refer
for objects seen by the new garbage collector as reachable, the old reference counter is increased by one. It is required for :func:Py_INCREF macro u both
However, there is code both outside and inside Python that use for low-level decrease
of the reference counter and destruction. To accomodate it, the following synchronization i
for objects with the refcounter
reachability in the garbage collector tree affects the refcounter as an extra reference
any other refcounter references are treated as an ingoing reference from some ephemeral global object
an object reachable through the garbage collector tree
For the whole
Here’s a high-level look at the implementation:
A lifetime is determined using tracing garbage collection with tri-color marking:
each object belongs to one of three states: unreachable ("while"), reachable with no outgoing references ("black"), and reachable with no outgoing references ("gray")
.. further description of the algorithm
!!!ref1:
To account third party code directly accessing the reference counter, each object also has a reference count with...
if we initialize with zero, we get destruction bu third party code and use-after-free reports
so we MUST initialize with 1 meaning "referr".
!!!
For CPython, a non-zero value of a reference counter is treated as an ingoing reference from some ephemeral global object. , its value .
We maintain three per-interpreter sets: white (for removal), black (reachable from a current python code scope),
The collector shall be not called when execution is inside module or object C code. In other words, the collector
may be called from inside of an interpreter loop only.
The reachability tree starts from the current function scope. As a result, all literals need to be fields of this global object to not be garbage collected.
What this PEP does not cover
removal of a refcount field
memory mobility and generations
retirement of the :opt:--with-trace-refs build; objects created by the old way still need to be accounted
It's just about determination whether an object can be destroyed, i. e. lifetime determination.
Motivation
cyclic references (plus the whole Lib/test/leakers can be removed)
faster garbage collection: show numbers
removal of multiple PY_INCREF and PY_DECREF
removal of the whole concept of immortal objects - both objects and the corresponding affinity is untouched if we ling them
removal of the whole concept of a borrowed link
from an invisible field of a python context
can postpone memory reclamation to free memory outside of tight, performance critical loops, in bulk
check if incref/decref takes the GIL
check when we may reclaim resources
freeing memory is not cheap
ideally we make the reclamation once, when the program ends
find a balance between used memory included free heap blocks and invocation
maybe: may move to fixed-chunk linked-list custom memory allocator
starting with Python 4:
:opt:--with-trace-refs build gets simplified by reusing the reference tree instead of maintaining its own PyObject._ob_prev/PyObject._ob_next machinery
Specification
The approach involves these fundamental changes:
(add a sweeper class)
(add a timer to run the sweep in a main thread)
(extend :opt:--with-trace-refs to scan both its own PyObject._ob_next/PyObject._ob_prev and the reachability tree)
Reachability tree is a child-sibling tree with the following fields:
struct _Py_ReachabilityTreeNode {
uint32_t first_child_id; // array ID to allow tree expansion with reallocation
uint32_t next_sibling_id; // array ID to allow tree expansion with reallocation
PyObject *object_sibling;
};
The tree nodes are stored in an array of the following structure:
union {
_Py_ReachabilityTreeNode node;
uint32_t free_node_next;
}
For C API objects we have no access to their internal structure but have a visitor
slot tp_traverse PyTypeObject.structseq_traverse that we can use instead. Indeed,
we need to store status of each object (white/black/gray). The white set can be stored
as a single linked list between all elements to be removed; the black set is not
explicitly marked; the gray list can be stored in the same single linked list but with
another start pointer.
Backward compatibility
No API or ABI change of the public interface.
Compatible changes of the private interface -- what's about interpreter class?
Changes in observable behavior:
debuggers and leak detectors report underestimated reference count; however a living object will always have a positive refcount
Reference implementation
Migration plan
Right after this PEP is implemented, nothing changes:
all objects including CPython-created ones continue to rely on a reference counter
the garbage collector works in the background; it immediately binds the objects to the special root node
the safety net of one-off reference counter has no use yet
Then, as <refcounting methods https://docs.python.org/3/c-api/refcounting.html>_ and comments on borrowed refs are removed,
the safety net of the one-off reference starts to take more and more load until CPython stops touching the reference counter altogether.
However, C API working with stolen references must be reconsidered:
reduce usage of stolen references as mush as possible
call :func:Py_DECREF after passing the stolen reference to the garbage collector
Todo: externally created objects that do not support traverse
Rejected ideas
Getting rid of reference counters
Using PyObject._ob_prev/PyObject._ob_next -
Available in :opt:--with-trace-refs builds only that are binary incompatible with the release build
Until the reference count is retired, :opt:--with-trace-refs further need a way to account them
faster-cpython/ideas#376
pep-9999-hybrid-garbage-collector.txt
Сборка мусора: гибридный подход между счётчиком ссылок и подходом без него.
Идея: вводим
Идея: счётчик ссылок остаётся, но его ненулевое значение в
Hybrid Garbage Collector
This PEP proposes to add the tracing tri-color garbage collector with accounting for nonzero refcount.
Abstract
Currently the CPython runtime manages object lifetime using the reference counter.
The proposal covers CPython internals only. It does not define any public API; it may be done by a separate PEP after:
Implementation Summary
This PEP introduces two parallel systems to track a lifetime of :class:
PyObject
\ s:To not allow a false positive destruction of an object, these systems are kept in sync:
when an object is created, the collector clones a strong reference with Py_NewRef and writes the clone into its reachability forest
if another run of the collector states that the object is no longer reachable
while the garbage collector system sees the object as reachable, the collector keeps an extra reference in the reference counter.
a verdict of the collector is maintained as an extra "reference" in the reference counter
any other reference in the reference counter is maintained as an ingoing reference from some ephemeral global object
the garbage collector aquires an extra refcount reference when the object becomes reachable in its tree, and releases the refer
for objects seen by the new garbage collector as reachable, the old reference counter is increased by one. It is required for :func:
Py_INCREF
macro u bothHowever, there is code both outside and inside Python that use for low-level decrease
of the reference counter and destruction. To accomodate it, the following synchronization i
for objects with the refcounter
reachability in the garbage collector tree affects the refcounter as an extra reference
any other refcounter references are treated as an ingoing reference from some ephemeral global object
an object reachable through the garbage collector tree
For the whole
Here’s a high-level look at the implementation:
A lifetime is determined using tracing garbage collection with tri-color marking:
!!!ref1:
To account third party code directly accessing the reference counter, each object also has a reference count with...
For CPython, a non-zero value of a reference counter is treated as an ingoing reference from some ephemeral global object. , its value .
We maintain three per-interpreter sets: white (for removal), black (reachable from a current python code scope),
The collector shall be not called when execution is inside module or object C code. In other words, the collector
may be called from inside of an interpreter loop only.
The reachability tree starts from the current function scope. As a result, all literals need to be fields of this global object to not be garbage collected.
What this PEP does not cover
--with-trace-refs
build; objects created by the old way still need to be accountedIt's just about determination whether an object can be destroyed, i. e. lifetime determination.
Motivation
Lib/test/leakers
can be removed)from an invisible field of a python context
--with-trace-refs
build gets simplified by reusing the reference tree instead of maintaining its ownPyObject._ob_prev
/PyObject._ob_next
machinerySpecification
The approach involves these fundamental changes:
--with-trace-refs
to scan both its ownPyObject._ob_next
/PyObject._ob_prev
and the reachability tree)Reachability tree is a child-sibling tree with the following fields:
The tree nodes are stored in an array of the following structure:
modifications of https://docs.python.org/3/library/gc.html here
For C API objects we have no access to their internal structure but have a visitor
slot tp_traverse PyTypeObject.structseq_traverse that we can use instead. Indeed,
we need to store status of each object (white/black/gray). The white set can be stored
as a single linked list between all elements to be removed; the black set is not
explicitly marked; the gray list can be stored in the same single linked list but with
another start pointer.
Backward compatibility
No API or ABI change of the public interface.
Compatible changes of the private interface -- what's about interpreter class?
Changes in observable behavior:
Reference implementation
Migration plan
Right after this PEP is implemented, nothing changes:
Then, as <refcounting methods https://docs.python.org/3/c-api/refcounting.html>_ and comments on borrowed refs are removed,
the safety net of the one-off reference starts to take more and more load until CPython stops touching the reference counter altogether.
However, C API working with stolen references must be reconsidered:
Py_DECREF
after passing the stolen reference to the garbage collectorTodo: externally created objects that do not support traverse
Rejected ideas
Getting rid of reference counters
Using PyObject._ob_prev/PyObject._ob_next -
--with-trace-refs
builds only that are binary incompatible with the release build--with-trace-refs
further need a way to account themhttps://twitter.com/VictorStinner/status/1481783100191428611
The text was updated successfully, but these errors were encountered: