IRC channel: #hpy on libera.chat
Mailing list: [email protected]
The goal of the project is to design a better API for extending Python in C. The current API is specific to the current implementation of CPython: it exposes many internal details, which makes it hard:
- to implement it for other Python implementations (e.g. PyPy, GraalPython, Jython, IronPython, etc.). The implementation for CPython is in this repo, but other implementations sometimes have to implement some things in their HPy layer.
- to experiment with new things inside CPython itself, e.g. using a GC instead of refcounting, or removing the GIL
The goal of this project is to improve the situation by designing a new API which solves some of the current problems.
More specifically, the goals include (but are not necessarily limited to):
- to be usable on CPython right now with no (or almost no) performance impact
- to make the adoption incremental: it should be possible to migrate existing C extensions piece by piece and to use the old and the new API side-by-side during the transition
- to provide a better debugging experience: in debug mode, you could get precise notifications about which handles are kept open for too long or used after being closed.
- to be more friendly for other implementations: in particular, we do not want reference counting to be part of the API: we want a more generic way of managing resources which is possible to implement with different strategies, including the existing reference counting and/or with a moving GC (like the ones used by PyPy or Java, for example)
- to be smaller and easier to study/use/manage than the existing one
- to avoid exposing internal details of a specific implementation, so that each implementation can experiment with new memory layouts of objects, add optimizations, etc.
- to be written in a way which could make it possible in the future to have a single binary which is ABI-compatible across multiple Python versions and/or multiple implementations
- internal details might still be available, but in an opt-in way: for example, if Cython wants to iterate over a list of integers, it can ask if the implementation provides direct low-level access to the content (e.g. in the form of an int64_t[] array) and use that, while still being ready to handle the generic fallback case.
- we will write a small C library which implements the new API on top of the existing one: no changes to CPython needed
- PyPy will implement this natively: extensions using this API will be orders of magnitude faster than the ones using the existing old API (see this blog post for details)
- Cython will adopt this from day one: existing Cython programs will benefit from this automatically
- the existing C API is becoming a problem for CPython and for the evolution of the language itself: this project makes it possible to make experiments which might be "officially" adopted in the future
- for PyPy, it will give obvious speed benefits: for example, data scientists will be able to get the benefit of fast C libraries and fast Python code at the same time, something which is hard to achieve now
- the current implementation is too tied to CPython and proved to be a problem for almost all the other alternative implementations. Having an API which is designed to work well on two different implementations will make the job much easier for future ones: going from 2 to N is much easier than going from 1 to 2
- arguably, it will be easier to learn and understand than the massive CPython C API
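The opt-in access to internal details mentioned in the goals above (ask for a raw int64_t[] view, otherwise fall back to generic item access) can be illustrated with a small toy model. This is only a sketch: `ToyList`, `toy_list_try_get_int64_buffer`, `boxed_get_item` and `toy_list_sum` are hypothetical names made up for this example and are not part of HPy or of the CPython C API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy list: NOT an HPy or CPython type. */
typedef struct {
    size_t length;
    /* Non-NULL only if the implementation chooses to expose raw storage. */
    const int64_t *raw_items;
    /* Generic item accessor; used whenever raw storage is unavailable. */
    int64_t (*get_item)(const void *storage, size_t index);
    const void *storage;
} ToyList;

/* A boxed integer, standing in for a generic "object" representation. */
typedef struct { int64_t value; } ToyBoxedInt;

/* Example generic accessor for a list stored as pointers to boxed ints. */
static int64_t boxed_get_item(const void *storage, size_t index) {
    const ToyBoxedInt *const *items = storage;
    return items[index]->value;
}

/* Opt-in query: returns the raw int64_t buffer, or NULL if not provided. */
static const int64_t *toy_list_try_get_int64_buffer(const ToyList *list) {
    return list->raw_items;
}

/* Sum the list, using the fast path when offered and the fallback otherwise. */
static int64_t toy_list_sum(const ToyList *list) {
    assert(list != NULL);
    int64_t total = 0;
    const int64_t *buffer = toy_list_try_get_int64_buffer(list);
    if (buffer != NULL) {
        /* Fast path: the implementation exposed contiguous storage. */
        for (size_t i = 0; i < list->length; i++) {
            total += buffer[i];
        }
    } else {
        /* Generic fallback: go through the accessor for every item. */
        for (size_t i = 0; i < list->length; i++) {
            total += list->get_item(list->storage, i);
        }
    }
    return total;
}
```

The caller never assumes the fast path exists, so an implementation that stores its lists differently only needs to return NULL from the query and provide the generic accessor.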
See also Python Performance: Past, Present, Future by Victor Stinner.
The "H" in HPy stands for "handle": one of the key ideas of the new API is to use fully opaque handles to represent and pass around Python objects.
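To make the idea of opaque handles more concrete, here is a minimal toy model in plain C. It is a sketch under stated assumptions, not HPy's implementation: `ToyHandle`, `ToyContext` and all `toy_*` functions are hypothetical names invented for this example. Extension code only ever sees a small opaque value; the runtime keeps the mapping from handles to objects, which is also what lets a debug mode report handles that were never closed or that are used after being closed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_HANDLES 16
#define TOY_INVALID_ID ((size_t)-1)

/* Opaque to extension code: it carries no pointer to the object itself. */
typedef struct { size_t id; } ToyHandle;

/* Runtime-side table. Slots are never reused here, so stale handles
 * can always be recognized (a debug-mode style simplification). */
typedef struct {
    void  *objects[TOY_MAX_HANDLES];
    int    is_open[TOY_MAX_HANDLES];
    size_t next_slot;
} ToyContext;

static void toy_ctx_init(ToyContext *ctx) {
    assert(ctx != NULL);
    memset(ctx, 0, sizeof *ctx);
}

/* Create a new handle for an object; returns an invalid handle if full. */
static ToyHandle toy_open(ToyContext *ctx, void *object) {
    ToyHandle h = { TOY_INVALID_ID };
    if (ctx->next_slot < TOY_MAX_HANDLES) {
        h.id = ctx->next_slot++;
        ctx->objects[h.id] = object;
        ctx->is_open[h.id] = 1;
    }
    return h;
}

/* Resolve a handle; returns NULL for closed or invalid handles. */
static void *toy_deref(const ToyContext *ctx, ToyHandle h) {
    if (h.id >= ctx->next_slot || !ctx->is_open[h.id]) {
        return NULL;
    }
    return ctx->objects[h.id];
}

/* Duplicate a handle: a new, independent handle to the same object. */
static ToyHandle toy_dup(ToyContext *ctx, ToyHandle h) {
    void *object = toy_deref(ctx, h);
    if (object == NULL) {
        ToyHandle invalid = { TOY_INVALID_ID };
        return invalid;
    }
    return toy_open(ctx, object);
}

/* Close a handle; returns -1 if it was invalid or already closed. */
static int toy_close(ToyContext *ctx, ToyHandle h) {
    if (h.id >= ctx->next_slot || !ctx->is_open[h.id]) {
        return -1;
    }
    ctx->is_open[h.id] = 0;
    return 0;
}

/* Debug helper: how many handles are still open (potential leaks)? */
static size_t toy_count_leaked(const ToyContext *ctx) {
    size_t leaked = 0;
    for (size_t i = 0; i < ctx->next_slot; i++) {
        if (ctx->is_open[i]) {
            leaked++;
        }
    }
    return leaked;
}
```

Because nothing in this interface mentions reference counts or object addresses, the runtime behind it is free to use refcounting, a tracing GC, or a moving GC, as long as it keeps the table up to date.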
Become a financial contributor and help us sustain the HPy community: Contribute to HPy.