Once the work to allow any object as the "code" of a frame is done, we can take advantage of that to speed up creation of code objects from serialized data.
The idea is that the serialized data will consist of two parts:
A sequence of immutable bytecode
Supporting binary data.
Creation of the top-level (module) code object would be done as follows:
Create a "module initializer" object, consisting of a pointer to the binary data and debug info like the name and filename.
Create a frame, setting the "code" field to the module initializer and setting the instruction pointer to the start of the bytecode.
Start executing in the interpreter.
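The three steps above can be sketched in plain Python. This is only a toy model, not CPython's frame layout: `ModuleInitializer`, `Frame`, `run`, and the `(opname, arg)` instruction encoding are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleInitializer:
    """Hypothetical stand-in for the proposed "module initializer" object."""
    name: str
    filename: str
    data: bytes      # supporting binary data
    offset: int = 0  # read position within `data`


@dataclass
class Frame:
    code: object         # any object; here a ModuleInitializer
    instructions: tuple  # immutable bytecode, modelled as (opname, arg) pairs
    ip: int = 0          # index of the next instruction
    stack: list = field(default_factory=list)


def run(frame):
    """Toy dispatch loop with just enough opcodes to show the control flow."""
    while True:
        op, arg = frame.instructions[frame.ip]
        frame.ip += 1
        if op == "LOAD_INT":
            frame.stack.append(arg)
        elif op == "BUILD_TUPLE":
            start = len(frame.stack) - arg
            items = tuple(frame.stack[start:])
            del frame.stack[start:]
            frame.stack.append(items)
        elif op == "RETURN_VALUE":
            return frame.stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# Step 1: create the initializer. Step 2: create the frame. Step 3: execute.
init = ModuleInitializer(name="<module>", filename="example.py", data=b"")
frame = Frame(
    code=init,
    instructions=(
        ("LOAD_INT", 1),
        ("LOAD_INT", 2),
        ("BUILD_TUPLE", 2),
        ("RETURN_VALUE", None),
    ),
)
result = run(frame)
```

The point of the sketch is that the frame's "code" slot holds something other than a code object while the ordinary interpreter loop does the deserialization work.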
What are the advantages of this?
Marshal is slow
There is no need for a secondary interpreter (marshal)
It allows partial deep-freezing: the names and consts arrays can be deep-frozen without requiring that the code object itself be deep-frozen. The resulting constant can be loaded with LOAD_COMMON_CONST.
It allows further improvements, e.g. we could skip creating a code object for the module, just creating them for functions.
It decouples the pyc format from marshal, allowing them to be improved separately.
Common objects can be shared very efficiently, by leaving them on the stack and using COPY instead of MAKE_...
Creating the instruction sequence
We can create the instruction sequence in much the same way as marshal serializes: recursively emitting code for sub-objects until the entire object is complete.
To do this, we will need some new instructions and a few new intrinsics.
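A minimal sketch of such a recursive emitter, under stated assumptions: the `emit` function, the `(opname, arg)` instruction encoding, the contents of `COMMON_CONSTS`, the small-int range, and the length-prefixed binary layout are all invented for illustration and are not part of the proposal.

```python
import struct

# Hypothetical contents of the global array of common constants.
COMMON_CONSTS = (None, True, False)


def emit(obj, code, data):
    """Append instructions to `code` and raw bytes to `data` for `obj`."""
    if obj is None or isinstance(obj, bool):
        # bool is checked before int because bool is a subclass of int.
        idx = next(i for i, c in enumerate(COMMON_CONSTS) if c is obj)
        code.append(("LOAD_COMMON_CONST", idx))
    elif isinstance(obj, int) and -256 <= obj < 256:  # assumed small-int range
        code.append(("LOAD_INT", obj))
    elif isinstance(obj, float):
        data.extend(struct.pack("<d", obj))
        code.append(("MAKE_FLOAT", None))
    elif isinstance(obj, str):
        raw = obj.encode("utf-8")
        data.extend(struct.pack("<I", len(raw)) + raw)
        code.append(("MAKE_STRING", None))
    elif isinstance(obj, tuple):
        # Emit every sub-object first, then the instruction that combines them.
        for item in obj:
            emit(item, code, data)
        code.append(("BUILD_TUPLE", len(obj)))
    else:
        raise TypeError(f"not covered by this sketch: {type(obj).__name__}")


code, data = [], bytearray()
emit((1, None, ("a",)), code, data)
```

Sub-objects are emitted before the instruction that consumes them, so the resulting sequence is in the post-order a stack machine expects.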
New general purpose instructions:
LOAD_COMMON_CONST Loads a constant from the global array containing None, True, etc., plus assorted common constants
LOAD_COMMON_NAME Like LOAD_COMMON_CONST, but from an array of strings.
LOAD_INT Loads a small int
Instructions to create objects from binary data.
These instructions will create an object from the binary data, advancing the pointer.
MAKE_FLOAT
MAKE_STRING
MAKE_LONG (we could build large ints from small ints, but that would be quadratic)
MAKE_BYTES
MAKE_CODE: Creates a code object from values on the stack (name, qualname, names, consts) and binary data
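To illustrate "create an object from the binary data, advancing the pointer", here is a small sketch. The `BinaryReader` class, the `sized` helper, and the encoding (little-endian, 4-byte length prefixes, signed little-endian bytes for large ints) are assumptions made for this example; the proposal does not specify a layout.

```python
import struct


class BinaryReader:
    """Hypothetical cursor over the supporting binary data."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, n):
        chunk = self.data[self.pos:self.pos + n]
        if len(chunk) != n:
            raise ValueError("truncated binary data")
        self.pos += n  # every read advances the pointer
        return chunk

    def _sized(self):
        (n,) = struct.unpack("<I", self.take(4))
        return self.take(n)

    def make_float(self):
        return struct.unpack("<d", self.take(8))[0]

    def make_bytes(self):
        return bytes(self._sized())

    def make_string(self):
        return self._sized().decode("utf-8")

    def make_long(self):
        # Converts the whole byte run in one call rather than assembling
        # the value from small ints piece by piece.
        return int.from_bytes(self._sized(), "little", signed=True)


def sized(raw):
    """Prefix `raw` with its length, matching BinaryReader._sized."""
    return struct.pack("<I", len(raw)) + raw


blob = (
    struct.pack("<d", 37.0)
    + sized(b"foo")
    + sized("foo".encode("utf-8"))
    + sized((2 ** 100).to_bytes(16, "little", signed=True))
)
reader = BinaryReader(blob)
values = [
    reader.make_float(),
    reader.make_bytes(),
    reader.make_string(),
    reader.make_long(),
]
```

Because each instruction consumes its bytes in order, the instructions need no operands that index into the data.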
New intrinsic functions
make_complex (2)
make_frozenset (1)
We already have an instruction for making tuples.
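As a rough illustration, the two intrinsics could behave as below. This assumes that the numbers in parentheses are operand counts, so `make_complex` consumes two stack values and `make_frozenset` consumes one (taken here to be a tuple of the members). The `call_intrinsic` dispatcher and the lookup tables are hypothetical.

```python
def make_complex(real, imag):
    return complex(real, imag)


def make_frozenset(items):
    return frozenset(items)


# Hypothetical lookup tables, keyed by intrinsic name and grouped by arity.
INTRINSICS_1 = {"make_frozenset": make_frozenset}
INTRINSICS_2 = {"make_complex": make_complex}


def call_intrinsic(stack, name):
    """Pop the operands, call the intrinsic, and push the result."""
    if name in INTRINSICS_2:
        second = stack.pop()
        first = stack.pop()
        stack.append(INTRINSICS_2[name](first, second))
    else:
        stack.append(INTRINSICS_1[name](stack.pop()))


stack = [1.5, -2.0]
call_intrinsic(stack, "make_complex")
complex_result = stack.pop()

stack = [(1, 2, 2)]
call_intrinsic(stack, "make_frozenset")
frozenset_result = stack.pop()
```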
The instruction sequence would finish with MAKE_CODE; RETURN_VALUE, returning the completed code object on the stack.
Alternatively, we could add another instruction, START_CODE, at the end to execute the code object and return the completed module.
Creation of a code object would look something like this:
(Code to create names tuple)
(Code to create consts tuple)
MAKE_STRING name
MAKE_STRING qualname
COPY n (filename will be shared for all code objects in module)
MAKE_CODE
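The sharing of the filename through COPY can be modelled with a toy stack machine. In this sketch `ToyCode` stands in for a real code object, `PUSH` stands in for whatever instructions build each value, and the operand order popped by `MAKE_CODE` is assumed from the listing above. `COPY n` follows the existing CPython meaning: push the n-th item from the top of the stack without removing it.

```python
from collections import namedtuple

# Stand-in for a real code object; the field order mirrors the listing above.
ToyCode = namedtuple("ToyCode", "names consts name qualname filename")


def run(instructions):
    stack = []
    for op, arg in instructions:
        if op == "PUSH":  # placeholder for MAKE_STRING, BUILD_TUPLE, etc.
            stack.append(arg)
        elif op == "COPY":
            stack.append(stack[-arg])
        elif op == "MAKE_CODE":
            filename = stack.pop()
            qualname = stack.pop()
            name = stack.pop()
            consts = stack.pop()
            names = stack.pop()
            stack.append(ToyCode(names, consts, name, qualname, filename))
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack


# Build the filename at run time so the identity check below is meaningful.
filename = "".join(["exam", "ple.py"])

stack = run([
    ("PUSH", filename),      # created once and left on the stack
    ("PUSH", ("print",)),    # names
    ("PUSH", (None,)),       # consts
    ("PUSH", "f"),           # name
    ("PUSH", "f"),           # qualname
    ("COPY", 5),             # filename is five items down
    ("MAKE_CODE", None),
    ("PUSH", ()),            # names
    ("PUSH", (None, 1)),     # consts
    ("PUSH", "g"),           # name
    ("PUSH", "C.g"),         # qualname
    ("COPY", 6),             # one deeper now: the first code object sits above it
    ("MAKE_CODE", None),
])
shared, first, second = stack
```

Both code objects end up referring to the same string object, with no table of back-references needed.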
Examples
Creation of the tuple (1, "a", 37.0, (2, "foo"))
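The instruction listing for this example did not survive in the text, so the following is only a guess at what such a sequence might look like, paired with a toy executor to check that it rebuilds the tuple. The `(opname, arg)` encoding, the `pack_str` helper, the `execute` function, and the binary layout are assumptions made for illustration.

```python
import struct


def pack_str(s):
    """Hypothetical string encoding: 4-byte length prefix, then UTF-8 bytes."""
    raw = s.encode("utf-8")
    return struct.pack("<I", len(raw)) + raw


# Supporting binary data, in the order the instructions consume it.
DATA = pack_str("a") + struct.pack("<d", 37.0) + pack_str("foo")

PROGRAM = [
    ("LOAD_INT", 1),
    ("MAKE_STRING", None),   # "a"
    ("MAKE_FLOAT", None),    # 37.0
    ("LOAD_INT", 2),
    ("MAKE_STRING", None),   # "foo"
    ("BUILD_TUPLE", 2),      # (2, "foo")
    ("BUILD_TUPLE", 4),      # the outer tuple
    ("RETURN_VALUE", None),
]


def execute(program, data):
    stack, pos = [], 0
    for op, arg in program:
        if op == "LOAD_INT":
            stack.append(arg)
        elif op == "MAKE_STRING":
            (n,) = struct.unpack_from("<I", data, pos)
            pos += 4
            stack.append(data[pos:pos + n].decode("utf-8"))
            pos += n
        elif op == "MAKE_FLOAT":
            (value,) = struct.unpack_from("<d", data, pos)
            pos += 8
            stack.append(value)
        elif op == "BUILD_TUPLE":
            start = len(stack) - arg
            items = tuple(stack[start:])
            del stack[start:]
            stack.append(items)
        elif op == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")
    raise ValueError("program ended without RETURN_VALUE")


result = execute(PROGRAM, DATA)
```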