Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 689 -- Add an unstable C-API tier #91744

Closed
8 of 19 tasks
encukou opened this issue Apr 20, 2022 · 30 comments
Closed
8 of 19 tasks

PEP 689 -- Add an unstable C-API tier #91744

encukou opened this issue Apr 20, 2022 · 30 comments
Assignees

Comments

@encukou
Copy link
Member

encukou commented Apr 20, 2022

On python-dev, it became clear that it would be useful to have a “semi-stable” “unstable” C-API tier which will stay stable within a minor release.

  • This API will go in a new directory (Include/unstable/) - see Nick's reply
    You'll need to #define Py_USING_UNSTABLE_API.
    (name still up for bikeshedding).

  • Since we're nearing Beta and there's no rush to break things, in 3.11
    you only get a warning if you try to use it without the opt-in #define.
    In 3.12 it'll fail.

  • The functions will be renamed to drop the leading underscore. The old
    names will be available, and may be removed
    whenever the API changes. (Ideally, the underscore should always mark
    API that's fully private with no guarantees at all.)

  • Get SC approval (this is now PEP-689 that will need acceptance)

  • The API will be stable during a minor release. (As usual, for extreme
    cases, exceptions are possible with SC approval.)

  • Docs:

  • All ununstable API should be tested

    • PEP-523 PyCode_* functions: tested in test_code
    • ...

What should be here:

  • Functions added in PEP-523
  • PyCode_New, PyCode_NewWithPosOnlyArgs
  • Ideally anything documented as subject to change between minor
    releases. (To be kind to users, if something is added later we should
    again have one release of compiler warnings before requiring the opt-in.
    Unless that API just changed and users would get errors anyway.)
  • Cinder's dict writing API looks like it would feel at home here: add support for watching writes to selected dictionaries #91052 (comment)
  • PyBytesObject.ob_shash sounds like a good candidate: see https://discuss.python.org/t/15108
  • non-opaque access to frame structs and any other key APIs needed to
    implement alternate eval loops with comparable performance to the default
    eval loop (unless & until we can figure out stable public APIs that can
    deliver equivalent performance) - see Nick's reply
  • C APIs that provide access to compiled code whether in AST or opcode
    form (the API itself may be stable, but the compiled code isn't, so this is
    kinda covered by your last point) - see Nick's reply
  • _Py_HashDouble, see comment below

First PR: #91789

@encukou encukou self-assigned this Apr 20, 2022
@vstinner
Copy link
Member

This API will go in a new directory (Include/semi-stable/) - see Nick's reply

For the directory name, I suggest to omit - or _ and just put the two words as semistable, it's easier to type.

C compilers don't handle well header files with the same names in different directories. For Include/cpython/, all header files fail with #error if they are included directly. IMO we can do the same here to ensure that Include/semistable/ is not added to the search paths by mistake.

Another option is to add a prefix to each file, like pycore_ for the internal C API. For example, Include/internal/pycore_object.h is related to Include/object.h. But that's more annoying and confusing.

I would prefer that C extensions just include the main <Python.h> and that's it (no change). I would prefer that C extensions don't include directly <semistable/frame.h> for example.

@vstinner
Copy link
Member

vstinner commented Apr 20, 2022

UPDATE: I completed my list, I forgot some removed functions.

Document the new opt-in macro

When building Python itself, does the Py_BUILD_CORE macro imply defining the Py_USING_SEMI_STABLE_API macro?

For me, the internal API (Py_BUILD_CORE) is a superset of semi-stable API (Py_USING_SEMI_STABLE_API).

APIs that provide access to compiled code whether in AST or opcode form (the API itself may be stable, but the compiled code isn't, so this is kinda covered by your last point)

In Python 3.10, Python-ast.h was moved to the internal C API as pycore_ast.h, and many API related to AST have been moved to the internal C API:

  • PyST_GetScope()
  • PySymtable_Build()
  • PySymtable_BuildObject()
  • PySymtable_Free()
  • Py_SymtableString()
  • Py_SymtableStringObject()
  • PyAST_Compile()
  • PyAST_CompileEx()
  • PyAST_CompileObject()
  • PyFuture_FromAST()
  • PyFuture_FromASTObject()
  • PyParser_ASTFromFile()
  • PyParser_ASTFromFileObject()
  • PyParser_ASTFromFilename()
  • PyParser_ASTFromString()
  • PyParser_ASTFromStringObject()
  • PyArena_New()
  • PyArena_Free()
  • PyArena_Malloc()
  • PyArena_AddPyObject()
  • see: https://docs.python.org/dev/whatsnew/3.10.html#id4

Are there remaining "AST related" functions in the C API?

Obviously, compile() can still be called with PyCF_ONLY_AST flag, or call ast.parse(). But this API returns Python objects, not C structures.

@vstinner
Copy link
Member

I would be nice to move the PyTypeObject to the semi-stable API, but IMO it's not possible to remove it from the default API because way too many C extensions use it. Modify all these extensions to define Py_USING_SEMI_STABLE_API is just annoying.

@encukou
Copy link
Member Author

encukou commented Apr 20, 2022

OK, I'll do semistable/, and I already have #error in my WIP.

When building Python itself, does the Py_BUILD_CORE macro imply defining the Py_USING_SEMI_STABLE_API macro?

Yes.

Are there remaining "AST related" functions in the C API?

@ncoghlan?

I would be nice to move the PyTypeObject to the semi-stable API

Definitely not in 3.11.

@encukou
Copy link
Member Author

encukou commented Apr 20, 2022

The old underscore-prefixed names can be deprecated static inline functions, rather than #defines as in the original plan.

@ncoghlan
Copy link
Contributor

ncoghlan commented Apr 20, 2022

I forgot (or simply didn't notice) that the AST APIs had been migrated to the internal API.

I still think they'd be good candidates for being declared semi-stable - part of the rationale for the new tier is reducing the temptation for 3rd parties to include the internal API headers, and these were considered useful enough to be public APIs prior to 3.10

@vstinner
Copy link
Member

Which projects use the C API of AST functions removed in Python 3.10? What is the advantage of adding back this removed API?

These functions were undocumented, excluded from the limited C API and had no test. The removal is related to the PEG parser PEP 617 which removed the old parser. The implementation modified the C code and in C, it was no longer possible to create struct _mod (mod_ty) which were arguments of the removed AST functions.

In Python 3.8, you needed the parser to get a node* object to then get a mod_ty object. But the whole C API of the parser has been removed in Python 3.10: the struct _node (node) type has been removed.

In Python 3.10 and 3.11, it's possible to browse AST using the Python API. Just use the Python ast module. IMO it's enough. There is no need to add a public C API for that.

@vstinner
Copy link
Member

The old underscore-prefixed names can be deprecated static inline functions, rather than #defines as in the original plan.

If there is no plan to deprecate the old names, macros are fine. Like the current code in Include/cpython/abstract.h:

// Backwards compatibility aliases for API that was provisional in Python 3.8
#define _PyObject_Vectorcall PyObject_Vectorcall
#define _PyObject_VectorcallMethod PyObject_VectorcallMethod
#define _PyObject_FastCallDict PyObject_VectorcallDict
#define _PyVectorcall_Function PyVectorcall_Function
#define _PyObject_CallOneArg PyObject_CallOneArg
#define _PyObject_CallMethodNoArgs PyObject_CallMethodNoArgs
#define _PyObject_CallMethodOneArg PyObject_CallMethodOneArg

By the way, IMO it's time to deprecate these aliases :-)

Then nice advantage of a static inline function is that it can be decorated with Py_DEPRECATED(3.11).

@encukou
Copy link
Member Author

encukou commented Apr 20, 2022

Haha, _PyInterpreterState_GetEvalFrameFunc and _PyInterpreterState_SetEvalFrameFunc, are not mentioned in PEP 523 at all, but their docs say See PEP 523...

I guess they fall under Functions added in PEP-523. Off with their underscores!

@vstinner
Copy link
Member

Haha, _PyInterpreterState_GetEvalFrameFunc and _PyInterpreterState_SetEvalFrameFunc, are not mentioned in PEP 523 at all, but their docs say See PEP 523...

These functions should be made public in the semi-stable API. The PyInterperterState structure was made opaque in Python 3.8 (released in 2019), PEP 523 was approved in 2016. I added these functions to keep PEP 523 usable. Otherwise, you had to include the internal C API to use the PEP, hum...

@encukou
Copy link
Member Author

encukou commented Apr 21, 2022

@vstinner I see the function is documented to take PyFrameObject: https://docs.python.org/3.10/c-api/init.html?highlight=pyframeevalfunction#c._PyFrameEvalFunction
But it actually takes a _PyInterpreterFrame. Should that be moved to semi-stable API too?

@encukou
Copy link
Member Author

encukou commented Apr 21, 2022

I found the commit that changed it and asked there: #88756 (comment)

@vstinner
Copy link
Member

If you ask me, I suggest to move these functions and type to the semi-stable API:

  • _PyFrameEvalFunction type
  • _PyInterpreterState_GetEvalFrameFunc()
  • _PyInterpreterState_SetEvalFrameFunc()
  • PyCode_New()
  • PyCode_NewWithPosOnlyArgs()

I would prefer to not expose _PyEval_EvalFrameDefault() directly. It should be the default value of _PyInterpreterState_GetEvalFrameFunc(). But if someone considers that it should be exposed, I'm also fine with that.

@vstinner
Copy link
Member

@ncoghlan
Copy link
Contributor

I don't feel strongly about the AST returning functions, since folks are more likely to use the Python API to run AST compilation rather than defining Py_BUILD_CORE. Declaring them semi-stable doesn't seem harmful either, though.

And yes, both the frame object struct and the interpreter frame struct will likely need to be part of the semi stable API if alternative eval loops are going to be able to match (or exceed) the performance of the default eval loop.

@encukou
Copy link
Member Author

encukou commented Apr 21, 2022

I'll split this into more PRs, first one focusing on the new tier itself and another for the rest of the items in.

@encukou
Copy link
Member Author

encukou commented Apr 21, 2022

PR: #91789

@vstinner
Copy link
Member

I don't feel strongly about the AST returning functions, since folks are more likely to use the Python API to run AST compilation rather than defining Py_BUILD_CORE. Declaring them semi-stable doesn't seem harmful either, though.

You didn't give a concrete example of what you want to expose.

And yes, both the frame object struct and the interpreter frame struct will likely need to be part of the semi stable API if alternative eval loops are going to be able to match (or exceed) the performance of the default eval loop.

So far, I'm not aware of any alternative eval loop plug into Python with PEP 523. Python bytecode evolves very quickly. I'm only aware of this API being used to plug debuggers.

I would prefer to keep PyInterpreterState in the internal C API. This structure is now giant and requires very complex structures. For example, atomic variables requires pycore_atomic.h which causes issues on compilers.

PyFrameObject is becoming more and more complex and also evolved quickly. I'm not sure that it's a good idea to expose it.

I would prefer to not expose anything :-) I mean, only add a function to the semi-stable if there is really no other way.

For PyInterpreterState and PyFrameObject, there is a way: add getter and setter functions, there is not need to expose the full structures.

@encukou
Copy link
Member Author

encukou commented Apr 22, 2022

So, What's New was updated to say:

:c:func:_PyFrameEvalFunction now takes _PyInterpreterFrame* as its second parameter, instead of PyFrameObject*. See :pep:523 for more details of how to use this function pointer type.

But that's pretty confusing, PEP 523 does not mention _PyInterpreterFrame!

If I understand correctly, _PyInterpreterFrame needs to stay private, and _PyFrameEvalFunction needs to take it so _PyEval_EvalFrameDefault can stay fast.
But if users can access it, it needs to be exposed (as an opaque structure, of course).

Would it be appropriate to also put _PyFrame_GetFrameObject in the semi-stable API, and say use PyFrame_GetFrameObject to convert _PyInterpreterFrame to PyFrameObject?

@vstinner
Copy link
Member

You should ask @markshannon and users of these API.

@encukou encukou changed the title Add a semi-stable C-API tier PEP 689 -- Add a semi-stable C-API tier Apr 27, 2022
@seberg
Copy link
Contributor

seberg commented Apr 27, 2022

Since NumPy uses it, does it make sense to move _Py_HashDouble into the semi stable API? We also seem to use _PyDict_GetItemStringWithError, although that is easy to just ship on our own (we do for older versions anyway). This is a quick search, I may be missing things. Or maybe just put them into the public API directly?

@encukou
Copy link
Member Author

encukou commented Apr 28, 2022

_PyDict_GetItemStringWithError is used in only a few places in CPython, frequently for static strings. IMO it would be better for CPython to reevaluate using it.
_Py_HashDouble looks OK for semi-stable API, but I wouldn't make it fully public -- its handling of NaN, which is the reason it takes an extra PyObject*, is a detail we might want to change.

@encukou encukou changed the title PEP 689 -- Add a semi-stable C-API tier PEP 689 -- Add an unstable C-API tier May 3, 2022
@jpe
Copy link

jpe commented Jul 12, 2022

I'm porting a debugger that uses a _PyFrameEvalFunction and ran into the problem of _PyInterpreterFrame being opaque (unless I use the internal headers). I think I can work around this, but for some purposes (including, I think, the motivating case for adding eval_frame of a jit compiler) access to the frame to be evaluated is needed. A variant of _PyFrame_GetFrameObject could be used or it may be possible to maintain the old signature and have a frame passed -- for performance, the core would need to bypass the eval_frame if it is set to the default

@gvanrossum
Copy link
Member

@jpe, What information do you need to get out of the frame? It may make more sense to offer APIs (stable or not) that give you the specific things you are looking for.

@jpe
Copy link

jpe commented Jul 12, 2022

As of now, I need the global and locals. But for something like a jit compiler (I think eval_frame was added for pyjion), you'd need more. As I mentioned I can work around this, but I think that eval_frame currently isn't very useful and either should be deprecated / removed or made to work reasonably.

@gvanrossum
Copy link
Member

If your issue is that PEP 623 isn't the right API, I think you're preaching to the choir, but I'm not sure how much PEP 689 can help -- it isn't the place to design a new API.

@jpe
Copy link

jpe commented Jul 12, 2022

My issue is I don't think the eval_frame code in 3.11 is very useful as it stands and think it should be either deprecated / removed or made as useful as it was in 3.10. I suspect if left as it is, it will either not be used or that internal api's will be need to be used.

@gvanrossum
Copy link
Member

Noted. Unfortunately our expert on that topic (@markshannon) is currently on vacation. I think it's too late for code changes in 3.11, so you are likely going to be using internal APIs (which might become "unstable but supported APIs" in 3.12, if PEP 689 is accepted).

@encukou
Copy link
Member Author

encukou commented Aug 10, 2022

I'm deferring the PEP for now.

@encukou encukou closed this as completed Aug 10, 2022
@vstinner
Copy link
Member

Oh, too bad, I liked this PEP. But yeah, it's a lot of work :-(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants