Function must not return a pointer to content without an explicit resource management #57

Closed
vstinner opened this issue Jul 10, 2023 · 5 comments

Comments

@vstinner
Contributor

The Python C API has many functions that give direct access to an object's contents through a pointer:

  • PyBytes_AsString(), PyBytes_AS_STRING()
  • PyByteArray_AsString(), PyByteArray_AS_STRING()
  • PyEval_GetFuncName()
  • PyUnicode_AsUTF8(), PyUnicode_AsUTF8AndSize()
  • PyCapsule_GetName()

Problem: the returned pointer becomes a dangling pointer once the object is deleted, and nothing tells the caller explicitly how long the pointer remains valid.

For PyUnicode_AsUTF8(), the caller can guess that the pointer remains valid until the Python str is deleted (until the last strong reference is deleted).

Problem: these APIs make assumptions about how Python objects are implemented. What if tomorrow we want to remove the cached UTF-8 string from the Python str object?

See the issue "C API: Add PyObject_AsObjectArray() function: get tuple/list items as PyObject** array" (python/cpython#106592) for a different angle on this problem.

@vstinner
Contributor Author

I proposed a generic PyResource API: python/cpython#106592

It makes resource management explicit and makes the C API more generic: callers don't rely on the exact current object layout, so the layout can change.

@vstinner
Contributor Author

See also issue #5, "Unclear lifetimes of borrowed references".

@scoder

scoder commented Oct 18, 2023

I see this mostly as a documentation/naming issue. IMHO, the fact that the pointer lifetime is handled by CPython because it is directly tied to a live Python object makes this very easy to deal with. It just needs to be clear in which functions/cases this happens. Keeping a live reference to a Python object is something that people do all the time, so this is a trivial concept, much simpler than a new "handle this non-object resource" API. And it can always trivially be implemented with an additional pointer in the object, even if the internal layout changes.

@iritkatriel iritkatriel removed the v label Oct 23, 2023
@encukou
Contributor

encukou commented Oct 24, 2023

Proposed guideline issues:

(These do focus on explicitly documenting the lifetime of a return value. IMO, doing that will reveal the pathological cases.)

@vstinner
Contributor Author

I abandoned my PEP draft, "Add PyResource callback C API to close resources", for these reasons:

  • The lifetime of the PyBytes_AsString() result is well defined: the pointer is valid until the bytes object is destroyed.
  • Other developers prefer a specific API rather than a generic "PyResource" API to release a resource.
  • There is also the PyBuffer API which can be reused.

I'm closing the issue.

What can still be done is to document (even) better how long a pointer remains valid.
