Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C API: Add PyDict_GetItemRef() function #106004

Closed
vstinner opened this issue Jun 23, 2023 · 7 comments
Closed

C API: Add PyDict_GetItemRef() function #106004

vstinner opened this issue Jun 23, 2023 · 7 comments
Labels
topic-C-API type-bug An unexpected behavior, bug, or error

Comments

@vstinner
Copy link
Member

vstinner commented Jun 23, 2023

The PyDict C API has a bad history. PyDict_GetItem() ignores all exception: error on hash(), error on "key == key2", KeyboardInterrupt, etc. PyDict_GetItemWithError() was added to fix this design. Moreover, Python 3.9 and older allowed to call PyDict_GetItem() with the GIL released.

PyDict_GetItem() returns a borrowed reference which is usually safe since the dictionary still contains a strong reference to the request value. But in general, borrowed references are error prone and can likely lead to complex race conditions causing crashes:

While PyDict_GetItem() calls can be quite easily replaced with PyObject_GetItem() which has a better API (return a new stong reference), developers usually prefer to still use the specialized PyDict API for best performance: avoid the minor overhead of type dispatching, Py_TYPE(obj)->tp_as_mapping->mp_subscript.

I propose adding PyDict_GetItemRef() and PyDict_GetItemStringRef() functions to the limited C API (version 3.13): replacements for PyDict_GetItem(), PyDict_GetItemWithError() and PyDict_GetItemString().

API:

int PyDict_GetItemRef(PyObject *mp, PyObject *key, PyObject **pvalue);
int PyDict_GetItemStringRef(PyObject *mp, const char *key, PyObject **pvalue);

PyDict_GetItemWithError() has another API issue: when it returns NULL, it can mean two things. It returns NULL if the key is missing, but it also returns NULL on error. The caller has to check PyErr_Occurred() to distinguish the two cases (to write correct code). See capi-workgroup/problems#1 Proposed API avoids this by returning an int: return -1 on error, or return 0 otherwise (present or missing key). Checking PyErr_Occurred() is no longer needed.

By the way, the public C API has no PyDict_GetItemStringWithError() function: using PyDict_GetItemWithError() with a char* key is not convenient. The _PyDict_GetItemStringWithError() function exists but it's a private C API.

Linked PRs

@vstinner vstinner added the type-bug An unexpected behavior, bug, or error label Jun 23, 2023
@vstinner
Copy link
Member Author

Example of PyDict_GetItemRef() usage of the function, replacing PyDict_GetItemWithError():

@@ -5146,13 +5191,12 @@ dictitems_contains(_PyDictViewObject *dv, PyObject *obj)
         return 0;
     key = PyTuple_GET_ITEM(obj, 0);
     value = PyTuple_GET_ITEM(obj, 1);
-    found = PyDict_GetItemWithError((PyObject *)dv->dv_dict, key);
+    if (PyDict_GetItemRef((PyObject *)dv->dv_dict, key, &found) < 0) {
+        return -1;
+    }
     if (found == NULL) {
-        if (PyErr_Occurred())
-            return -1;
         return 0;
     }
-    Py_INCREF(found);
     result = PyObject_RichCompareBool(found, value, Py_EQ);
     Py_DECREF(found);
     return result;

PyErr_Occurred() and Py_INCREF() calls can be removed.

vstinner added a commit to vstinner/cpython that referenced this issue Jun 23, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
* Add unit tests on the PyDict C API in test_capi.
@vstinner
Copy link
Member Author

See also my previous attempt in 2020: issue #86460 "Add new C functions with more regular reference counting like PyTuple_GetItemRef()".

vstinner added a commit to vstinner/cpython that referenced this issue Jun 23, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
* Add unit tests on the PyDict C API in test_capi.
vstinner added a commit to vstinner/cpython that referenced this issue Jun 23, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
* Add unit tests on the PyDict C API in test_capi.
@gpshead
Copy link
Member

gpshead commented Jun 29, 2023

I'm in favor of this because I don't think we should have public APIs that (a) require a value check + PyErr_Occurred() call pattern - a frequent source of lurking bugs - or (b) return borrowed references. Yes I know we already have them, that's missing the point. The point is that with these in place, we can promote their use over the others because these are better in all respects.

One possible long term future would have us deprecate the messy borrowing APIs. Having these already in place sooner rather than later smooths that potential transition.

There is no foreseeable future in which we'd go the other way towards all-borrowing-only APIs.

@vstinner
Copy link
Member Author

There is no foreseeable future in which we'd go the other way towards all-borrowing-only APIs.

FYI I'm tracking the list of C API functions returning borrowed references at: https://pythoncapi.readthedocs.io/bad_api.html#functions

I'm trying to provide a replacement for each function, and if possible an efficient replacement. For example, PyDict_GetItem() can be replaced with PyObject_GetItem(), but it's less efficient (that's what I explained in my first message).

vstinner added a commit to vstinner/cpython that referenced this issue Jun 29, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
* Add unit tests on the PyDict C API in test_capi.
@serhiy-storchaka
Copy link
Member

The only problem is that functions with so similar names have completely different interface. It is pretty confusing. Would not be better to name it PyDict_LookupItem or like? It may be worth to add also PyMapping_LookupItem for convenience. BTW, I think it is time to make _PyObject_LookupAttr public (if we agree about the name).

@vstinner
Copy link
Member Author

vstinner commented Jul 1, 2023

I replied on the PR, since the discussion is ongoing there.

vstinner added a commit to vstinner/cpython that referenced this issue Jul 12, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
* Add unit tests on the PyDict C API in test_capi.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 12, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
  Add these functions to the stable ABI version 3.13.
* Add unit tests on the PyDict C API in test_capi.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 19, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
  Add these functions to the stable ABI version 3.13.
* Add unit tests on the PyDict C API in test_capi.
vstinner added a commit that referenced this issue Jul 21, 2023
* Add PyDict_GetItemRef() and PyDict_GetItemStringRef() functions.
  Add these functions to the stable ABI version 3.13.
* Add unit tests on the PyDict C API in test_capi.
carljm added a commit to carljm/cpython that referenced this issue Jul 21, 2023
* main:
  pythongh-106004: Add PyDict_GetItemRef() function (python#106005)
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 25, 2023
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 25, 2023
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Jul 25, 2023
@serhiy-storchaka
Copy link
Member

Tests will be added in #107179.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-C-API type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

4 participants