Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111262: Add PyDict_Pop() function [with default value] #111939

Closed
wants to merge 6 commits into from

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Nov 10, 2023

Change the API of the internal _PyDict_Pop_KnownHash() function to return an int.


📚 Documentation preview 📚: https://cpython-previews--111939.org.readthedocs.build/

@vstinner
Copy link
Member Author

This PR is based on PR #111263. I credited @scoder.

@vstinner
Copy link
Member Author

cc @serhiy-storchaka @encukou

Doc/c-api/dict.rst Outdated Show resolved Hide resolved
Doc/c-api/dict.rst Outdated Show resolved Hide resolved
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please show how it will be used in existing code instead of the old API? It will help to see what design is better.

For now, both uses of _PyDict_Pop_KnownHash() ignore the returning value. It means either that the design is not optimal or that the caller does not fully utilize the API.

@vstinner
Copy link
Member Author

vstinner commented Nov 12, 2023

@serhiy-storchaka:

Please show how it will be used in existing code instead of the old API? It will help to see what design is better.

Are you suggesting to return PyObject* and not provide the information if the key existed in the key, or if the default value was used?

You asked for this information:

It is difficult to distinguish two cases: return value removed from the dict and returning the increfed default value. If the dict can contain arbitrary values, you need to create a special sentinel value.

Do you want to suggest a different API?

@serhiy-storchaka:

For now, both uses of _PyDict_Pop_KnownHash() ignore the returning value. It means either that the design is not optimal or that the caller does not fully utilize the API.

Well, I wasn't sure if this PR should change the existing code "too much". I just pushed a change to use _PyDict_Pop_KnownHash() return value. As PyDict_GetItemRef(..., &value): you can ignore the return value and always use value to decide how to handle the 3 cases (error, not found, found).

@vstinner
Copy link
Member Author

I replaced usage of private _PyDict_Pop() with new public PyDict_Pop() to show how its return value can be useful to check for errors.

vstinner and others added 5 commits November 13, 2023 00:31
Change the API of the internal _PyDict_Pop_KnownHash() function
to return an int.

Co-Authored-By: Stefan Behnel <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
@vstinner vstinner changed the title gh-106320: Add PyDict_Pop() function gh-111262: Add PyDict_Pop() function Nov 12, 2023
@vstinner
Copy link
Member Author

Oh, this PR is related to issue gh-111262. Before I mentioned the wrong issue number.

@encukou
Copy link
Member

encukou commented Nov 13, 2023

This looks OK to me API-wise, but as always, preferably wait for the C-API workgroup to review it. Please don't add it to the limited API before such a review.

Do we need PyDict_PopString as well?

@scoder
Copy link
Contributor

scoder commented Nov 13, 2023

My gut feeling tells me that the most important use case for this is a fire-and-forget "delete an item" kind of operation, where I do not care whether the key is found or not. This is at least a little less efficient than it should be, because users still need to pass an explicit default value (probably Py_None) and then PyDict_Pop() needs to incref it just to let the user decref it afterwards.

Why do we need to raise an exception at all? Why not return the default value regardless, i.e. return NULL if a user did not provide a default?

Options I see:

  • return 1 with a new reference as value if the key was found, 0 if the key is not found and the xincref-ed default is returned (and users probably know whether they passed NULL or not), and -1 if an exception was raised with the value set to NULL.
  • do not let the users provide a default at all, return 1 with a new reference if the value was found, 0 with NULL value if not found, and -1 with NULL value for an exception.

I actually like the second option best, it seems quite simple. If users really want an exception to be raised, raising a KeyError with the key in their hands is entirely trivial.

@scoder
Copy link
Contributor

scoder commented Nov 13, 2023

That written, returning three different codes is somewhat annoying since it means that users need to put the return code in a variable first, rather than just writing if (PyDict_Pop() == -1) goto error;.

However, at least with my second option, the returned dict value is actually a proper point of distinction as well, so users could safely use this simple "error on -1" pattern and then check whether the value is NULL or not, or simply xdecref it if they don't care. Allowing both usages (check all three return codes vs. check error code and value) seems quite a user friendly API.

@pitrou
Copy link
Member

pitrou commented Nov 13, 2023

  • do not let the users provide a default at all, return 1 with a new reference if the value was found, 0 with NULL value if not found, and -1 with NULL value for an exception.

+1 from me as well. Using a default value with dict.pop is, in my experience, only useful to avoid a KeyError.

@serhiy-storchaka
Copy link
Member

Thank you Victor for showing how it can be used in the code (at least in our code). We now can see advantages and drawbacks of different designs.

I was going to write the same as @scoder. He just reads my mind. Most of uses cases here are trivial "delete without raising KeyError". The default value does not matter in this case. KeyError can be raised in only two cases where the default value can be NULL: in the implementations of the pop() method in dict and OrderedDict.

I propose to implement (perhaps in a separate PR) the variant of PyDict_Pop() without the default_value parameter. It should set value to NULL and return 0 without raising KeyError if no item found. Then compare two variants. Is there a significant difference in complexity or the number of lines of code?

@vstinner
Copy link
Member Author

Ok, and now something completely different: PR gh-112028

  • Remove default_value parameter
  • result argument can be NULL: it's common that the removed value is not used in the caller, see the PR
  • Don't raise KeyError if the key is not present

@vstinner
Copy link
Member Author

@encukou:

Do we need PyDict_PopString as well?

If needed, it can be added later, but in Python code base, it doesn't seem to be needed. So far, nobody asked for such API.

@encukou:

but as always, preferably wait for the C-API workgroup to review it. Please don't add it to the limited API before such a review.

The Steering Council has been asked to take a decision on PEP 731 -- C API Working Group Charter 3 weeks ago, and so far, no decision was taken. I don't think that the lack of such working group should hold limited C API change. Can't we take decisions as a community until such group exists?

@vstinner vstinner changed the title gh-111262: Add PyDict_Pop() function gh-111262: Add PyDict_Pop() function [with default value] Nov 13, 2023
@serhiy-storchaka
Copy link
Member

Do we need PyDict_PopString as well?

It can be used at least in two places in the CPython codebase. Even if it was not used, it is worth to add it, because all similar functions have a *String pair. String keys are pretty common, and you can expect that C strings are used more in third-party code if they prefer simplicity over performance.

@vstinner
Copy link
Member Author

@serhiy-storchaka:

It can be used at least in two places in the CPython codebase.

Would you mind to point me where it could be used?

@encukou
Copy link
Member

encukou commented Nov 13, 2023

Can't we take decisions as a community until such group exists?

I don't think we should. This PR and others like it are setting precedents for new future C-API, and we do not agree on the details of the direction we want to take. We should be extra careful that the PRs adhere the guidelines we want to set, and we'll want to improve the guidelines based on the individual PRs.

cc @gvanrossum @zooba @iritkatriel

@pitrou
Copy link
Member

pitrou commented Nov 13, 2023

This PR and others like it are setting precedents for new future C-API

I'm not sure I understand this argument, since the API proposed in this PR just follows established and uncontroversial conventions.

Taking a quick look at the guidelines being discussed, the only one that seems to apply is
Define guideline for parameter types: PyObject vs specific type
, but the consensus there was to continue using PyObject*, which this PR does.

@encukou
Copy link
Member

encukou commented Nov 13, 2023

I filed capi-workgroup/api-evolution#40, see you there for generic discussion about one aspect of proposed new API.

@zooba
Copy link
Member

zooba commented Nov 13, 2023

Can't we take decisions as a community until such group exists?

Theoretically yes, but since we know that the group is coming, it's a bit rude to try and race ahead of it just in case it disagrees with your design.

As I proposed on the other PR, any new API going in now shouldn't be treated as precedent for guidelines going forward. It may end up as a wart, but that's the risk you take adding new things in this period. Help get the WG established quickly and things will start to be unblocked (and definitely don't add anything to the stable ABI in the meantime).

@vstinner
Copy link
Member Author

(and definitely don't add anything to the stable ABI in the meantime).

Ok.

By the way, 26 functions were added to Python 3.13 stable ABI. Should they be removed until the C API Working Group is created and decides what to do with them?

Theoretically yes, but since we know that the group is coming, it's a bit rude to try and race ahead of it just in case it disagrees with your design.

This PR is related to the removal of private functions in Python 3.13. Apparently, users are now eagger to test Python 3.13 as soon as alpha1, reported issues, and so I created #111481 and #112026 to try to mitigate issues.

Are you suggesting to freeze the C API for now?

@zooba
Copy link
Member

zooba commented Nov 13, 2023

By the way, 26 functions were added to Python 3.13 stable ABI.

These were all existing functions though, right? I'm not aware of anything brand new being added - we've all been saying to hold off on brand new APIs being marked long-term stable for months now.

You still get to use your judgement. If you judge that there's only one reasonable way to implement an API, then go ahead and take the risk. Just be aware that we are actively trying to plan how to plan the C API into the future, and so unilateral design decisions risk making a mockery of that plan, and by extension, the entire project. You're aware of the problems and proposed solutions that have been raised (see the issue repositories at https://github.com/capi-workgroup/ if not), and so anything falling in those areas is likely a good candidate for bringing to a discussion before committing us to support the design for the next 10+ years.

@vstinner
Copy link
Member Author

I created Set guidelines to add APIs to the limited C API and to the stable ABI. I suggest to continue the discussion there. I already modified this PyDict_Pop() PR to not add it to the limited API.

@vstinner
Copy link
Member Author

I close this PR: #112028 was merged with a different API.

@vstinner vstinner closed this Nov 14, 2023
@vstinner vstinner deleted the dict_pop branch November 14, 2023 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants