Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type_caster<PyObject> #4601

Merged
merged 21 commits into from
May 7, 2023
Merged

Conversation

rwgk
Copy link
Collaborator

@rwgk rwgk commented Apr 1, 2023

Description

This is to support automatic wrapping of APIs that make use of PyObject *. It is impractical to rewrite all such APIs to use py::handle or py::object (although that would of course be ideal).

Suggested changelog entry:

pybind11/type_caster_pyobject_ptr.h was added to support automatic wrapping of APIs that make use of `PyObject *`. This header needs to included explicitly (i.e. it is not included implicitly with pybind/pybind11.h).

Ralf W. Grosse-Kunstleve added 9 commits April 1, 2023 00:23
```
test_type_caster_pyobject_ptr.cpp:8:23: error: variable templates only available with ‘-std=c++14’ or ‘-std=gnu++14’ [-Werror]
    8 | static constexpr bool is_same_ignoring_cvref = std::is_same<detail::remove_cvref_t<T>, U>::value;
      |                       ^~~~~~~~~~~~~~~~~~~~~~
```
… (because of doubts about it actually being useful).
rwgk pushed a commit to rwgk/pybind11clif that referenced this pull request Apr 1, 2023
Ralf W. Grosse-Kunstleve added 4 commits April 1, 2023 12:55
…d add a related comment in cast.h.

No production code changes.

Make tests more sensitive by using `ValueHolder` instead of empty tuples and dicts.

Manual leak checks with `while True:` & top command repeated for all tests.
rwgk pushed a commit to google/pybind11clif that referenced this pull request Apr 5, 2023
* Snapshot of pybind/pybind11#4601 (squashed).

* Add new header file to CMakeLists.txt and tests/extra_python_package/test_files.py

* to-Pyton-cast: `return_value_policy::_clif_automatic` => take ownership

* `handle.inc_ref()` in `T cast(const handle &handle)` specialization for `PyObject *`

* Guard against accidental leaking by requiring `return_value_policy::reference` or `automatic_reference`

* Fix oversight in test (to resolve a valgrind leak detection error) and add a related comment in cast.h.

No production code changes.

Make tests more sensitive by using `ValueHolder` instead of empty tuples and dicts.

Manual leak checks with `while True:` & top command repeated for all tests.

* Add tests for interop with stl.h `list_caster`

(No production code changes.)

* Bug fix in test. Minor comment enhancements.
@rwgk rwgk marked this pull request as ready for review April 5, 2023 18:19
@rwgk rwgk requested a review from henryiii as a code owner April 5, 2023 18:19
@rwgk
Copy link
Collaborator Author

rwgk commented Apr 5, 2023

The equivalent of this PR was merged under google/pybind11clif#30021

@@ -1055,6 +1059,16 @@ T cast(const handle &handle) {
return T(reinterpret_borrow<object>(handle));
}

// Note that `cast<PyObject *>(obj)` increments the reference count of `obj`.
// This is necessary for the case that `obj` is a temporary.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually necessary? Seems a bit unintuitive here. When are situations when obj would be a temporary?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a callback (functional.h):

That is using object::cast<PyObject *>() which winds up in py::cast<PyObject *>(handle):

This actually took me a while to discover and debug.

I don't see an easy way to make the code in functional.h behave differently.

But I'm also thinking:

cast<PyObject *>(something_that_generates_a_temporary_object())

with the temporary passed as handle is a really bad trap in general (especially if the cast is actually written as cast<T> buried in the guts of some library).

By adopting the inc_ref() convention we don't have to do anything special for callbacks and we avoid the trap.

Users have to learn once that the refcount is incremented, it always applies.

This is only for users that find it necessary to actually drop down to the level of dealing with raw PyObject *s. In that context, having to be super mindful about refcounts is a given, and careful leak checking of the tests is a standard practice for me at least (while True: & top is the most reliable btw).

Another way to look at it:

  • Risk of a accidental leaks (with the inc_ref()).
    vs
  • Risk of UB (segfault if you're lucky).

What tipped the scale in my mind was that the inc_ref() convention means nothing else needs to change, which is usually a good sign for changes like this (fitting something slightly different into an existing system).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary incase obj is a temporary

Which is why I am advocating for the additional overload below. For C++ calls, we can determine if the object is a temporary (ie. it's an rvalue) and avoid having to increment it under those circumstances.

@rwgk
Copy link
Collaborator Author

rwgk commented Apr 6, 2023

I want to mention: in the meantime I discovered that this type_caster unlocks supporting numpy arrays with dtype='O'. It's incomplete work, although internally globally tested:

What you see there is all it took to add the 'O' support.

detail::enable_if_t<detail::is_same_ignoring_cvref<T, PyObject *>::value, int> = 0>
T cast(const handle &handle) {
return handle.inc_ref().ptr();
}
Copy link
Collaborator

@Skylion007 Skylion007 Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the any case where you want to steal the ref? Ie. shouldn't you also add this? I do not think it will be used currently by your code but if that cast path is used in the future, better to have it.

Suggested change
}
}
template <typename T,
detail::enable_if_t<detail::is_same_ignoring_cvref<T, PyObject *>::value, int> = 0>
T cast(object &&obj) {
return obj.release().ptr();
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the any case where you want to steal the ref?

Not sure. I think the best way to answer this: can we come up with a meaningful test case?

My gut feeling is that stealing a Python reference like this is much more likely to backfire than be helpful. When dropping down to the bare-metal level of PyObject *, it's basically a given that the author of the code with the raw pointers needs to think 3+ times about all nuances of lifetime management. Keeping the mechanics here as simple as possible makes that easier.

Copy link
Collaborator

@Skylion007 Skylion007 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the current test cases cover it? Semantically, it's equivalent to what is written above, the difference is that without this the py::object gets cast up to a handle, inc_ref'd by the caster, then dec_ref'ed as the py::object is destructed. All we are doing here is removing a useless increment / decrement.

Easy example/test case:

auto *pyobject_ptr = py::cast<PyObject*>(std::move(obj));

If we include this explicit cast for callbacks, some end will be silly enough to call it so we should make the semantics as useful/efficient as possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, a slight modification to the current one of the current test cases would definitely cover it (see below).

template <>
class type_caster<PyObject> {
public:
static constexpr auto name = const_name("PyObject *");
Copy link
Collaborator

@Skylion007 Skylion007 Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this const name break mypy/ type parsing? * has special meaning in docstrings and Python itself cannot return pointers (they will just appear as Python objects). Also returning a null pyobject ptr will cause a SystemError in Python.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As unsatisfying as it, I think the name should be an object caster to avoid breaking anything.

Copy link
Collaborator

@Skylion007 Skylion007 Apr 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will likely break any position args etc that come after it, right? If you really want to differentiate it (and create invalid typing) just have it be PyObject

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As unsatisfying as it, I think the name should be an object caster to avoid breaking anything.

Wow ... that's a tough one to decide.

From a high-level perspective, using object here definitely looks much better.

But it hides an important clue: beware, this is dealing with raw pointers.

But then again, to whom is that clue useful? The author of the code with the raw pointers already needs to be super careful when writing the code, i.e. doesn't need any extra clues. For everybody else is probably a not very enlightening surprise.

I'm leaning towards thinking using object as the name will be better overall.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing 5ccb893 now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually one typing detail. Does this caster accept Py_None as meaning nullptr? That would be the one caveat and if did, it may better to list the type as typing.Optional

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean the load() step?

That's essentially just doing an inc_ref and nothing else:

    bool load(handle src, bool) {
        value = reinterpret_borrow<object>(src);
        return true;
    }


static handle cast(PyObject *src, return_value_policy policy, handle /*parent*/) {
if (src == nullptr) {
throw error_already_set();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an error guaranteed here? Python may raise a system exception, but I think that might be triggered before even entering the caster. Worried about the case where a nullptr is fed in somehow but no error is set.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is an error guaranteed here?

Good question: No, but error_already_set will handle this gracefully:

if (!m_type) {
pybind11_fail("Internal error: " + std::string(called)
+ " called while "
"Python error indicator not set.");
}

(This was one of the things I made as clean/clear as possible while working on #1895.)

});
m.def("cast_to_pyobject_ptr", [](py::handle obj) {
auto rc1 = obj.ref_count();
auto *ptr = py::cast<PyObject *>(obj);
Copy link
Collaborator

@Skylion007 Skylion007 Apr 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can test my custom caster by just making the py::handle a py::object and calling std::move(obj) for the cast. (if you avoid the ref count access below).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I just pushed a commit that expands the source code comment, before seeing your latest comment here.

I need to look/think more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I think it, the more I think it's a good feature / caster to have and I do not see any reason not to include the object&& overload we already do it for other object subclasses anyhow further down in the file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find it easy to unlock that optimization: 2e5a699

Do you have ideas for achieving the same outcome with less code?

The added code complexity is a bit at odds with the really minor runtime optimization, in any real-world application.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rwgk I am just bringing the complexity / runtime optimization inline with the other caster methods. If adding a single overload can signficantly improve performance and remove unnecessary refcount calls, I see no reason why not to include it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks for the review!

Ralf W. Grosse-Kunstleve added 3 commits April 27, 2023 07:43
The original suggestion leads to `error: call to 'cast' is ambiguous` (full error message below), therefore SFINAE guarding is needed.

```
clang++ -o pybind11/tests/test_type_caster_pyobject_ptr.os -c -std=c++17 -fPIC -fvisibility=hidden -O0 -g -Wall -Wextra -Wconversion -Wcast-qual -Wdeprecated -Wundef -Wnon-virtual-dtor -Wunused-result -Werror -isystem /usr/include/python3.10 -isystem /usr/include/eigen3 -DPYBIND11_STRICT_ASSERTS_CLASS_HOLDER_VS_TYPE_CASTER_MIX -DPYBIND11_ENABLE_TYPE_CASTER_ODR_GUARD_IF_AVAILABLE -DPYBIND11_TEST_BOOST -Ipybind11/include -I/usr/local/google/home/rwgk/forked/pybind11/include -I/usr/local/google/home/rwgk/clone/pybind11/include /usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp
In file included from /usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp:1:
In file included from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/functional.h:12:
In file included from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:13:
In file included from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/detail/class.h:12:
In file included from /usr/local/google/home/rwgk/forked/pybind11/include/pybind11/attr.h:14:
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/cast.h:1165:12: error: call to 'cast' is ambiguous
    return pybind11::cast<T>(std::move(*this));
           ^~~~~~~~~~~~~~~~~
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/functional.h:109:70: note: in instantiation of function template specialization 'pybind11::object::cast<_object *>' requested here
                return hfunc.f(std::forward<Args>(args)...).template cast<Return>();
                                                                     ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/functional.h:103:16: note: in instantiation of member function 'pybind11::detail::type_caster<std::function<_object *(int)>>::load(pybind11::handle, bool)::func_wrapper::operator()' requested here
        struct func_wrapper {
               ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/cast.h:1456:47: note: in instantiation of member function 'pybind11::detail::type_caster<std::function<_object *(int)>>::load' requested here
        if ((... || !std::get<Is>(argcasters).load(call.args[Is], call.args_convert[Is]))) {
                                              ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/cast.h:1434:50: note: in instantiation of function template specialization 'pybind11::detail::argument_loader<const std::function<_object *(int)> &, int>::load_impl_sequence<0UL, 1UL>' requested here
    bool load_args(function_call &call) { return load_impl_sequence(call, indices{}); }
                                                 ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:227:33: note: in instantiation of member function 'pybind11::detail::argument_loader<const std::function<_object *(int)> &, int>::load_args' requested here
            if (!args_converter.load_args(call)) {
                                ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:101:9: note: in instantiation of function template specialization 'pybind11::cpp_function::initialize<(lambda at /usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp:50:9), _object *, const std::function<_object *(int)> &, int, pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy>' requested here
        initialize(
        ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/pybind11.h:1163:22: note: in instantiation of function template specialization 'pybind11::cpp_function::cpp_function<(lambda at /usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp:50:9), pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy, void>' requested here
        cpp_function func(std::forward<Func>(f),
                     ^
/usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp:48:7: note: in instantiation of function template specialization 'pybind11::module_::def<(lambda at /usr/local/google/home/rwgk/forked/pybind11/tests/test_type_caster_pyobject_ptr.cpp:50:9), pybind11::return_value_policy>' requested here
    m.def(
      ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/cast.h:1077:3: note: candidate function [with T = _object *, $1 = 0]
T cast(object &&obj) {
  ^
/usr/local/google/home/rwgk/forked/pybind11/include/pybind11/cast.h:1149:1: note: candidate function [with T = _object *]
cast(object &&object) {
^
1 error generated.
```
@rwgk rwgk merged commit 90312a6 into pybind:master May 7, 2023
@rwgk rwgk deleted the type_caster_PyObject_master branch May 7, 2023 17:15
@github-actions github-actions bot added the needs changelog Possibly needs a changelog entry label May 7, 2023
rwgk pushed a commit to google/pybind11clif that referenced this pull request May 7, 2023
git merge smart_holder (after pybind/pybind11#4601 was merged)
copybara-service bot pushed a commit to pybind/pybind11_abseil that referenced this pull request May 18, 2023
…ters.h

While working on pybind/pybind11#4674 I came to realize that this include is not needed here.

Note however that the pybind11/cast.h changes under pybind/pybind11#4601 are still needed, therefore pybind11_abseil still requires current pybind11 master.

PiperOrigin-RevId: 533182724
@rwgk rwgk removed the needs changelog Possibly needs a changelog entry label May 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants