Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Render typed iterators in docstrings (alternative) #2371

Draft
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

sizmailov
Copy link
Contributor

@sizmailov sizmailov commented Aug 7, 2020

This is an alternative to #2244 without backward incompatibility drawback.

Interestingly, binary size is not affected by this change

# g++ (Ubuntu 8.3.0-6ubuntu1) 8.3.0
$ du -b pybind11_tests.cpython-37m-x86_64-linux-gnu.so-*
3244544 pybind11_tests.cpython-37m-x86_64-linux-gnu.so-036add2
3244544 pybind11_tests.cpython-37m-x86_64-linux-gnu.so-227170d
$ md5sum pybind11_tests.cpython-37m-x86_64-linux-gnu.so-*
1578ddd4a55f3a826e82d528af51657f  pybind11_tests.cpython-37m-x86_64-linux-gnu.so-036add2
01aac0b3081c91427d052103719087d4  pybind11_tests.cpython-37m-x86_64-linux-gnu.so-227170d

Examples (compare.cpp):

master
__iter__(self: example.VectorInt) -> Iterator
__iter__(self: example.MapTupleToComplex) -> Iterator
items(self: example.MapTupleToComplex) -> Iterator
this PR
__iter__(self: example.VectorInt) -> Iterator[int]
__iter__(self: example.MapTupleToComplex) -> Iterator[Tuple[int, str]]
items(self: example.MapTupleToComplex) -> Iterator[Tuple[Tuple[int, str], complex]]
compare.cpp
#include <map>
#include <complex>

#include "pybind11/pybind11.h"
#include "pybind11/stl_bind.h"
#include "pybind11/embed.h" 

namespace py = pybind11;

using map_tuple_to_complex = std::map<std::tuple<int,std::string>, std::complex<double>>;
PYBIND11_MAKE_OPAQUE(std::vector<int>);
PYBIND11_MAKE_OPAQUE(map_tuple_to_complex);

PYBIND11_EMBEDDED_MODULE(example, m) {
  py::bind_vector<std::vector<int>>(m, "VectorInt");
  py::bind_map<map_tuple_to_complex>(m, "MapTupleToComplex");
}

int main() {
  py::scoped_interpreter guard{};
  py::exec(R"(
      from example import *
      print(VectorInt.__iter__.__doc__, end="")
      print(MapTupleToComplex.__iter__.__doc__, end="")
      print(MapTupleToComplex.items.__doc__, end="")
)");
}

This commit introduces minor breaking change: `make_iterator` and `make_key_iterator`
now return `iterator_state` instance instead of `py::iterator`.
It doesn't affect regular use of those functions (immediate return from __iter__ lambda),
but requires changes in user code with implicit assignments/conversions, e.g.:

py::iterator it = make_iterator(...); # requires explicit py::cast()
deprecate make_iterator,make_key_iterator
typename Sentinel,
typename ValueType = decltype(*std::declval<Iterator>()),
typename... Extra>
[[deprecated("Superseded by make_iterator_ng")]]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deprecation of long-standing function is a hostile action to users, since it would lead to warnings in user code, which would force everyone to use _ng version of functions.

This can be viewed as the cost of making the change fully backward-compatible.
Having two similar functions is not a good thing, so one should be marked as deprecated in favor of superior one.

I'm not 100% happy about this. Is there other viable option?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I were to consider this (using a suffix and deprecate the old) vs. #2244 (updating the function), my money would be still be on #2244 :(

In terms of incompatibility, I think that could be solved by using a (deprecated?) implicit conversion from py::detail::iterator_state<...> to py::iterator.

However, per @wjakob's points about complexity, I'm not sure if this PR avoids the complexity from before, due to the deprecation (which may lead to another name swap + deprecation from make_iterator_ng back to make_iterator?).

The best option (that I can think of) is to mirror the setup used in numpy.h for py::array and py::array_t:

Let iterator be the fully type-erased Python interface type, while iterator_t<ValueType> can add a (minimal) additional "C++ type-friendly" filter on top of iterator (e.g. the casting name, forced casting, etc.). If you do that, then perhaps you only need iterator_t<ValueType>, rather than carrying around type info in detail::iterator_state<Iterator, Sentinel, bool, Policy> and use all of the decltype(...) stuff? Then you can update the stl.h stuff to use iterator_t<ValueType> vs. just iterator or propogating types from there.
It may induce some additional non-functional indirection, but requires less type change?

Thoughts? @bstaletic or @wjakob if y'all are available?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if I were to consider this (using a suffix and deprecate the old) vs. #2244 (updating the function), my money would be still be on #2244 :(

In terms of incompatibility, I think that could be solved by using a (deprecated?) implicit conversion from py::detail::iterator_state<...> to py::iterator.

However, per @ wjakob's points about complexity, I'm not sure if this PR avoids the complexity from before, due to the deprecation (which may lead to another name swap + deprecation from make_iterator_ng back to make_iterator?).

I agree with you up to this point.

The best option [...]

Your idea here doesn't sound bad to me, but I believe @wjakob should have the last word.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also quite like the idea of fixing the actual problem, rather than adding onto the confusion. Then again, I can see the point that this "only" fixes the docstring (and you could already use py::arg_v to do so I believe?).

At any rate, if we're thinking about making a breaking 3.0.0 release, it would be cool if we could include a fixed version. A potential route would be to first use a new name + deprecation, then rename in a (distant future) pybind11 3.0.0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then again, I can see the point that this "only" fixes the docstring

This PR only fixes annotation of return type. Functions with typed iterator argument still need to use type-erased py::iterator. I don't think py::arg_v can help here.

I like the idea of py::iterator_t<ValueType>. The only issue I think of is overloads support. Dispatcher needs to tell the difference between f(py::iterator_t<int>) vs f(py::iterator_t<bool>). I think it's not possible to tell which is which until the actual iteration happens, unlike in case of py::array_t. This issue can be "avoided" by banning overloads that different only in py::iterator_t argument(s). Alternatively, we can make a scaffold to extract first element (which still might be not enough) to dispatch and then iterate using mocked-up iterator, but this sounds like an implementation nightmare and potential nasty debug sessions for users.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR only fixes annotation of return type. Functions with typed iterator argument still need to use type-erased py::iterator. I don't think py::arg_v can help here.

Right, jumped to conclusions too quickly, there! We don't actually have a py::ret_v, I suppose, to support a manual/custom return type annotation?

I like the idea of py::iterator_t<ValueType>. The only issue I think of is overloads support.

I thought we already have something like this (where the internal types are only checked at runtime), but I can't find it. This will always be a problem, though, so I wouldn't be opposed to just add it and document that it won't do anything for overloads (but it would do the casting for you, once you ask the next element). pybind11 has more corners where overloads aren't perfect and order matters, so ...
Extracting the first element would be reasonably unexpected, I believe. And as you say: it wouldn't provide any guarantees, so let's have users do that themselves, if they really want to disambiguate.

@YannickJadoul
Copy link
Collaborator

Interestingly, binary size is not affected by this change

How many instantiations of py::iterator are in this test? I would assume the binary size to go up if many different iterator types are instantiated?

@sizmailov
Copy link
Contributor Author

How many instantiations of py::iterator are in this test? I would assume the binary size to go up if many different iterator types are instantiated?

Can't tell from release binary. Everything related to pybnd11 seems to be optimized completely 😕

$ readelf -Ws pybind11_tests.cpython-37m-x86_64-linux-gnu.so | c++filt | grep pybind
   387: 00000000001f2170   153 FUNC    GLOBAL DEFAULT   12 std::_Function_base::_Base_manager<pybind11::detail::type_caster<std::function<void ()>, void>::load(pybind11::handle, bool)::func_wrapper>::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation)
   437: 000000000027a500    38 OBJECT  GLOBAL DEFAULT   14 typeinfo name for pybind11::detail::broadcast_trivial
   445: 00000000002485b0   153 FUNC    GLOBAL DEFAULT   12 std::_Function_base::_Base_manager<pybind11::detail::type_caster<std::function<int (int)>, void>::load(pybind11::handle, bool)::func_wrapper>::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation)
   456: 0000000000248470   153 FUNC    GLOBAL DEFAULT   12 std::_Function_base::_Base_manager<pybind11::detail::type_caster<std::function<void (int)>, void>::load(pybind11::handle, bool)::func_wrapper>::_M_manager(std::_Any_data&, std::_Any_data const&, std::_Manager_operation)
   457: 0000000000316800    16 OBJECT  GLOBAL DEFAULT   20 typeinfo for pybind11::detail::broadcast_trivial
   467: 0000000000260ac0   343 FUNC    GLOBAL DEFAULT   12 PyInit_pybind11_tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants