
Implicit conversion from Python string to C++ enum type #483

Closed
stukowski opened this issue Nov 4, 2016 · 9 comments · May be fixed by #1122

Comments

@stukowski
Contributor

Let me propose an extension:
Sometimes it would be nice if a string could be passed from a Python script to a C++ function where an enum parameter is expected. For example:

struct Pet {
    enum Kind {
        Dog = 0,
        Cat
    };
    Pet(Kind type) : type(type) { }
    Kind type;
};

Binding code:

auto pet = py::class_<Pet>(m, "Pet")
    .def(py::init<Pet::Kind>());

py::enum_<Pet::Kind>(pet, "Kind")
    .value("Dog", Pet::Kind::Dog)
    .value("Cat", Pet::Kind::Cat);

Python:

>>> p = Pet(Pet.Cat) # Works
>>> p = Pet("Cat")   # Would like to make this work too

In principle it is possible to add optional support for this implicit conversion from a Python string to the enum type. I propose to add the following method to the pybind11::enum_ class:

template <typename Type> class enum_ : public class_<Type> {
    ...

    /// Set up automatic Python string -> C++ enum conversion.
    enum_& enable_conversion_from_string() {
        auto entries = m_entries;
        def("__init__", [entries](Type& v, const char* str) {
            for (const auto& item : *entries) {
                if (!strcmp(str, item.second)) {
                    v = (Type) item.first;
                    return;
                }
            }
            std::string tname = typeid(Type).name();
            detail::clean_type_id(tname);
            throw value_error("\"" + std::string(str) + "\" is not a valid value for enum type " + tname);
        });
        implicitly_convertible<str, Type>();
        return *this;
    }
    ...
};

This would allow writing binding codes like this:

py::enum_<Pet::Kind>(pet, "Kind")
    .value("Dog", Pet::Kind::Dog)
    .value("Cat", Pet::Kind::Cat)
    .enable_conversion_from_string();

There is only one problem with this solution: The exception thrown by the conversion constructor when the string does not match any of the enum entries is swallowed by pybind11. Instead, a general cast error is raised because the conversion from string to the enum type failed.

I am wondering what you think about this idea. Does it make sense to offer an automatic conversion like this? At least in my project there is one situation where an automatic conversion from a string is desirable.

By the way: the pybind11::enum_ class constructor registers two __init__ methods: one initializing the value via assignment, the other via placement new. I think one of them is redundant. Furthermore, I noticed that the std::unordered_map allocated by the constructor is never freed. Would it make sense to use a shared_ptr<unordered_map> here so that the map eventually gets destroyed?

@jagerman
Member

jagerman commented Nov 4, 2016

The second __init__ indeed looks weird, and I don't think it will ever actually be invoked: pybind11 looks for overloads to call based on their signatures, and these two have identical signatures. It was added in 6fb4849 along with other changes to fix a problem, but I don't think the second constructor was actually part of the fix.

The non-deleted pointer is intentional: registration of classes/enums is permanent, and that pointer needs to stay alive for as long as the enum is registered (i.e., forever), because __repr__ uses it. The enum_ instance itself only persists during the initialization code, but its effects, which depend on that unordered map, remain forever.

@jagerman
Member

jagerman commented Nov 4, 2016

One other suggestion: the use of const char* as the value type in the map in the current code (and even more so in your code) seems to be complicating the code; changing it to std::string would be a decent cleanup. That would change __repr__ to:

        def("__repr__", [name, entries](Type value) -> std::string {
            auto it = entries->find((UnderlyingType) value);
            return name + (it == entries->end() ? std::string(".???") : "." + it->second);
        });

and would change your code to take a std::string and then use a regular == comparison instead of strcmp.

Another minor comment:

std::string tname = typeid(Type).name();
detail::clean_type_id(tname);
throw value_error("\"" + std::string(str) + "\" is not a valid value for enum type " + tname);

the get-name-and-clean-it step is already available via pybind11::type_id<Type>(), so, with the change of str to std::string, you can replace all of the above with:

throw value_error("\"" + str + "\" is not a valid value for enum type " + type_id<Type>());

Edit: actually, type_id<Type>() isn't really what you want here: you actually want the name of the Python type, so you'd need to store the name given to the constructor in a private member and then capture it by value in your added lambda.

@jagerman
Member

jagerman commented Nov 4, 2016

(I also suggest you code this up, add some tests for it, and submit a PR; that will make it a bit easier to discuss the code and see it in action.)

@wjakob
Member

wjakob commented Nov 6, 2016

All agreed: please resubmit the modified version as a PR, and we can take it from there. I will close this ticket, since this is a feature request rather than a bug in pybind11.

@wjakob wjakob closed this as completed Nov 6, 2016
@stukowski
Contributor Author

Thank you all for the suggestions. I'll submit a PR as soon as I find time to code this up.

As I already mentioned, I don't know yet how to generate a meaningful error from this conversion constructor. pybind11 swallows any exception thrown by the constructor function and generates a general casting error message instead. So the nice error message generated by my function never makes it to the Python caller. If you have any idea how to inform the user that his input string did not match any of the registered enum values, let me know.

@jagerman: My comment on the non-deleted map pointer was based on my (wrong) assumption that the enum_ instance (and with it all the lambdas it has registered) would get destroyed on Py_Finalize(). I have now learned that this is not the case and that the memory leak is intentional. Of course, there is no point in using a shared_ptr then.

@wjakob
Member

wjakob commented Nov 7, 2016

pybind11 swallows any exception thrown by the constructor function and generates a general casting error message instead.

This sounds quite mysterious to me. Why don't you submit your PR with that issue still present, and we can take a look?

@YannickJadoul
Collaborator

Has there been any progress on this pull request? If @stukowski is not planning to make one, I might try to implement the proposed solution such that I'm able to use it in my project.

@TD22057
Contributor

TD22057 commented Sep 28, 2017

+1, I'd really like this feature as well. Our current Python code generator does this, and it's really useful. I can't find an open or closed PR for this anywhere.

@ezyang

ezyang commented Sep 5, 2022

Using #1176 I was able to achieve this effect by defining a type_caster directly on the enum type while inheriting from type_caster_base. Example:

template <>
struct type_caster<c10::DispatchKey> : public type_caster_base<c10::DispatchKey> {
  using base = type_caster_base<c10::DispatchKey>;
  c10::DispatchKey tmp;

 public:
  bool load(handle src, bool convert) {
    if (base::load(src, convert)) {
      return true;
    } else if (py::isinstance(src, py::module_::import("builtins").attr("str"))) {
      tmp = c10::parseDispatchKey(py::cast<std::string>(src));
      value = &tmp;
      return true;
    }
    return false;
  }

  static handle cast(
      c10::DispatchKey src,
      return_value_policy policy,
      handle parent) {
    return base::cast(src, policy, parent);
  }
};

Unfortunately, you will have to implement the string parsing yourself somehow; I had a helper C++ function which I used.

I'm not entirely sure this doesn't leak, as I'm doing a dynamic allocation in order to get a value of the correct form.

EDIT: Unfortunately this leaks memory

EDIT 2: Updated with a non-leaking version, though IDK how kosher it is. I know that converters have some safety to avoid returning temporaries so hoping this is kosher.

ezyang added a commit to pytorch/pytorch that referenced this issue Sep 7, 2022
There are a number of situations where we have non-backend kernels (e.g., CompositeImplicitAutograd, batching rules) which we would like to port to Python, but we have no way to integrate these ports with the overall system while otherwise using preexisting C++ registrations. This PR changes that by introducing a Python dispatcher (which can have its own kernels directly in Python) that can interpose on ordinary C++ dispatch. The ingredients:

* We introduce a new PythonDispatcher dispatch key that has the same tenor as FuncTorchDynamicLayerFrontMode: it works by getting triggered *before* every other dispatch key, shunting to a Python implementation.
* The Python dispatcher is a per-interpreter global object that is enabled/disabled via the guards EnablePythonDispatcher/DisablePythonDispatcher. We don't make it compositional, as I have no idea what a compositional version of this feature would look like. Because it is global, we don't need to memory-manage it, so I use a simpler SafePyHandle (newly added) to control access to this pointer from non-Python C++. Like `__torch_dispatch__`, we use PyInterpreter to get to the Python interpreter to handle the dispatch.
* I need to reimplement dispatch table computation logic in Python. To do this, I expose a lot more helper functions for doing computations on alias dispatch keys and similar. I also improve the pybind11 handling for DispatchKey so that you can accept either the pybind11-bound enum or a string; this simplifies our binding code. See pybind/pybind11#483 (comment) for how this works; the technique is generally useful.
* I need to be able to call backend fallbacks. I do this by permitting you to call at a dispatch key which doesn't have a kernel for the operator; if the kernel doesn't exist, we check the backend fallback table instead.

One missing piece is how to integrate this with voz's infra for Python operators.

Signed-off-by: Edward Z. Yang <[email protected]>
