Derive Macro for IntoPyDict #3350

vidur2 · 2023-07-28T18:10:21Z

Create derive macro for IntoPyDict trait. Derives all primitives and std::collections except for BTreeMap and HashMap. Currently does not work on enums or tuple structs, but support will be added later.

mejrs

Thanks for your contribution :)

Before doing an in depth review, I'd like to see the following addressed:

Can you give a motivating example for this? Is this useful to you personally, and how would you use it? Think in terms of another user seeing this and asking themselves "should i implement this for my structs? Is this useful?"
Don't use &str and/or String. Instead use syn's types and compose code with the quote! macro, like our other macro code does.
Don't panic in macro code, the user experience with this is really bad. Instead use syn::Error and syn::Result.

tests/test_intopydict.rs

davidhewitt

Thank you for working on this functionality.

Overall I'm ok with having this, I've seen lots of manual implementations of to_dict so having a way to make to make this easier is welcome. An alternative is #1703 which could add a py_dict! macro, that's still more boilerplate than a derive.

The biggest design question in my head is what happens when a #[pyclass] implements IntoPyDict and is included in a sub-structure. e.g.

#[pyclass]
#[derive(IntoPyDict)]
struct A { }

#[pyclass]
#[derive(IntoPyDict)]
struct B {
    pub a: A,
}

Should IntoPyDict do a deep conversion into Python dictionaries, or should it be shallow? In the example above, should IntoPyDict for B result in {'a': {} } or {'a': <object of type A>}?

Is there a right answer, or should this be configurable?

How does this macro interact with other PyO3 options? I can imagine at least #[pyo3(crate = ...)] and #[pyo3(name = "...")] would affect this.

vidur2 · 2023-07-30T16:34:32Z

Python 3.7 added the dataclasses module which enabled users of the language to write classes in a similar syntax to rust structs or typescript types. In addition to being syntactic sugar over the init dunder, it also came with an asdict function which could turn any dataclass into a dictionary. If a field of a dataclass was a dataclass, then it would turn that field into a dataclass. If it was just a normal class then it would do a shallow copy. I suggest we follow this format (though I do not have it implemented this way yet). Since in every case, the composed struct must implement IntoPyDict, it would be a deepcopy. If someone wanted a shallow copy, they could implement IntoPyDict on the struct thesemselves. The official documentation for dataclass's asdict can be found here. I haven't tested the macro interaction for this.

vidur2 · 2023-07-30T18:04:21Z

I don't think it would be affected by pyo3 options because it directly takes the field names of the class as the dictionary keys. Is this a wrong designwise?

vidur2 · 2023-07-30T20:18:41Z

Im having a little trouble understanding what #[pyo3(crate=...)] does. Do you mind explaining it to me?

…t used

davidhewitt

The purpose of #[pyo3(crate = "foo")] is to redirect the macro to use foo instead of pyo3 as the top-level symbol. There are several reasons to do this; within test_hygiene we use this to run the macros within pyo3's lib test suite. Users might have pyo3 as a transient dependency e.g. foo::pyo3.

The way I suggest the macro should handle it is emit an outer const _: () = { use pyo3 as _pyo3; <impl> }; structure, where <impl> is the derive macro and then can use _pyo3 everywhere internally.

At the moment this implementation is significantly different from the rest of our macro machinery. That's not the end of the world, but I do think there are a few edge cases which will need to be addressed which would be helped by using the existing machinery.

Also, the implementation definitely needs to move from parsing type names as strings to emitting code which gets the trait solver to work out the final details. Users can freely rename types or add new types etc, so parsing type names is generally something we use as a last resort rather than something core to the implementation.

davidhewitt · 2023-08-02T06:56:36Z

pyo3-macros-backend/src/intopydict.rs

+const COL_NAMES: [&str; 8] = [
+    "BTreeSet",
+    "BinaryHeap",
+    "Vec",
+    "HashSet",
+    "LinkedList",
+    "VecDeque",
+    "BTreeMap",
+    "HashMap",
+];


I feel quite strongly that instead of special-casing this macro on names of types the implementation should emit code based on traits. In this case, if we go with the proposal of deep IntoPyDict, we should use autoref-specialization which prefers IntoPyDict if implemented and otherwise falls back on IntoPy<PyObject>.

Would this apply to collections as well (ie Vecs, Sets etc)? These would get stored as a PyObject instead of a PyList?

And how do you recommend I use autoref specialization? I don't have much experience with it. Is there a crate which makes it easier?

A separate crate usually doesn't help because abstracting the pattern away is often more complex than just applying it.

In this case, the general technique is described at https://github.com/dtolnay/case-studies/blob/master/autoref-specialization/README.md and you can find an example of its application in PyO3's existing macros by looking for usages of the type PyClassImplCollector defined at

pyo3/src/impl_/pyclass.rs

Line 109 in ae982b8

pub struct PyClassImplCollector<T>(PhantomData<T>);

and for example its relation to the trait PyClassNewTextSignature defined at

pyo3/src/impl_/pyclass.rs

Line 995 in ae982b8

pub trait PyClassNewTextSignature<T> {

Should I add the specialization feature as a crate attribute?

No; the specialization feature is incomplete and unsound. Autoref specialization is a technique which works on stable Rust to get a specialization-like result for some limited cases.

There seems to be some issues with reference moving when I try to do autoref specialization. Sorry it took me a while for this, I was finishing up an internship

tests/test_intopydict.rs

davidhewitt · 2023-08-02T07:00:38Z

pyo3-macros-backend/src/intopydict.rs

+
+    for (i, char_val) in binding.chars().enumerate() {
+        if char_val == ',' && stack.is_empty() {
+            if binding[start..i].starts_with('#') {


I think you might want to take a look at take_pyo3_options to see how the rest of the crate parses this.

tests/test_intopydict.rs

mejrs

My comment regarding strings still stands. You shouldn't be parsing a single String, &str or char in this code. Use the appropriate syn types instead. In general: when writing macros if you feel you have to use strings you are either using quote/syn wrong or you just haven't found the right function/method yet.

mejrs · 2023-08-04T13:21:59Z

pyo3-macros/src/lib.rs

+    check_type(&ast).unwrap();
+    let ident = ast.ident.into_token_stream();
+    let clause_wrapped = ast.generics.where_clause.clone();
+    let mut where_clause: TokenStream2 = TokenStream2::new();
+    let generic_params: TokenStream2 = parse_generics(&ast.generics).parse().unwrap();
+    let generics = ast.generics.into_token_stream();


These and the other unwraps should use into_compile_error to create errors instead.

mejrs · 2023-08-04T13:56:47Z

pyo3-macros-backend/src/intopydict.rs

+    }
+}
+
+pub fn check_type(input: &DeriveInput) -> syn::Result<()> {


This function should not exist. Instead match on the DeriveInput and forward to a function that handles structs specifically. Something like...

pub fn handle_deriveinput(input: &DeriveInput) -> syn::Result<???> { match input.data { syn::Data::Struct(strukt @ syn::Datastruct { fields: syn::Fields::Named(_), ..}) => { handle_struct(strukt) } _ => Err(syn::Error::new( // span, "IntoPyDict derive is only possible for structs with named fields", )) } } pub fn handle_struct(input: &syn::DataStruct) -> syn::Result<???>{ // ... }

Instead of having a bunch of "check" functions, parse, don't validate. Did you spot the bug/oversight in your code that this code does not have? What about structs without fields? 🙃

mejrs · 2023-08-04T14:01:13Z

pyo3-macros-backend/src/intopydict.rs

+
+#[derive(Debug, Clone)]
+pub struct Pyo3DictField {
+    name: String,


This should probably be an Ident or &Ident, not String.

mejrs · 2023-08-04T14:04:03Z

pyo3-macros/src/lib.rs

+    let ident = ast.ident.into_token_stream();
+    let clause_wrapped = ast.generics.where_clause.clone();
+    let mut where_clause: TokenStream2 = TokenStream2::new();
+    let generic_params: TokenStream2 = parse_generics(&ast.generics).parse().unwrap();
+    let generics = ast.generics.into_token_stream();
+
+    if let Some(clause) = clause_wrapped {
+        where_clause = clause.into_token_stream();
+    }
+    let mut dict_fields: Pyo3Collection = Pyo3Collection(Vec::new());
+    for token in item {
+        let token_stream: syn::__private::TokenStream = token.into();
+        dict_fields += parse_macro_input!(token_stream as Pyo3Collection);
+    }


Don't go back and forth between syn types, tokenstreams and strings here and elsewhere. Parse it into DeriveInput, once, then just write functions that take that or the types contained in it.

mejrs · 2023-08-04T14:14:29Z

pyo3-macros-backend/src/intopydict.rs

+pub fn parse_generics(generics: &Generics) -> String {
+    if !generics.params.is_empty() {
+        let mut generics_parsed = "<".to_string();
+
+        for param in &generics.params {
+            match param {
+                syn::GenericParam::Lifetime(lt) => {
+                    generics_parsed += ("'".to_string() + &lt.lifetime.ident.to_string()).as_str()
+                }
+                syn::GenericParam::Type(generic_type) => {
+                    generics_parsed += generic_type.ident.to_string().as_str()
+                }
+                syn::GenericParam::Const(const_type) => {
+                    generics_parsed +=
+                        ("const".to_string() + const_type.ident.to_string().as_str()).as_str()
+                }
+            }
+
+            generics_parsed += ",";
+        }
+
+        generics_parsed = generics_parsed[0..generics_parsed.len() - 1].to_string();
+        generics_parsed += ">";
+        generics_parsed
+    } else {
+        String::new()
+    }
+}


Syn has a function for doing this.

Thank you! Ill keep that in mind

vidur2 · 2023-08-14T03:02:52Z

Im finding that I cannot parse the pyo3 options when the struct is annotated with a #[pyclass] attribute macro. Is there an alterantive way to do this within pyo3's macro system instead of that provided by syn

davidhewitt · 2023-08-14T05:26:31Z

Not totally sure what the problem you're encountering is, probably it's related to the fact that one of the macros will be running on the result of the other.

The solution might be to detect this case and run both expansions at the same time.

davidhewitt · 2024-02-06T08:38:33Z

Thanks for the contributions here. It seems like this work has stalled, and I'm doing a bit of housekeeping this morning so I'm going to close this PR for now.

I think there are still open design questions and the implementation needed further work. If anyone is interested in continuing this please comment and we can discuss design.

Vidur Modgil and others added 21 commits July 27, 2023 21:15

better

f64aa86

better

c35b598

better

077e470

finished non generic macro

c87345d

macro exapnsion works

533992d

cleanup

b77e9e1

Merge branch 'PyO3:main' into main

2362538

more stuff

b86bf14

tests better?

66a3442

committing files

4056990

pre merge

c4bf9db

Merge remote-tracking branch 'unforked/main'

774864e

added to newsfragments directory

b9215d4

initial

3d8e257

updated trait

fd1d6f4

added formatting

40c7636

better?

6e5601b

formatted

e0264c3

changed crate structure

509942e

formatting

25f28ca

better?

70468f3

mejrs requested changes Jul 28, 2023

View reviewed changes

tests/test_intopydict.rs Outdated Show resolved Hide resolved

Vidur Modgil added 8 commits July 28, 2023 15:25

better?

a047c3d

trying to fix ci

afb6f12

more ci fixes

67891b6

fixed tmp issues

31be92d

better?

e6f18ba

removed clippy error

b26e64d

changed to pass msvri test

7e08475

works

33f09a3

Vidur Modgil added 3 commits July 28, 2023 20:17

fixed clippy issue

3065ba6

put final body into tokens

14f8b01

better

8a933b7

vidur2 requested a review from mejrs July 29, 2023 14:28

Vidur Modgil added 9 commits July 29, 2023 10:53

error handling for unallowed types

be9a6a6

more extensive error handling

81bd87d

fixed double generic parsing issue

31b4e7f

edited error message to be clearer

08366df

added more error handling

d014295

added compile test

dd594b8

added test suite for invalid

3f35396

fixed invalidation tests

896f483

reorg of testing suit

8385b1b

davidhewitt reviewed Jul 30, 2023

View reviewed changes

added support for py name option

f89ba6a

Vidur Modgil added 4 commits July 30, 2023 17:26

less convoluted

1415231

map code better

ee19677

Merge branch 'main' of https://github.com/PyO3/pyo3

2adfbee

removed duplicate test compile fail

20c90c4

vidur2 requested a review from davidhewitt July 31, 2023 13:37

removed need to imoprt pyo3::types::IntoPyDict outside of macro if no…

d8bc842

…t used

davidhewitt requested changes Aug 2, 2023

View reviewed changes

mejrs reviewed Aug 4, 2023

View reviewed changes

davidhewitt closed this Feb 6, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Derive Macro for IntoPyDict #3350

Derive Macro for IntoPyDict #3350

vidur2 commented Jul 28, 2023 •

edited

Loading

mejrs left a comment

davidhewitt left a comment •

edited

Loading

vidur2 commented Jul 30, 2023 •

edited

Loading

vidur2 commented Jul 30, 2023

vidur2 commented Jul 30, 2023 •

edited

Loading

davidhewitt left a comment

davidhewitt Aug 2, 2023

vidur2 Aug 2, 2023

vidur2 Aug 3, 2023

adamreichold Aug 3, 2023 •

edited

Loading

vidur2 Aug 3, 2023

davidhewitt Aug 4, 2023

vidur2 Aug 6, 2023

davidhewitt Aug 2, 2023

mejrs left a comment •

edited

Loading

mejrs Aug 4, 2023

mejrs Aug 4, 2023

mejrs Aug 4, 2023

mejrs Aug 4, 2023

mejrs Aug 4, 2023

vidur2 Aug 5, 2023

vidur2 commented Aug 14, 2023

davidhewitt commented Aug 14, 2023

davidhewitt commented Feb 6, 2024

Derive Macro for IntoPyDict #3350

Derive Macro for IntoPyDict #3350

Conversation

vidur2 commented Jul 28, 2023 • edited Loading

mejrs left a comment

Choose a reason for hiding this comment

davidhewitt left a comment • edited Loading

Choose a reason for hiding this comment

vidur2 commented Jul 30, 2023 • edited Loading

vidur2 commented Jul 30, 2023

vidur2 commented Jul 30, 2023 • edited Loading

davidhewitt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adamreichold Aug 3, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mejrs left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vidur2 commented Aug 14, 2023

davidhewitt commented Aug 14, 2023

davidhewitt commented Feb 6, 2024

vidur2 commented Jul 28, 2023 •

edited

Loading

davidhewitt left a comment •

edited

Loading

vidur2 commented Jul 30, 2023 •

edited

Loading

vidur2 commented Jul 30, 2023 •

edited

Loading

adamreichold Aug 3, 2023 •

edited

Loading

mejrs left a comment •

edited

Loading