Make validating assignment work properly with allowed extra #766

dmontagu · 2023-07-12T22:45:06Z

Fixes pydantic/pydantic#6613. No changes are necessary on the pydantic side, though I will open a PR adding a test.

Many of the changes here were just removing unnecessary .iter()s to get make to run without errors after I ran rustup update; it seems there are some new clippy lints.

dmontagu · 2023-07-12T22:46:20Z

src/validators/model_fields.rs

+        let new_extra = match &self.extra_behavior {
+            ExtraBehavior::Allow => {
+                let non_extra_data = PyDict::new(py);
+                self.fields.iter().for_each(|f| {
+                    let popped_value = PyAny::get_item(new_data, &f.name).unwrap();
+                    new_data.del_item(&f.name).unwrap();
+                    non_extra_data.set_item(&f.name, popped_value).unwrap();
+                });
+                let new_extra = new_data.copy()?;
+                new_data.clear();
+                new_data.update(non_extra_data.as_mapping())?;
+                new_extra.to_object(py)
+            }
+            _ => py.None(),
+        };


I feel there must be a better way to achieve this — @davidhewitt maybe you have some suggestions?

The idea — I'm trying to take new_data, which will have all key-value pairs for fields and extra, and split them into two dicts — new_data which has only field values, and new_extra which has everything else.
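
For illustration only, here is a minimal Python sketch of the split described above. The actual change is the Rust code in the diff; `split_fields_and_extra` and `field_names` are hypothetical names that do not appear in the PR.

```python
from typing import Any


def split_fields_and_extra(
    new_data: dict[str, Any], field_names: list[str]
) -> dict[str, Any]:
    """Leave only declared fields in `new_data` (in place); return the rest.

    Hypothetical helper mirroring the Rust logic in the diff: pop every known
    field, treat whatever remains as extra, then restore the fields.
    """
    # Move the declared fields out of new_data.
    non_extra_data = {name: new_data.pop(name) for name in field_names}
    # Whatever is left was not a declared field, so it is extra.
    new_extra = dict(new_data)
    # Rebuild new_data in place so existing references to it stay valid.
    new_data.clear()
    new_data.update(non_extra_data)
    return new_extra


data = {'field_a': 'test', 'other_field': 456}
extra = split_fields_and_extra(data, ['field_a'])
print(data)   # {'field_a': 'test'}
print(extra)  # {'other_field': 456}
```

Like the `.unwrap()` calls in the Rust version, this sketch assumes every declared field is present in `new_data`; a missing field raises `KeyError` here.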

adriangb · 2023-07-12

This looks pretty sound to me on the whole.

davidhewitt · 2023-07-13T09:32:04Z

You're modifying new_data, so I don't think you need to create two new dicts. The challenge seems to be that you only have the list of known fields, so you are forced to remove them from new_data.

How about swapping the binding over like this:

Suggested change

-        let new_extra = match &self.extra_behavior {
-            ExtraBehavior::Allow => {
-                let non_extra_data = PyDict::new(py);
-                self.fields.iter().for_each(|f| {
-                    let popped_value = PyAny::get_item(new_data, &f.name).unwrap();
-                    new_data.del_item(&f.name).unwrap();
-                    non_extra_data.set_item(&f.name, popped_value).unwrap();
-                });
-                let new_extra = new_data.copy()?;
-                new_data.clear();
-                new_data.update(non_extra_data.as_mapping())?;
-                new_extra.to_object(py)
-            }
-            _ => py.None(),
-        };
+        let (new_data, new_extra) = match &self.extra_behavior {
+            ExtraBehavior::Allow => {
+                // Move non-extra keys out of new_data, leaving just the extra in new_data
+                let non_extra_data = PyDict::new(py);
+                for field in &self.fields {
+                    let popped_value = PyAny::get_item(new_data, &field.name).unwrap();
+                    new_data.del_item(&field.name).unwrap();
+                    non_extra_data.set_item(&field.name, popped_value).unwrap();
+                }
+                (non_extra_data, new_data.to_object(py))
+            }
+            // FIXME do you need to throw if `new_data` contains any extra keys?
+            _ => (new_data, py.None()),
+        };

dmontagu · 2023-07-13

I think we already check for (and error on) any possible source of extra keys earlier, so we can ignore that potential issue here. This looks good, and we can drop the comment.

dmontagu · 2023-07-13

Just kidding, this breaks some tests: in some places we currently assume that validate_assignment modifies the fields dict in place, but this change turns that dict into the extra dict. I'm not sure what the ideal behavior is here, but I am inclined to leave it as it was for now; we can revisit this in a separate PR if and when it seems worthwhile.
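
The breakage described here comes down to aliasing. The following Python sketch, with hypothetical function names that are not part of pydantic-core, shows why swapping the binding is visible to a caller that still holds a reference to the original dict:

```python
def split_in_place(new_data, field_names):
    """Original approach: new_data ends up holding only the fields."""
    fields = {name: new_data.pop(name) for name in field_names}
    extra = dict(new_data)
    new_data.clear()
    new_data.update(fields)
    return new_data, extra


def split_by_swapping(new_data, field_names):
    """Suggested approach: the original dict object becomes the extra dict."""
    fields = {name: new_data.pop(name) for name in field_names}
    return fields, new_data


caller_dict = {'field_a': 'test', 'other_field': 456}
fields, extra = split_in_place(caller_dict, ['field_a'])
print(fields is caller_dict)  # True: the caller's dict still holds the fields

caller_dict = {'field_a': 'test', 'other_field': 456}
fields, extra = split_by_swapping(caller_dict, ['field_a'])
print(extra is caller_dict)   # True: the caller's dict now holds only the extra
print(caller_dict)            # {'other_field': 456}
```

Any code that keeps using the dict it passed in as "the fields" behaves correctly with the first function and sees only the extra keys with the second.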

codecov · 2023-07-12T22:52:25Z

Codecov Report

Merging #766 (d39c591) into main (f5b804b) will increase coverage by 0.00%.
The diff coverage is 97.05%.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #766   +/-   ##
=======================================
  Coverage   93.67%   93.67%           
=======================================
  Files          99       99           
  Lines       14270    14290   +20     
  Branches       25       25           
=======================================
+ Hits        13367    13386   +19     
- Misses        897      898    +1     
  Partials        6        6

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/validators/model.rs | `98.03% <94.11%> (-0.35%)` | ⬇️ |
| src/validators/model_fields.rs | `98.42% <100.00%> (+0.10%)` | ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

codspeed-hq · 2023-07-12T22:55:20Z

CodSpeed Performance Report

Merging #766 will not alter performance

Comparing validate-assignment-extra (d39c591) with main (f5b804b)

Summary

✅ 126 untouched benchmarks

davidhewitt · 2023-07-13T08:59:39Z

> Many of the changes here were just removing unnecessary .iter()s to get make to run without errors after I ran rustup update; it seems there are some new clippy lints.

You are probably building with nightly rust - I suggest downgrading to stable :)

(@adriangb had the same issue yesterday)

davidhewitt · 2023-07-13T09:15:15Z

src/validators/model_fields.rs

        let fields_set: &PySet = PySet::new(py, &[field_name.to_string()])?;
-        Ok((new_data, py.None(), fields_set.to_object(py)).to_object(py))
+        Ok((new_data.to_object(py), new_extra, fields_set.to_object(py)).to_object(py))


The return type of validate_assignment in _pydantic_core.pyi is dict[str, Any], but here we return a 3-tuple.

a) Should we change the return type of validate_assignment in _pydantic_core.pyi?
b) Should we change the definition of validate_assignment in trait Validator to return (PyObject, PyObject, PyObject)? This would force all the Rust implementations to behave correctly and avoid needing to go in and out of a Python tuple until we get to the top level.

EDIT: based on what I see in the tests, I think the answer to both of these is "yes".

adriangb · 2023-07-13

> Should we change the definition of validate_assignment in trait Validator to return (PyObject, PyObject, PyObject)? This would force all the Rust implementations to behave correctly and avoid needing to go in and out of a Python tuple until we get to the top level.

Some historical perspective: validate_assignment and validate used to be the same thing; there was just a boolean flag being passed around internally to differentiate the "mode". So the fact that the return type is a PyObject instead of a tuple of PyObjects is likely just an artifact of that original implementation.

dmontagu · 2023-07-13

This validate_assignment is also used for dataclasses, and I think we need it to be a dict[str, Any] for that case. I updated the type hint to be a union. I agree this probably deserves some cleanup, but I think it's not as straightforward as "always return a tuple".
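
As a rough sketch of what such a union could look like from the Python side: the alias names and the `unpack_assignment_result` helper below are hypothetical, and the exact annotation in `_pydantic_core.pyi` may differ.

```python
from typing import Any, Optional, Union

# Hypothetical aliases for the two return shapes discussed in this thread.
FieldsDict = dict[str, Any]
ModelFieldsResult = tuple[dict[str, Any], Optional[dict[str, Any]], set[str]]
AssignmentResult = Union[FieldsDict, ModelFieldsResult]


def unpack_assignment_result(result: AssignmentResult) -> ModelFieldsResult:
    """Normalize either shape to (fields, extra, fields_set)."""
    if isinstance(result, tuple):
        # Model-fields case: already (fields, extra, fields_set).
        return result
    # Plain dict case: no extra dict; the empty set is this sketch's placeholder.
    return result, None, set()


print(unpack_assignment_result({'a': 1}))
# ({'a': 1}, None, set())
```

A caller that has to handle both shapes needs a branch like this one, which is part of why a single uniform return type would be simpler to work with.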

davidhewitt · 2023-07-13T09:32:53Z

tests/validators/test_model_fields.py

    assert v.validate_assignment({'field_a': 'test'}, 'other_field', 456) == (
-        {'field_a': 'test', 'other_field': 456},
-        None,
+        {'field_a': 'test'},
+        {'other_field': 456},
        {'other_field'},
    )


Based on this it looks like the type annotation in _pydantic_core.pyi needs to be updated to a 3-tuple.

adriangb · 2023-07-13T17:30:15Z

Approved, conditional on the feedback being addressed.

Commit 85589a4: Make validating assignment work properly with allowed extra

dmontagu commented Jul 12, 2023

dmontagu mentioned this pull request Jul 12, 2023: Add xfailing test for pydantic-core PR 766 pydantic/pydantic#6641 (Merged)

davidhewitt reviewed Jul 13, 2023

adriangb approved these changes Jul 13, 2023

dmontagu added 3 commits July 13, 2023 11:37:
c91609a Undo clippy-related changes
37fa39f Update _pydantic_core.pyi
d39c591 Merge main

dmontagu enabled auto-merge (squash) July 13, 2023 17:53

dmontagu merged commit 3f7c010 into main Jul 13, 2023
27 checks passed

dmontagu deleted the validate-assignment-extra branch July 13, 2023 17:59
