-
-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(project): Deserializable
derive macro
#1564
Conversation
✅ Deploy Preview for biomejs canceled.
|
@@ -18,7 +18,6 @@ formatter_extraneous_field.json:3:3 deserialize ━━━━━━━━━━ | |||
- enabled | |||
- formatWithErrors | |||
- indentStyle | |||
- indentSize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I let the macro filter out deprecated keys, since even though they're technically accepted, I don't think they're a very useful suggestion to the user.
}) | ||
DeserializationDiagnostic::new( | ||
"You enabled the VCS integration, but you didn't specify a client.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check here is an example of a use case that could be solved through a DeserializableValidator
trait. If a struct contains a #[deserializable(with_validator)]
annotation, we could let the generated deserializer call into a validate()
function that would have the ability to reject the deserialized instance. But I haven't built that yet :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you notice other code that requires a validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for instance LineWidth
which checks whether the u16
is within its own MIN
and MAX
. And also a lot of the types in the generated rules.rs
have a similar kind of validation logic.
mod linter; | ||
mod organize_imports; | ||
mod overrides; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the rules
submodule below gets cleaned up, which requires letting the generated rule structs also tap into the macro, it might make sense to just get rid of the configuration::parse
module altogether. The few remaining manual deserializers could easily be implemented along the structs they implement.
CodSpeed Performance ReportMerging #1564 will not alter performanceComparing Summary
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first glance it looks good! I haven't taken the time to check the implementation yet. I left some suggestions for the docs.
@@ -183,200 +206,6 @@ assert_eq!(deserialized, None); | |||
assert_eq!(diagnostics..len(), 1); | |||
``` | |||
|
|||
### Deserializing an enumeration of values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should keep this doc. This is an important resource that allows understanding how this works under the hood. This also allows implementing an exotic deserializer that is not covered by the derive macro.
We could change the title to Implementing a deserializer for an enum
. We could also add a note in a first paragraph telling that we now have a derive macro to generate a deserializer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mostly deleted this because I don't think it makes sense to maintain the code snippets so they stay in sync with the macro. Rust Analyzer offers inspection of macros, so you can easily peek under the hood if you want to.
But indeed I also removed some explanatory text, which I will try to reintroduce in way that makes more sense with the new situation. I left the section for deserializing unions in place, and I just notice there is actually now an easier way to implement that one too (by checking is_type()
instead of implementing a visitor), so I will update that one and try to retrofit most of the explanatory text there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to keep at least one example to manually implement the deserialisation, like serde
does.
} | ||
``` | ||
|
||
### Deserializing a struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could change the title to Implementing a deserializer for a struct
. We could also add a note in a first paragraph telling that we now have a derive macro to generate a deserializer.
allowed_invalid_roles: Vec<String>, | ||
allow_invalid_roles: Vec<String>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the name of the property was changed? This will break user config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The JSON property it maps to is allowInvalidRoles
, so I changed the name in the struct to match it. This should actually prevent breaking the user config.
Some extra comments / suggestions:
It is common in Rust of using the newtype idom to add data validation. If we have only a few valdiation case, we could certainly use this idiom and manually implement struct NonEmptyString(String);
impl NonEmptyString {
fn new(inner: String) -> Self {
assert!(!inner.is_empty(), "the string should not be empty");
Self(inner)
}
}
impl Deserializable for NonEmptyString {
fn deserialize(value: &impl DeserializableValue, name: &str, diagnostics: &mut Vec<DeserializationDiagnostic>,) -> Option<Self> {
let result = Deserializable::deserialize(value, name, diagnostics);
if result.is_some_and(|result| result.is_empty()) {
// emit diagnostic
}
result
}
} |
I don't think that would be suitable for generating the type of diagnostics we want.
That would work too, although I actually quite like the attribute because it makes the behavior explicit. When a deserializer is implemented manually, there is a lot going on, and in a case like this it's easy to overlook why it's implemented manually. I only noticed it after tests started failing when I accidentally broke the functionality...
Yeah, I agree it's a good idea if we solve these through a more generic validation mechanism.
That's the plan!
I do think there's quite a few though, especially when considering the generated rule groups. It also feels a bit off that if you want validation, you need to implement a deserializer. Ideally it would like to keep the concerns separated, with deserialization covered as much by the macro as we can, while leaving validation to manual implementation. |
The Rust standard Marking something deprecated will generate an error in clippy |
You are right. This should certainly be separated.
|
I have to admit that I am on the fence here. The In the short-to-mean term, I would like to get rid of Maybe we could keep
I prefer the newtype idiom because it seems more idiomatic in Rust. This encodes constraints that the data must satisfy. #[derive(Deserializable)]
#[deserializable(validate = Options::validate)]
struct Options {
name: String,
}
impl Options {
fn validate(&self, diagnostics: Vec<...>) {
if self.name.is_empty() {
// emit diagnostic
}
}
} |
Maybe if we use a generic for the options in But yeah, if we don't need
Yeah, I think I will indeed go with the I was thinking of introducing a |
I can add that in this PR, I think, assuming CI won't complain :) |
Hmm, bummer. I added the |
That's not so simple because not every rule has options.
How the derive macro could know if the type implements the |
I was thinking to just add an attribute for that. |
I'll revert the |
Alright, when all is green it's time to merge :) |
Summary
This implements the
Deserializable
derive macro and replaces most of the manualDeserializable
implementations with the macro. I have implemented some attributes to cover various edge cases I ran into across the current manual implementations:deprecated
- Allows the generated deserializer to emit diagnostics when the field is set. It doesn't alter any other behavior of the deserializer, so I had to add amigrate_deprecated_fields()
method toLoadedConfiguration
that "migrates" theindent_size
usages toindent_width
.disallow_empty
- Some deserializers would report diagnostics if an empty value was set. This behavior can be achieved with the macro by adding adisallow_empty
attribute to the field.from_none
- I have already gotten rid of mostNoneState
usages in this PR, but for structs with a customDefault
, theDeserializable
derive can be instructed to initialize using theNoneState
instead ofDefault
. I expect this attribute (andNoneState
altogether) can be dropped when thePartial
derive is implemented in a follow-up PR.passthrough_name
- For passing down rule names (see docs).rename
- Similar to Serde'srename
attribute (and it even supports Serde's attributes for types that also deriveSerialize
/Deserialize
).Extended documentation about the macro is also in
biome_deserialize_macro/src/lib.rs
in the derive's doc comment.I'm also considering allowing users to manually implement a
validate()
function that would run after thedeserialize()
function. Probably through aDeserializableValidator
trait, which the derive macro can be made aware of through another attribute. This would allow the auto-generated deserializers inrules.rs
(and one or two others) to also be replaced with the derive macro.Test Plan
CI should remain green.