feat(project): `Deserializable` derive macro #1564

arendjr · 2024-01-14T20:10:05Z

Summary

This implements the Deserializable derive macro and replaces most of the manual Deserializable implementations with the macro. I have implemented some attributes to cover various edge cases I ran into across the current manual implementations:

deprecated - Allows the generated deserializer to emit diagnostics when the field is set. It doesn't alter any other behavior of the deserializer, so I had to add a migrate_deprecated_fields() method to LoadedConfiguration that "migrates" the indent_size usages to indent_width.
disallow_empty - Some deserializers would report diagnostics if an empty value was set. This behavior can be achieved with the macro by adding a disallow_empty attribute to the field.
from_none - I have already gotten rid of most NoneState usages in this PR, but for structs with a custom Default, the Deserializable derive can be instructed to initialize using the NoneState instead of Default. I expect this attribute (and NoneState altogether) can be dropped when the Partial derive is implemented in a follow-up PR.
passthrough_name - For passing down rule names (see docs).
rename - Similar to Serde's rename attribute (and it even supports Serde's attributes for types that also derive Serialize/Deserialize).
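
As a rough illustration of the migration step mentioned under deprecated, the sketch below moves a value set through the old key over to the new one. The struct is a hypothetical, simplified stand-in (the real method lives on LoadedConfiguration and its fields are not shown in this thread); only the names indent_size, indent_width and migrate_deprecated_fields() come from the description above.

```rust
/// Hypothetical, simplified stand-in for the formatter configuration.
#[derive(Debug, Default, PartialEq)]
struct FormatterConfiguration {
    /// Deprecated key, still accepted on input.
    indent_size: Option<u8>,
    /// Replacement for `indent_size`.
    indent_width: Option<u8>,
}

impl FormatterConfiguration {
    /// Moves a value set through the deprecated `indent_size` key over to
    /// `indent_width`, unless `indent_width` was set explicitly.
    fn migrate_deprecated_fields(&mut self) {
        if let Some(size) = self.indent_size.take() {
            if self.indent_width.is_none() {
                self.indent_width = Some(size);
            }
        }
    }
}
```

Keeping the migration separate from the deserializer matches the description: the deprecated attribute only emits a diagnostic, and the value is moved afterwards.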

Extended documentation about the macro is also in biome_deserialize_macro/src/lib.rs in the derive's doc comment.

I'm also considering allowing users to manually implement a validate() function that would run after the deserialize() function. Probably through a DeserializableValidator trait, which the derive macro can be made aware of through another attribute. This would allow the auto-generated deserializers in rules.rs (and one or two others) to also be replaced with the derive macro.

Test Plan

CI should remain green.

netlify · 2024-01-14T20:10:10Z

✅ Deploy Preview for biomejs canceled.

Name	Link
🔨 Latest commit	`4c8ec54`
🔍 Latest deploy log	https://app.netlify.com/sites/biomejs/deploys/65a83c3361549800086eee99

arendjr · 2024-01-14T20:17:08Z

crates/biome_service/tests/invalid/formatter_extraneous_field.json.snap

@@ -18,7 +18,6 @@ formatter_extraneous_field.json:3:3 deserialize ━━━━━━━━━━
  - enabled
  - formatWithErrors
  - indentStyle
-  - indentSize


I let the macro filter out deprecated keys, since even though they're technically accepted, I don't think they're a very useful suggestion to the user.
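
A minimal sketch of that filtering, assuming the generated code has some per-key record of whether a key is deprecated. KeyInfo and suggested_keys are hypothetical names for illustration, not the macro's actual output.

```rust
/// Hypothetical description of one key accepted by a generated deserializer.
struct KeyInfo {
    name: &'static str,
    deprecated: bool,
}

/// Returns the keys worth suggesting in an "unknown key" diagnostic:
/// deprecated keys are still accepted on input, but are not advertised.
fn suggested_keys(keys: &[KeyInfo]) -> Vec<&'static str> {
    keys.iter()
        .filter(|key| !key.deprecated)
        .map(|key| key.name)
        .collect()
}
```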

arendjr · 2024-01-14T20:19:22Z

crates/biome_service/src/configuration/parse/json/vcs.rs

-                })
+                DeserializationDiagnostic::new(
+                    "You enabled the VCS integration, but you didn't specify a client.",
+                )


The check here is an example of a use case that could be solved through a DeserializableValidator trait. If a struct contains a #[deserializable(with_validator)] annotation, we could let the generated deserializer call into a validate() function that would have the ability to reject the deserialized instance. But I haven't built that yet :)

Did you notice other code that requires validation?

Yeah, for instance LineWidth which checks whether the u16 is within its own MIN and MAX. And also a lot of the types in the generated rules.rs have a similar kind of validation logic.
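
To make the LineWidth case concrete, here is a sketch of a range check that lives on the newtype rather than in the deserializer. The bounds, the error type and the constructor name are assumptions for illustration; only the idea of checking a u16 against MIN and MAX comes from the comment above.

```rust
/// Simplified stand-in for the `LineWidth` newtype; the bounds are illustrative.
#[derive(Debug, PartialEq)]
struct LineWidth(u16);

impl LineWidth {
    const MIN: u16 = 1;
    const MAX: u16 = 320;

    /// Accepts the value only when it lies within `MIN..=MAX`.
    fn new(value: u16) -> Result<Self, String> {
        if (Self::MIN..=Self::MAX).contains(&value) {
            Ok(Self(value))
        } else {
            Err(format!(
                "The line width should be between {} and {}",
                Self::MIN,
                Self::MAX
            ))
        }
    }
}
```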

arendjr · 2024-01-14T20:21:26Z

crates/biome_service/src/configuration/parse/json/mod.rs

 mod linter;
-mod organize_imports;
-mod overrides;


I think if the rules submodule below gets cleaned up, which requires letting the generated rule structs also tap into the macro, it might make sense to just get rid of the configuration::parse module altogether. The few remaining manual deserializers could easily be implemented alongside the structs they deserialize.

codspeed-hq · 2024-01-14T20:47:26Z

CodSpeed Performance Report

Merging #1564 will not alter performance

Comparing arendjr:deserializable_derive (4c8ec54) with main (a46230d)

Summary

✅ 93 untouched benchmarks

Conaclos

At first glance it looks good! I haven't taken the time to check the implementation yet. I left some suggestions for the docs.

Conaclos · 2024-01-15T11:53:17Z

crates/biome_deserialize/README.md

@@ -183,200 +206,6 @@ assert_eq!(deserialized, None);
 assert_eq!(diagnostics.len(), 1);
 ```

-### Deserializing an enumeration of values


I think we should keep this doc. This is an important resource that allows understanding how this works under the hood. This also allows implementing an exotic deserializer that is not covered by the derive macro.

We could change the title to "Implementing a deserializer for an enum". We could also add a note in the first paragraph mentioning that we now have a derive macro to generate a deserializer.

I mostly deleted this because I don't think it makes sense to maintain the code snippets so they stay in sync with the macro. Rust Analyzer offers inspection of macros, so you can easily peek under the hood if you want to.

But indeed I also removed some explanatory text, which I will try to reintroduce in a way that makes more sense with the new situation. I left the section for deserializing unions in place, and I just noticed there is actually now an easier way to implement that one too (by checking is_type() instead of implementing a visitor), so I will update that one and try to retrofit most of the explanatory text there.
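
The idea of choosing a union variant by inspecting the value's type, rather than writing a visitor, can be sketched as follows. This does not use the real is_type() API, whose signature is not shown in this thread; Value, BoolOrString and deserialize_union are hypothetical names, and diagnostics are plain strings for brevity.

```rust
/// Hypothetical, minimal value model standing in for the real JSON value.
enum Value {
    Bool(bool),
    Str(String),
    Number(f64),
}

/// A union type that accepts either a boolean or a string.
#[derive(Debug, PartialEq)]
enum BoolOrString {
    Bool(bool),
    Str(String),
}

/// Picks the variant by inspecting the value's type instead of
/// implementing a visitor with one method per type.
fn deserialize_union(value: &Value, diagnostics: &mut Vec<String>) -> Option<BoolOrString> {
    match value {
        Value::Bool(b) => Some(BoolOrString::Bool(*b)),
        Value::Str(s) => Some(BoolOrString::Str(s.clone())),
        Value::Number(_) => {
            // Any other type is rejected with a diagnostic.
            diagnostics.push("Expected a boolean or a string".to_string());
            None
        }
    }
}
```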

It would be nice to keep at least one example to manually implement the deserialisation, like serde does.

Conaclos · 2024-01-15T11:53:56Z

crates/biome_deserialize/README.md

-}
-```
-
-### Deserializing a struct


We could change the title to "Implementing a deserializer for a struct". We could also add a note in the first paragraph mentioning that we now have a derive macro to generate a deserializer.

Conaclos · 2024-01-15T12:05:12Z

crates/biome_js_analyze/src/aria_analyzers/a11y/use_valid_aria_role.rs

-    allowed_invalid_roles: Vec<String>,
+    allow_invalid_roles: Vec<String>,


Why was the name of the property changed? Won't this break user configs?

The JSON property it maps to is allowInvalidRoles, so I changed the name in the struct to match it. This should actually prevent breaking the user config.
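
The answer above implies that the JSON key is derived from the Rust field name by camel-casing it. A small sketch of such a conversion, assuming plain ASCII snake_case field names (to_camel_case is a hypothetical helper, not the macro's actual code):

```rust
/// Converts a snake_case Rust field name to a camelCase key.
fn to_camel_case(field: &str) -> String {
    let mut key = String::with_capacity(field.len());
    let mut uppercase_next = false;
    for ch in field.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize the character after it.
            uppercase_next = true;
        } else if uppercase_next {
            key.extend(ch.to_uppercase());
            uppercase_next = false;
        } else {
            key.push(ch);
        }
    }
    key
}
```

Under this assumption the old field name would have produced allowedInvalidRoles, which is not the documented key, while the renamed field yields allowInvalidRoles.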

Conaclos · 2024-01-15T12:49:06Z

Some extra comments / suggestions:

Instead of creating our own deserializable(deprecated) attribute, could we reuse the rust standard deprecated attribute?
deserializable(passthrough_name) is only used once. I could keep a manual implementation of Deserializable for RuleWithOptions and remove the passthrough_name attribute
deserializable(disallow_empty) is only used twice. We could handle these cases via validation and thus drop deserializable(disallow_empty).
I hope we will get rid of from_none.

I'm also considering allowing users to manually implement a validate() function that would run after the deserialize() function. Probably through a DeserializableValidator trait, which the derive macro can be made aware of through another attribute. This would allow the auto-generated deserializers in rules.rs (and one or two others) to also be replaced with the derive macro.

It is common in Rust to use the newtype idiom to add data validation. If we have only a few validation cases, we could certainly use this idiom and manually implement Deserializable with a validation step.
Just as an example: to reject empty strings:

struct NonEmptyString(String);

impl NonEmptyString {
    fn new(inner: String) -> Self {
        assert!(!inner.is_empty(), "the string should not be empty");
        Self(inner)
    }
}

impl Deserializable for NonEmptyString {
    fn deserialize(
        value: &impl DeserializableValue,
        name: &str,
        diagnostics: &mut Vec<DeserializationDiagnostic>,
    ) -> Option<Self> {
        // Deserialize the inner `String` first, then validate it.
        let inner = String::deserialize(value, name, diagnostics)?;
        if inner.is_empty() {
            // emit diagnostic and reject the value
            return None;
        }
        Some(Self(inner))
    }
}

arendjr · 2024-01-15T18:55:04Z

Instead of creating our own deserializable(deprecated) attribute, could we reuse the rust standard deprecated attribute?

I don't think that would be suitable for generating the type of diagnostics we want.

deserializable(passthrough_name) is only used once. I could keep a manual implementation of Deserializable for RuleWithOptions and remove the passthrough_name attribute

That would work too, although I actually quite like the attribute because it makes the behavior explicit. When a deserializer is implemented manually, there is a lot going on, and in a case like this it's easy to overlook why it's implemented manually. I only noticed it after tests started failing when I accidentally broke the functionality...

deserializable(disallow_empty) is only used twice. We could handle these cases via validation and thus drop deserializable(disallow_empty).

Yeah, I agree it's a good idea if we solve these through a more generic validation mechanism.

I hope we will get rid of from_none.

That's the plan!

It is common in Rust to use the newtype idiom to add data validation. If we have only a few validation cases, we could certainly use this idiom and manually implement Deserializable with a validation step.

I do think there's quite a few though, especially when considering the generated rule groups. It also feels a bit off that if you want validation, you need to implement a deserializer. Ideally I would like to keep the concerns separated, with deserialization covered as much by the macro as we can, while leaving validation to manual implementation.

ematipico · 2024-01-15T19:34:34Z

The Rust standard deprecated attribute is for internal code, not user code.

Marking something deprecated will generate an error in clippy

Conaclos · 2024-01-16T11:18:35Z

The Rust standard deprecated attribute is for internal code, not user code.

You are right. This should certainly be separated.

Marking something deprecated will generate an error in clippy

indent_size should also be internally deprecated because it is a public interface for crate users...
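
For reference, this is roughly how the standard attribute behaves on a field. Every use of a deprecated item triggers rustc's `deprecated` lint, which is a warning by default and becomes a hard error wherever warnings are denied; internal code that still has to read the field needs an explicit allow. FormatterOptions and both helper functions are hypothetical.

```rust
/// Hypothetical, simplified configuration struct.
struct FormatterOptions {
    #[deprecated(note = "use `indent_width` instead")]
    indent_size: Option<u8>,
    indent_width: Option<u8>,
}

/// Builds options the way an old configuration would; writing the deprecated
/// field also triggers the lint, hence the `allow`.
#[allow(deprecated)]
fn legacy_options(indent_size: u8) -> FormatterOptions {
    FormatterOptions {
        indent_size: Some(indent_size),
        indent_width: None,
    }
}

/// Reading the deprecated field triggers the `deprecated` lint as well;
/// the `allow` silences it for this one function.
#[allow(deprecated)]
fn effective_indent_width(options: &FormatterOptions) -> Option<u8> {
    options.indent_width.or(options.indent_size)
}
```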

Conaclos · 2024-01-16T11:41:27Z

deserializable(passthrough_name) is only used once. I could keep a manual implementation of Deserializable for RuleWithOptions and remove the passthrough_name attribute

That would work too, although I actually quite like the attribute because it makes the behavior explicit. When a deserializer is implemented manually, there is a lot going on, and in a case like this it's easy to overlook why it's implemented manually. I only noticed it after tests started failing when I accidentally broke the functionality...

I have to admit that I am on the fence here.

The name property was designed to name the deserialized value in potential diagnostics. Ideally this should be localized information. By providing a special attribute to deviate from this, I am a bit afraid of sending the wrong signal. However, some users of the crate already use name to pass the filename...

In the short-to-medium term, I would like to get rid of PossibleOptions. PossibleOptions has several design problems. If we get rid of PossibleOptions, we no longer have the internal use for deserializable(passthrough_name).

Maybe we could keep deserializable(passthrough_name) for now and revisit the need for it later.

It is common in Rust to use the newtype idiom to add data validation. If we have only a few validation cases, we could certainly use this idiom and manually implement Deserializable with a validation step.

I do think there's quite a few though, especially when considering the generated rule groups. It also feels a bit off that if you want validation, you need to implement a deserializer. Ideally I would like to keep the concerns separated, with deserialization covered as much by the macro as we can, while leaving validation to manual implementation.

I prefer the newtype idiom because it seems more idiomatic in Rust. This encodes constraints that the data must satisfy.
If we have too much validation, we could, indeed, introduce an attribute. For example deserializable(validate = validate_function):

#[derive(Deserializable)]
#[deserializable(validate = Options::validate)]
struct Options {
    name: String,
}

impl Options {
    fn validate(&self, diagnostics: Vec<...>) {
        if self.name.is_empty() {
            // emit diagnostic
        }
    }
}

arendjr · 2024-01-16T18:47:58Z

In the short-to-medium term, I would like to get rid of PossibleOptions. PossibleOptions has several design problems. If we get rid of PossibleOptions, we no longer have the internal use for deserializable(passthrough_name).

Maybe if we use a generic for the options in RuleWithOptions that would solve the issue? I don't know what the implications of that for our JSON schema generation are though, so I'll leave that out of scope for this PR.

But yeah, if we don't need deserializable(passthrough_name) anymore, I'd be happy to remove it :)

I prefer the newtype idiom because it seems more idiomatic in Rust. This encodes constraints that the data must satisfy.
If we have too much validation, we could, indeed, introduce an attribute.

Yeah, I think I will indeed go with the NonEmptyString newtype for those use cases. And of course LineWidth is already a newtype, but there I want to extract the validation out of the deserializer. That way we can use the macro, while keeping the validation on the newtype as well. Only for complex custom types such as A11y I think it makes more sense to do the validation directly on the type than to introduce a newtype wrapper.

I was thinking of introducing a DeserializableValidator trait so that it is always clear what the signature of the method to implement is, and the IDE will be able to autocomplete it.
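
A sketch of what such a trait and the generated call into it might look like, using the VCS check from earlier in this thread as the example. The trait name comes from the discussion; the method signature, the finish helper, the struct fields and the use of plain strings as diagnostics are all assumptions made for illustration.

```rust
/// Hypothetical trait; the name comes from the discussion, the signature is assumed.
trait DeserializableValidator {
    /// Returns `false` to reject the deserialized instance,
    /// pushing the reason onto `diagnostics`.
    fn validate(&self, name: &str, diagnostics: &mut Vec<String>) -> bool;
}

/// Simplified stand-in for the VCS configuration.
struct VcsConfiguration {
    enabled: bool,
    client_kind: Option<String>,
}

impl DeserializableValidator for VcsConfiguration {
    fn validate(&self, _name: &str, diagnostics: &mut Vec<String>) -> bool {
        if self.enabled && self.client_kind.is_none() {
            diagnostics.push(
                "You enabled the VCS integration, but you didn't specify a client.".to_string(),
            );
            return false;
        }
        true
    }
}

/// What a generated deserializer could do as its final step:
/// keep the instance only if validation passes.
fn finish<T: DeserializableValidator>(
    value: T,
    name: &str,
    diagnostics: &mut Vec<String>,
) -> Option<T> {
    if value.validate(name, diagnostics) {
        Some(value)
    } else {
        None
    }
}
```

With a fixed trait, the method signature is discoverable and an IDE can generate the impl skeleton, which is the benefit argued for above.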

arendjr · 2024-01-16T18:48:41Z

indent_size should also be internally deprecated because it is a public interface for crate users...

I can add that in this PR, I think, assuming CI won't complain :)

arendjr · 2024-01-16T20:10:30Z

Hmm, bummer. I added the NonZeroString type, but I cannot add it to the biome_deserialize crate, because then it cannot derive JsonSchema. It doesn't seem very reusable compared to a #[deserializable(validator = "non_empty")] annotation.

Conaclos · 2024-01-16T20:14:17Z

Maybe if we use a generic for the options in RuleWithOptions that would solve the issue? I don't know what the implications of that for our JSON schema generation are though, so I'll leave that out of scope for this PR.

That's not so simple because not every rule has options.
Indeed, this is out of scope of this PR.

I was thinking of introducing a DeserializableValidator trait so that it is always clear what the signature of the method to implement is, and the IDE will be able to autocomplete it.

How could the derive macro know whether the type implements the DeserializableValidator trait?

arendjr · 2024-01-16T20:27:57Z

How could the derive macro know whether the type implements the DeserializableValidator trait?

I was thinking to just add an attribute for that.

arendjr · 2024-01-17T19:32:16Z

I'll revert the indent_size deprecation for now, since the linter fails on it.

This reverts commit b9fcb29.

arendjr · 2024-01-17T20:48:29Z

Alright, when all is green it's time to merge :)

arendjr added 6 commits January 13, 2024 18:32

Implement Deserializable derive

c17fa5d

Merge branch 'main' into deserializable_derive

4c4cf53

Merge with NoneState updates

dda0522

Migrate more Deserializable impls to the macro

baa2023

Migrate yet more Deserializable impls to the macro

dc823ec

This should cover them for now

f1354b9

github-actions bot added the A-Project, A-Linter, A-Formatter, A-Tooling, and L-JavaScript labels Jan 14, 2024

arendjr requested review from Conaclos, ematipico and faultyserver January 14, 2024 20:10

arendjr commented Jan 14, 2024

just ready

b531f65

Conaclos reviewed Jan 15, 2024

ematipico approved these changes Jan 16, 2024

Conaclos approved these changes Jan 16, 2024

More generic validation

551f351

github-actions bot added the A-CLI label Jan 17, 2024

arendjr added 2 commits January 17, 2024 20:20

Merge branch 'main' into deserializable_derive

c381ea5

Mark indent_size as deprecated in Rust as well

b9fcb29

arendjr added 4 commits January 17, 2024 20:32

Format

2af2e0e

Revert "Mark indent_size as deprecated in Rust as well"

87c7ad5

This reverts commit b9fcb29.

Let generated rules structs rely on the Deserializable macro

427989f

Update docs

4c8ec54

arendjr merged commit 09db855 into biomejs:main Jan 17, 2024
17 checks passed

arendjr deleted the deserializable_derive branch January 17, 2024 21:16

ematipico pushed a commit to DaniGuardiola/biome that referenced this pull request Jan 24, 2024

feat(project): Deserializable derive macro (biomejs#1564)

a15397c

Conaclos added a commit that referenced this pull request Mar 12, 2024

doc: add docs removed in #1564

02cab38

Conaclos mentioned this pull request Mar 12, 2024

doc: add docs removed in #1564 #2067

Merged

Conaclos added a commit that referenced this pull request Mar 12, 2024

doc: add docs removed in #1564 (#2067)

e2bcaa4
