Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix schema validation and add custom validators #3220

Merged
merged 3 commits into from
Jan 17, 2023

Conversation

adrien-berchet
Copy link
Contributor

Description

Fixes #3217 that breaks when jsonschema is not installed (sorry about that 🤦 ).
Also add the possibility to provide a custom validator as schema instead of just relying on jsonschema.validate.

@adrien-berchet
Copy link
Contributor Author

Sorry @dlstadther , I went too fast in #3217 . This time it should be fine.

Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching and fixing your error.

I do have a clarifying question before approving

unfrozen_value = recursively_unfreeze(frozen_value)
try:
self.schema.validate(unfrozen_value) # Validators may update the instance inplace
frozen_value = super().normalize(unfrozen_value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the try/except here to handle for when self.schema is a custom validator, but could you explain why you re-freeze the unfrozen value again? Why/how would the validator modify the value of the instance such that we cannot return the frozen_value from the beginning of this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to define custom validators to handle specific schemas or do extra stuffs: https://python-jsonschema.readthedocs.io/en/stable/creating/
For example, it is possible to fill missing values with default values: https://python-jsonschema.readthedocs.io/en/stable/faq/#why-doesn-t-my-schema-s-default-property-set-the-default-on-my-instance
In this example the given instance is updated with default values during the validation, so the new instance must be frozen again and returned.
Nevertheless, we can indeed simplify this a bit: unfreeze the given value when a schema is provided and just freeze the result, instead of freezing, unfreezing and freezing again. I pushed a new commit for this.

unfrozen_value = recursively_unfreeze(value)
try:
self.schema.validate(unfrozen_value)
value = unfrozen_value # Validators may update the instance inplace
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defined the unfrozen_value variable for clarity but we could just overwrite the value variable at line 1141. What do you prefer?

@honnix
Copy link
Member

honnix commented Jan 17, 2023

Let me try to understand the situation a bit better. Does it mean the main branch of Luigi is broken? We plan to drop a new release and if that is the case, we would suggest reverting the breaking PR so we have a healthy main branch, and you will also be able to work on this without a rush @adrien-berchet .

@adrien-berchet
Copy link
Contributor Author

Hi @honnix
Well, the main branch is indeed broken but it will not break any existing code, it will only break if someone tries to use the new feature without jsonschema installed.
With this new PR everything looks good in my tests.

@adrien-berchet
Copy link
Contributor Author

adrien-berchet commented Jan 17, 2023

Note that the CI passed, only the coverage reported a failure but I am pretty sure it is not due to this PR (2 lines in luigi.contrib.hive were not covered).

EDIT: The issue is here: https://github.com/spotify/luigi/actions/runs/3939604194/jobs/6739654892
The upload to Codecov failed so the lines are not reported as covered but they actually were.

Copy link
Collaborator

@dlstadther dlstadther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand now! Thank you!

@adrien-berchet
Copy link
Contributor Author

You're welcome 😊

@dlstadther dlstadther merged commit ddd94d0 into spotify:master Jan 17, 2023
@adrien-berchet adrien-berchet deleted the improve_schema branch January 17, 2023 22:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants