Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit the way we handle default / Option::None in configuration #3472

Closed
fulmicoton opened this issue Jun 2, 2023 · 0 comments · Fixed by #3691
Closed

Revisit the way we handle default / Option::None in configuration #3472

fulmicoton opened this issue Jun 2, 2023 · 0 comments · Fixed by #3691
Assignees
Labels
bug Something isn't working

Comments

@fulmicoton
Copy link
Contributor

fulmicoton commented Jun 2, 2023

Our configuration has default values.
In some complex case, there is some value to be able to keep the information
"the value is missing vs it was the default".

Let's pick an example.
To make sure a text is indexed the user just needs to mark the field as indexed. It happens to be the default.
They can also set a tokenizer, or record option.

We currently store those in an Option. The Option gives us the possibility to do an extra validation phase to let the user know that their configuration does not make sense.

For instance:

indexed: false
tokenizer: raw //< the user did set a tokenizer, when in fact the field is not indexed. This is probably an error on their side.

When we build the tantivy TextOptions or whatever, we simply use a default value upon access.

This zealous validation leads us to a more serious problem.

We actually never replace this None by the default value. What we do, is that upon access, when we build the tantivy TextOptions, we use the default value in place of None.

As a result, when we serialize the configuration to store in the metastore, it will still contain None.
This makes backward compatibility a bit of a nightmare. Now getting the behavior correctly will force us to add
adhoc conversion code when we want to change our default values.

Instead we want to interpret those defaults, and make sure that after reserialization in the Metastore those defaults value are there.

Worst even... This change in semantics cannot be caught by our regression tests as they rely on deserialization / serialization.

There are several ways to get there, involving more or less code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants