Our configuration has default values.
In some complex cases, it is valuable to keep track of whether
a value was missing or was explicitly set to the default.
Let's pick an example.
To make sure a text field is indexed, the user just needs to mark the field as indexed, which happens to be the default.
They can also set a tokenizer or a record option.
We currently store those in an `Option`. The `Option` lets us run an extra validation phase to tell users when their configuration does not make sense.
For instance:
indexed: false
tokenizer: raw //< the user did set a tokenizer, when in fact the field is not indexed. This is probably an error on their side.
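A minimal sketch of that validation phase, to make the role of the `Option` concrete. `TextFieldConfig` and `validate` are hypothetical names used for illustration only, not the actual types in the codebase; the only facts taken from the description above are that `indexed` defaults to true and that the tokenizer and record settings are optional.

```rust
/// Hypothetical, simplified stand-in for the user-facing text field config.
/// `None` means "the user did not set this value".
#[derive(Debug, Clone, PartialEq, Default)]
pub struct TextFieldConfig {
    pub indexed: Option<bool>,
    pub tokenizer: Option<String>,
    pub record: Option<String>,
}

/// Rejects configurations that set indexing-only options on a field
/// that is explicitly not indexed.
pub fn validate(config: &TextFieldConfig) -> Result<(), String> {
    // A missing `indexed` falls back to the default, which is `true`.
    let indexed = config.indexed.unwrap_or(true);
    if !indexed {
        if config.tokenizer.is_some() {
            return Err("`tokenizer` is set but the field is not indexed".to_string());
        }
        if config.record.is_some() {
            return Err("`record` is set but the field is not indexed".to_string());
        }
    }
    Ok(())
}
```

This check is only possible because `None` is distinguishable from an explicitly set value: if the tokenizer were stored as a plain `String` pre-filled with the default, the validation could not tell the two cases apart.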
When we build the tantivy TextOptions or whatever, we simply use a default value upon access.
This zealous validation leads us to a more serious problem.
We never actually replace this `None` with the default value. Instead, upon access, when we build the tantivy `TextOptions`, we use the default value in place of `None`.
As a result, when we serialize the configuration to store it in the metastore, it still contains `None`.
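The problem can be reproduced with a toy model. In this sketch, `StoredTextConfig`, `effective_tokenizer`, and the string-based `serialize` are all hypothetical stand-ins (the real code uses its own types and serialization format), and the tokenizer names passed as defaults are just example values.

```rust
/// Hypothetical stand-in for a config as it is stored in the metastore.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredTextConfig {
    pub tokenizer: Option<String>,
}

impl StoredTextConfig {
    /// The default is applied only on access and is never written back.
    pub fn effective_tokenizer<'a>(&'a self, current_default: &'a str) -> &'a str {
        self.tokenizer.as_deref().unwrap_or(current_default)
    }

    /// Toy serializer standing in for the real metastore format.
    pub fn serialize(&self) -> String {
        match &self.tokenizer {
            Some(name) => format!("tokenizer: {}", name),
            None => "tokenizer: null".to_string(),
        }
    }
}
```

With `tokenizer: None`, the serialized form stays `tokenizer: null` no matter which default was in effect when the index was created, so `effective_tokenizer("default")` and `effective_tokenizer("raw")` give different answers for the same stored document.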
This makes backward compatibility a bit of a nightmare. Getting the behavior right will now force us to add
ad hoc conversion code whenever we want to change our default values.
Instead, we want to interpret those defaults and make sure that, after reserialization in the metastore, those default values are present.
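One possible shape for this, among several, is to keep the `Option`-based type only at the user-facing boundary and convert it into a resolved type before anything is stored. The sketch below is an assumption about how that could look, not a description of the chosen design: `RawTextConfig`, `ResolvedTextConfig`, `resolve`, and the string-based `serialize` are hypothetical names.

```rust
/// Hypothetical user-facing config: `None` means "not set by the user".
#[derive(Debug, Clone, PartialEq, Default)]
pub struct RawTextConfig {
    pub indexed: Option<bool>,
    pub tokenizer: Option<String>,
}

/// Hypothetical stored config: every default has already been interpreted.
#[derive(Debug, Clone, PartialEq)]
pub enum ResolvedTextConfig {
    NotIndexed,
    Indexed { tokenizer: String },
}

/// Validates the raw config, then fills in the defaults in effect right now.
pub fn resolve(
    raw: RawTextConfig,
    default_tokenizer: &str,
) -> Result<ResolvedTextConfig, String> {
    let indexed = raw.indexed.unwrap_or(true);
    match (indexed, raw.tokenizer) {
        (false, Some(_)) => {
            Err("`tokenizer` is set but the field is not indexed".to_string())
        }
        (false, None) => Ok(ResolvedTextConfig::NotIndexed),
        (true, tokenizer) => Ok(ResolvedTextConfig::Indexed {
            tokenizer: tokenizer.unwrap_or_else(|| default_tokenizer.to_string()),
        }),
    }
}

impl ResolvedTextConfig {
    /// Toy serializer standing in for the real metastore format.
    pub fn serialize(&self) -> String {
        match self {
            ResolvedTextConfig::NotIndexed => "indexed: false".to_string(),
            ResolvedTextConfig::Indexed { tokenizer } => {
                format!("indexed: true\ntokenizer: {}", tokenizer)
            }
        }
    }
}
```

Because the stored form always carries a concrete tokenizer, a later change of the default affects only configurations resolved after the change. The validation still has access to the missing-versus-default distinction, since it runs on the raw type before the defaults are filled in.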
Worse still, this change in semantics cannot be caught by our regression tests, as they rely on deserialization / serialization.
There are several ways to get there, involving more or less code.