Do not remove None values in RepoCardData serialization #2626
More details in huggingface/datasets#7243 (comment) and especially huggingface/datasets#7243 (comment).
When serializing repo card metadata, we currently remove None values from every dictionary, set, list, or tuple, at any nesting level. This has been the behavior since the module was introduced in #940, and it never caused problems until huggingface/datasets#7243 was reported. I find automatically removing None values nested inside other values pretty clunky, since those values were set by the user and are not defaults.
This PR changes this behavior to remove only top-level None values, i.e. values that are None by default and therefore shouldn't be serialized (since the user did not set them). The `_remove_none` helper is still kept to remove None values from the model-index (where it's rightfully necessary), but that's all.
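To illustrate the difference, here is a minimal sketch of the two strategies. It is not the library's actual code: `remove_none_recursive`, `remove_none_top_level`, and the sample `card` fields are hypothetical names made up for this example, and the recursive function only approximates what the description above says the old behavior did.

```python
from typing import Any


def remove_none_recursive(obj: Any) -> Any:
    """Sketch of the old behavior: drop None from every nested container."""
    if isinstance(obj, (list, tuple, set)):
        return type(obj)(remove_none_recursive(x) for x in obj if x is not None)
    if isinstance(obj, dict):
        return {k: remove_none_recursive(v) for k, v in obj.items() if v is not None}
    return obj


def remove_none_top_level(data: dict) -> dict:
    """Sketch of the new behavior: drop only top-level keys whose value is None."""
    return {k: v for k, v in data.items() if v is not None}


# Hypothetical card metadata, for illustration only.
card = {
    "license": None,  # unset default: should not be serialized
    "language": ["en"],
    "custom": {"values": [1, None, 3], "note": None},  # explicitly set by the user
}

old_result = remove_none_recursive(card)
new_result = remove_none_top_level(card)

print(old_result)  # nested None values set by the user are lost
print(new_result)  # nested None values set by the user are preserved
```

With the top-level strategy, `license` is still dropped because it was never set, while the None entries inside `custom` survive serialization as the user wrote them.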