Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCHEMA] Apply schema rules to entity values #792

Merged
merged 4 commits into from
Jul 20, 2021

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented May 3, 2021

Closes #809.

The goal of this PR is to move toward applying the rules that govern metadata values (in #774) to entity values as well. Those rules support a wide range of possible values and restrictions, but the most important one right now is type: string combined with enum, which lets us list specific acceptable values for the part and mt entities.

Changes proposed:

  • Add type: string to all entity definitions in the schema.
  • Replace format: label with enum: <list> for the part and mt entities.
  • Add an "Allowed values" subsection to entity sections in the Entities appendix for those entities with restricted values.
  • Default to format = label when format is not defined in entity schema definitions (just to work around the new enums).

Pending questions:

  1. Should I use type: integer for any of the index-format entities? I kept type: string because of the leading zeros issue, although I'm not sure if that applies to all entities or just run.
    • Answered. They should be strings but we'll ultimately have an index-specific format to validate against.

tsalo added 2 commits May 3, 2021 14:09
By using the same tools, we can also use things like `enum` to limit values.
The current solution is a bit of a hack but it's easy to adjust. Basically, when `format` is missing from the schema, default to "label".
@tsalo
Copy link
Member Author

tsalo commented May 3, 2021

I'm thinking it might also be a good idea to create a new formats.yaml that defines the different formats used in the schema. Something like:

label:
    description: |
        Freeform labels without special characters.
    pattern: ^[0-9a-zA-Z]+$
index:
    description: |
        Non-negative, non-zero integers, optionally prefixed with leading zeros for sortability.
    pattern: ^[0-9]+$
unit:
    description: |
        A unit.
    pattern: ^[0-9a-zA-Z]+$

Unfortunately, I don't actually know how most of those patterns would look. Units, for example, can be freeform text if necessary, but optimally they should be valid combinations of the values in the Units appendix. Plus we have relative paths of multiple flavors, along with URIs. I think uri, at least, is offloaded to JSON Schema in the validator.

@tsalo tsalo added the schema Issues related to the YAML schema representation of the specification. Patch version release. label May 4, 2021
@sappelhoff
Copy link
Member

Should I use type: integer for any of the index-format entities? I kept type: string because of the leading zeros issue, although I'm not sure if that applies to all entities or just run.

I think the leading zeros is recommended for each index-format entity ... to make sorting more reasonable: That's nothing run-specific. Perhaps we can say it must be a string of digits? So that a conversion to "int" of the string does not result in a ValueError, as in this Python example:

int("0001")  # 1
float("0001")  # 1.0
float("0001a")  # ValueError, could not convert string to float: '0001a'

@tsalo
Copy link
Member Author

tsalo commented Jul 1, 2021

Sounds good. I'll leave it as string then.

EDIT: With the potential for an index-specific format, of course.

Copy link
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

tools/schemacode/schema.py Outdated Show resolved Hide resolved
Co-authored-by: Chris Markiewicz <[email protected]>
@tsalo tsalo requested a review from yarikoptic July 15, 2021 15:12
Copy link
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsalo
Copy link
Member Author

tsalo commented Jul 15, 2021

Thanks! The second approval triggers the five-day waiting period and I'll merge next Thursday if no one raises any concerns.

Regarding formats.yaml... I think I'll open a new issue about it rather than try to shoehorn it in here.

@sappelhoff
Copy link
Member

I think you can merge starting from today, as it's been 5 days :-)

image

@tsalo tsalo merged commit 96131df into bids-standard:master Jul 20, 2021
@tsalo tsalo deleted the entity-values branch July 20, 2021 15:02
@tsalo tsalo added schema-structure Changes to the fundamental organization/structure of the YAML schema. Minor version release. schema-code Updates or changes to the code used to parse, filter, and render the schema. labels Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
schema Issues related to the YAML schema representation of the specification. Patch version release. schema-code Updates or changes to the code used to parse, filter, and render the schema. schema-structure Changes to the fundamental organization/structure of the YAML schema. Minor version release.
Projects
Development

Successfully merging this pull request may close these issues.

Specify allowed values for entities that requires specific values in bids schema
3 participants