Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Rewrite inheritance principle #946

Merged
Merged
Changes from 17 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
c230301
[FIX] Rewrite inheritance principle
Lestropie Dec 5, 2021
2202697
[FIX] Fix trailing space src/02_common_principles.md #946
Lestropie Dec 6, 2021
0abc930
Inheritance principle: Use bullet points
Lestropie Dec 12, 2021
1bcf1b5
Inheritance principle: Multiple changes
Lestropie Dec 12, 2021
7f52984
Inheritance principle: Move to follow "file formation specification"
Lestropie Dec 12, 2021
399a907
Inheritance principle: Formatting fixes
Lestropie Dec 13, 2021
8091e3c
Inheritance principle: Tweak to corollary
Lestropie Dec 13, 2021
1839b58
Common principles: Harmonize "file name" -> "filename"
Lestropie Dec 19, 2021
01aa790
02-common-principles.md: Formatting fixes for #946
Lestropie Dec 20, 2021
7ba50a6
Inheritance principle: Fix JSON formatting
Lestropie Dec 20, 2021
7bc7ad5
Inheritance principle: Minor changes and formatting
Lestropie Jan 5, 2022
fc604ef
Inheritance principle: No unsetting as corollary
Lestropie Jan 11, 2022
2590d0a
Inheritance principle: Rephrase corollary RE: unsetting
Lestropie Jan 12, 2022
1141cf7
Merge branch 'rewrite_inheritance_principle' into inheritance_unsetti…
Lestropie Jan 12, 2022
5475322
Merge pull request #3 from Lestropie/inheritance_unsetting_corollary
Lestropie Jan 18, 2022
97d319b
Merge branch 'master' into rewrite_inheritance_principle
sappelhoff Jan 24, 2022
33cc25a
Inheritance principle: Revise EchoTime and RepetitionTime exemplar va…
Lestropie Jan 31, 2022
7af3e05
Inheritance principle: Use American spelling
Lestropie Feb 1, 2022
525ab87
Inheritance principle: Fix enumeration cross-reference
Lestropie Feb 1, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
250 changes: 146 additions & 104 deletions src/02-common-principles.md
Original file line number Diff line number Diff line change
Expand Up @@ -399,110 +399,6 @@ In particular, if a BIDS dataset contains a `derivatives/` sub-directory,
the contents of that directory may be a heterogeneous mix of BIDS Derivatives
datasets and non-compliant derivatives.

## The Inheritance Principle

Any metadata file (such as `.json`, `.bvec` or `.tsv`) may be defined at any
directory level, but no more than one applicable file may be defined at a given
level (Example 1). The values from the top level are inherited by all lower
levels unless they are overridden by a file at the lower level. For example,
`sub-*_task-rest_bold.json` may be specified at the participant level, setting
TR to a specific value. If one of the runs has a different TR than the one
specified in that file, another `sub-*_task-rest_bold.json` file can be placed
within that specific series directory specifying the TR for that specific run.
There is no notion of "unsetting" a key/value pair.
Once a key/value pair is set in a given level in the dataset, lower down in
the hierarchy that key/value pair will always have some assigned value.
Files for a particular participant can exist only at participant level directory,
that is, `/dataset/sub-*[/ses-*]/sub-*_T1w.json`. Similarly, any file that is not
specific to a participant is to be declared only at top level of dataset for example:
`task-sist_bold.json` must be placed under `/dataset/task-sist_bold.json`

Example 1: Two JSON files that are erroneously at the same level

{{ MACROS___make_filetree_example(
{
"sub-01": {
"ses-test":{
"sub-01_ses-test_task-overtverbgeneration_bold.json": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.json": "",
"anat": {
"sub-01_ses-test_T1w.nii.gz": "",
},
"func": {
"sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz": "",
}
}
}
}
) }}

In the above example, two JSON files are listed under `sub-01/ses-test/`, which
are each applicable to
`sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz`, violating the
constraint that no more than one file may be defined at a given level of the
directory structure. Instead `sub-01_ses-test_task-overtverbgeneration_run-2_bold.json`
should have been under `sub-01/ses-test/func/`.

Example 2: Multiple `run`s and `rec`s with same acquisition (`acq`) parameters

{{ MACROS___make_filetree_example(
{
"sub-01": {
"anat": {},
"func": {
"sub-01_task-xyz_acq-test1_run-1_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_run-2_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_rec-recon1_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_rec-recon2_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_bold.json": "",
}
}
}
) }}

For the above example, all NIfTI files are acquired with same scanning
parameters (`acq-test1`). Hence a JSON file describing the acq parameters will
apply to different runs and rec files. Also if the JSON file
(`task-xyz_acq-test1_bold.json`) is defined at dataset top level directory, it
will be applicable to all task runs with `test1` acquisition parameter.

Example 3: Multiple JSON files at different levels for same task and acquisition parameters

{{ MACROS___make_filetree_example(
{
"task-xyz_acq-test1_bold.json": "",
"sub-01": {
"anat": {},
"func": {
"sub-01_task-xyz_acq-test1_run-1_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_rec-recon1_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_rec-recon2_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_bold.json": "",
}
}
}
) }}

In the above example, the fields from the `task-xyz_acq-test1_bold.json` file
at the top directory will apply to all bold runs. However, if there is a key
with different value in the
`sub-01/func/sub-01_task-xyz_acq-test1_bold.json` file defined at a
deeper level, that value will be applicable for that particular run/task NIfTI
file/s. In other words, the `.json` file at the deeper level overrides values
that are potentially also defined in the `.json` at a more shallow level. If the
`.json` file at the more shallow level contains key-value-pairs that are not
present in the `.json` file at the deeper level, these key-value-pairs are
inherited by the `.json` file at the deeper level (but NOT vice versa!).

### Good practice recommendations

**Try to avoid excessive amount of overrides.** Do not specify a field
value in the upper levels if lower levels have more or less even
distribution of multiple possible values. For example, if a field `X` has one
value for all `ses-01/` and another for all `ses-02/` it better not to be
defined at all in the `.json` at the upper level.

## File Formation specification

### Imaging files
Expand Down Expand Up @@ -645,6 +541,152 @@ for more information.
}
```

## The Inheritance Principle

1. Any metadata file (such as `.json`, `.bvec` or `.tsv`) MAY be defined at any directory level.

1. For a given data file, any metadata file is applicable to that data file if:
yarikoptic marked this conversation as resolved.
Show resolved Hide resolved
1. It is stored at the same directory level or higher;
1. The metadata and the data filenames possess the same suffix;
1. The metadata filename does not include any entity absent from the data filename.

1. A metadata file MUST NOT have a filename that would be otherwise applicable
to some data file based on rules 2.b and 2.c but is made inapplicable based on its
location in the directory structure as per rule 2.a.

1. There MUST NOT be multiple metadata files applicable to a data file at one level
of the directory hierarchy.
Comment on lines +557 to +558
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: it is such a nice concise rule which helps to avoid ambiguity and the workaround in Example 3 is quite cute ;)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH -- I remembered what bothered me in my original #102 and outlined in Edit 1 there. Citing an example from there:

I placed myself into a corner with an example of having e.g.

sub-1_task-task1_run-1_bold.json and
sub-1_task-task1_acq-X_run-1_bold.json

per subject (should be ok), and then trying to aggregate over them while retaining also _acq- if defined.

and there are many other entities/use-cases which would fall into similar situation. What I am thinking is to add a clarification as subitem here or a separate rule:

  • A metadata file at the data files level of hierarchy MUST not be considered for inheritance if there is a matching in entities data file.

This would make sub-1_task-task1_run-1_bold.json not a to be considered for providing metadata to enrich sub-1_task-task1_acq-X_run-1_bold.json, and examples below would stay valid (we might want to add an example for this rule though)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is such a nice concise rule which helps to avoid ambiguity and the workaround in Example 3 is quite cute ;)

Well, actually (from first post):

Full disclosure: as per #259, I would like to pursue the prospect of modifying the inheritance principle itself, not just the description of such; specifically removal of the preclusion of having multiple applicable JSONs at one level of the hierarchy.

😬

The proposed changes here do at least I think demonstrate what would need to change in order to facilitate such. Any text I propose for such will need to be very clear for both users and software creators exactly what is permitted vs. not permitted and in what order JSONs should be loaded. But as I said in the original post, this is not a change in descriptive language but a change in prescriptive permissible structure, and so IMO necessitates either a minor or major version change rather than just a patch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Given the currently loose formulation of the principle, depending on the changes to it, I think it might be feasible to get it done within minor revision. So it would be great to see this PR be finalized/merged soon.


1. If multiple metadata files satisfy criteria 2.a-c above:

1. For [tabular files](#tabular-files) and other simple metadata files
(for instance, [`bvec` / `bval` files for diffusion MRI](#04-modality-specific-files/01-magnetic-resonance-imaging#required-gradient-orientation-information)),
accessing metadata associated with a data file MUST consider only the
applicable file that is lowest in the filesystem hierarchy.

1. For [JSON files](#key-value-files-dictionaries), key-values are loaded
from files from the top of the directory hierarchy downwards, such that
key-values from the top level are inherited by all data files at lower
levels to which it is applicable unless overridden by a value for the
same key present in another metadata file at a lower level
(though it is RECOMMENDED to minimise the extent of such overrides).
Lestropie marked this conversation as resolved.
Show resolved Hide resolved

Corollaries:

1. As per rule 3, metadata files applicable only to a specific participant / session
MUST be defined in or below the directory corresponding to that participant / session;
similarly, a metadata file that is applicable to multiple participants / sessions
MUST NOT be placed within a directory corresponding to only one such participant / session.

1. It is permissible for a single metadata file to be applicable to multiple data
files at that level of the hierarchy or below. Where such metadata content is consistent
across multiple data files, it is RECOMMENDED to store metadata in this
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RECOMMENDED

This is a best practice written in the language of a validatable rule. Some prefer this approach, others prefer to have their sidecars per-data-file. I think we should not make this recommendation, as validation will be tricky and annoying to users who would prefer the alternative.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with taking out the recommended here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a best practice written in the language of a validatable rule.

  1. This seems to extend beyond the RFC2119 definition. Is this an established BIDS-specific interpretation that is documented somewhere?

  2. Would the same criticism not also apply to 5.2?

    For JSON files, key-values are loaded from files from the top of the directory hierarchy downwards, such that key-values from the top level are inherited by all data files at lower levels to which it is applicable unless overridden by a value for the same key present in another metadata file at a lower level (though it is RECOMMENDED to minimise the extent of such overrides).

    Issuing a validator warning regarding the presence of individual key-value overrides may be unexpected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an established BIDS-specific interpretation that is documented somewhere?

AFAIK this is not documented anywhere, but among maintainers we have often discussed that RECOMMENDED corresponds to a "warning" level in the validator, and MUST corresponds to an "error" level in the validator. The main point of this consideration is to not design a specification that we cannot reasonably validate.

Having that said, there are many cases in the spec where we RECOMMEND or require (MUST), where validation is currently not happening and might be difficult to implement --- as you correctly point out in your second point.

Personally I am fine with having some recommendations that we cannot "warn" about (see especially 5.b here, but maybe also the point raised by Chris.).

way, rather than duplicating that metadata content across multiple metadata files.

1. Where multiple applicable JSON files are loaded as per rule 5.2, key-values can
Lestropie marked this conversation as resolved.
Show resolved Hide resolved
only be overwritten by files lower in the filesystem hierarchy; the absence of
a key-value in a later file does not imply the "unsetting" of that field
(indeed removal of existing fields is not possible).

Example 1: Demonstration of inheritance principle

{{ MACROS___make_filetree_example(
{
"sub-01": {
"func": {
"sub-01_task-rest_acq-default_bold.nii.gz": "",
"sub-01_task-rest_acq-longtr_bold.nii.gz": "",
"sub-01_task-rest_acq-longtr_bold.json": "",
}
},
"task-rest_bold.json": "",
}
) }}

Contents of file `task-rest_bold.json`:

```JSON
{
"EchoTime": 0.040,
"RepetitionTime": 1.0
}
```

Contents of file `sub-01/func/sub-01_task-rest_acq-longtr_bold.json`:

```JSON
{
"RepetitionTime": 3.0
}
```

When reading image `sub-01/func/sub-01_task-rest_acq-default_bold.nii.gz`, only
metadata file `task-rest_bold.json` is read; file
`sub-01/func/sub-01_task-rest_acq-longtr_bold.json` is inapplicable as it contains
entity "`acq-longtr`" that is absent from the image path (rule 2.c). When reading image
`sub-01/func/sub-01_task-rest_acq-longtr_bold.nii.gz`, metadata file
`task-rest_bold.json` at the top level is read first, followed by file
`sub-01/func/sub-01_task-rest_acq-longtr_bold.json` at the bottom level (rule 5.b);
the value for field "`RepetitionTime`" is therefore overridden to the value `3.0`.
The value for field "`EchoTime`" remains applicable to that image, and is not unset by its
absence in the metadata file at the lower level (rule 5.b; corollary 3).

Example 2: Impermissible use of multiple metadata files at one directory level (rule 4)

{{ MACROS___make_filetree_example(
{
"sub-01": {
"ses-test":{
"anat": {
"sub-01_ses-test_T1w.nii.gz": "",
},
"func": {
"sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz": "",
"sub-01_ses-test_task-overtverbgeneration_bold.json": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.json": "",
}
}
}
}
) }}

Example 3: Modification of filesystem structure from Example 2 to satisfy inheritance
principle requirements

{{ MACROS___make_filetree_example(
{
"sub-01": {
"ses-test":{
"sub-01_ses-test_task-overtverbgeneration_bold.json": "",
"anat": {
"sub-01_ses-test_T1w.nii.gz": "",
},
"func": {
"sub-01_ses-test_task-overtverbgeneration_run-1_bold.nii.gz": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.nii.gz": "",
"sub-01_ses-test_task-overtverbgeneration_run-2_bold.json": "",
}
}
}
}
) }}

Example 4: Single metadata file applying to multiple data files (corollary 2)

{{ MACROS___make_filetree_example(
{
"sub-01": {
"anat": {},
"func": {
"sub-01_task-xyz_acq-test1_run-1_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_run-2_bold.nii.gz": "",
"sub-01_task-xyz_acq-test1_bold.json": "",
}
}
}
) }}

## Participant names and other labels

BIDS allows for custom user-defined `<label>`s and `<index>`es for example,
Expand Down