[ENH] Clarify meaning of raw vs derivative datasets #1537

Merged: 9 commits, Jul 24, 2023
23 changes: 22 additions & 1 deletion src/derivatives/introduction.md
@@ -4,9 +4,30 @@ Derivatives are outputs of common processing pipelines, capturing data and
meta-data sufficient for a researcher to understand and (critically) reuse those
outputs in subsequent processing.
Standardizing derivatives is motivated by use cases where formalized
machine-readable access to processed data enables higher level processing.
machine-readable access to processed data enables higher-level processing.

The following sections cover additions to and divergences from "raw" BIDS.
Raw data are data that have been curated into BIDS from a non-BIDS source.
If a dataset is derived from at least one other valid BIDS dataset, then it is a derivative dataset.

Examples:

A defaced T1w image would typically be made during the curation process and is thus considered raw data:

```Text
sourcedata/private/sub-01/anat/sub-01_T1w.nii.gz
sub-01/anat/sub-01_T1w.nii.gz
```

A defaced T1w image could also, in theory, be derived from a BIDS dataset and would thus be placed under derivatives:

```Text
sub-01/anat/sub-01_T1w.nii.gz
derivatives/sub-01/anat/sub-01_desc-defaced_T1w.nii.gz
```
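
The raw-versus-derivative rule above can be sketched in a few lines of Python. This is only an illustration of the definition, not part of the specification or of any BIDS tool: the `Source` class, its `is_valid_bids` flag, and the `dataset_type` function are hypothetical names, and the sketch assumes the validity of each source is already known.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical description of one input used to build a dataset."""

    name: str
    is_valid_bids: bool  # assumed to be known already, e.g. from a validator run


def dataset_type(sources: list[Source]) -> str:
    """Return "derivative" if any source is a valid BIDS dataset, else "raw"."""
    return "derivative" if any(s.is_valid_bids for s in sources) else "raw"


# Defacing during curation: the only input is a non-BIDS source.
curated = dataset_type([Source("non-BIDS scanner output", is_valid_bids=False)])

# Defacing applied to an existing, valid BIDS dataset.
derived = dataset_type([Source("raw BIDS dataset", is_valid_bids=True)])
```

Under these assumptions, the first call corresponds to the curation example and the second to the `derivatives/` example; a dataset built from a mix of sources counts as a derivative as soon as one of them is a valid BIDS dataset.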

## Derivatives storage and folders structure

Placement and naming conventions for derived datasets are addressed in
[Storage of derived datasets][storage], and dataset-level metadata is included
in [Derived dataset and pipeline description][derived-dataset-description].