BEP 18 suggestions #398

Merged 8 commits on Feb 3, 2020
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -31,6 +31,7 @@ nav:
- Task events: 04-modality-specific-files/05-task-events.md
- Physiological and other continuous recordings: 04-modality-specific-files/06-physiological-and-other-continuous-recordings.md
- Behavioral experiments (with no MRI): 04-modality-specific-files/07-behavioral-experiments.md
- Genetic Descriptor: 04-modality-specific-files/08-genetic-descriptor.md
- Longitudinal and multi-site studies: 05-longitudinal-and-multi-site-studies.md
- Extending the BIDS specification: 06-extensions.md
- Appendix:
87 changes: 55 additions & 32 deletions src/04-modality-specific-files/08-genetic-descriptor.md
@@ -1,37 +1,44 @@
# Genetic Descriptor

Support for genetic descriptors was developed as a [BIDS Extension Proposal](https://github.com/bids-standard/bids-specification/blob/master/src/06-extensions.md#bids-extension-proposals).
The extension was primarily developed by Cyril Pernet and Clara Moreau with contributions from Tom Nichols and Jessica Turner.
Support for genetic descriptors was developed as a [BIDS Extension
Proposal](../06-extensions.md#bids-extension-proposals).
The extension was primarily developed by Cyril Pernet and Clara Moreau with
contributions from Tom Nichols and Jessica Turner.

The goal of the genetic descriptor is to link imaging and genetic data.
This is necessary as genetic data are typically stored in dedicated repositories, separately from the imaging data.
The descriptor provides basic information about:
- where to find genetic information associated with the imaging data
- what type of genetic information is available
Genetic data are typically stored in dedicated repositories,
separate from imaging data.
A genetic descriptor links a BIDS dataset to associated genetic data,
potentially in a separate repository,
with details of where to find the genetic data and the type of data available.

## dataset_description.json
## Dataset Description

In order to link a genetic database entry, the key `Genetics` MUST be present and
the value is an object with the following fields:
Genetic descriptors are encoded as an additional, OPTIONAL entry in the
[`dataset_description.json`](../03-modality-agnostic-files.md#dataset_descriptionjson)
file.

Datasets linked to a genetic database entry include the following REQUIRED or OPTIONAL
`dataset_description.json` keys (a dot in the key name denotes a key in a subdictionary):

| Field name | Definition |
|----------------------|--------------------------------------------------------------------------------|
| Genetics.Dataset | REQUIRED. URI where data can be retrieved. |
| Genetics.Database | OPTIONAL. URI of database where the dataset is hosted. |
| Genetics.Descriptors | OPTIONAL. List of relevant descriptors (*e.g.*, journal articles) for dataset. |

`dataset_description.json` example:
Example:

```JSON
{
"Name": "Human Connectom Project",
"BIDSVersion": "1.2.0",
"Name": "Human Connectome Project",
"BIDSVersion": "1.3.0",
"License": "CC0",
"Authors": ["1st author", "2nd author"],
"Funding": "list your funding sources",
"Funding": ["P41 EB015894/EB/NIBIB NIH HHS/United States"],
"Genetics": {
"Dataset": "dataset",
"Database": "database",
"Descriptors": "descriptors"
"Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
"Database": "https://www.ncbi.nlm.nih.gov/gap/",
"Descriptors": ["https://doi.org/10.1016/j.neuroimage.2013.05.041"]
}
}
```
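
Reviewer's illustration (not part of the changed file): the table above makes `Genetics.Dataset` REQUIRED whenever a `Genetics` entry exists, and the revised example replaces placeholder strings with real URIs and a list of descriptors. A minimal Python sketch of how a tool might check that follows. The function name `check_genetics` and the "has a URI scheme" heuristic are assumptions made for this example; they are not taken from the specification or from any existing validator.

```python
from urllib.parse import urlparse


def _looks_like_uri(value):
    """Heuristic only: a string with a scheme such as 'https' (assumption)."""
    return isinstance(value, str) and bool(urlparse(value).scheme)


def check_genetics(description):
    """Return a list of problems found in the OPTIONAL 'Genetics' entry.

    `description` is the parsed content of dataset_description.json.
    """
    genetics = description.get("Genetics")
    if genetics is None:
        return []  # the entry itself is OPTIONAL
    if not isinstance(genetics, dict):
        return ["Genetics must be an object"]

    problems = []
    # Genetics.Dataset is REQUIRED once the Genetics entry is present.
    if not _looks_like_uri(genetics.get("Dataset")):
        problems.append("Genetics.Dataset is REQUIRED and should be a URI")
    # Genetics.Database is OPTIONAL, but should be a URI when given.
    database = genetics.get("Database")
    if database is not None and not _looks_like_uri(database):
        problems.append("Genetics.Database should be a URI")
    # Genetics.Descriptors is OPTIONAL and described as a list.
    descriptors = genetics.get("Descriptors")
    if descriptors is not None and not isinstance(descriptors, list):
        problems.append("Genetics.Descriptors should be a list")
    return problems
```

Under these assumptions, the old placeholder values (`"dataset"`, `"database"`, `"descriptors"`) would be reported as problems, while the revised example would pass.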
@@ -49,33 +56,49 @@ in the `participants.tsv` file by adding optional columns.
`participants.tsv` example:

```Text
participant_id age sex group GeneticID IDH Mutation
sub-control01 34 M control 124587 yes
sub-control02 12 F control 548936 yes
sub-patient01 33 F patient 489634 no
participant_id age sex group genetic_id idh_mutation
sub-control01 34 M control 124587 yes
sub-control02 12 F control 548936 yes
sub-patient01 33 F patient 489634 no
```
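
Reviewer's illustration (not part of the changed file): with the column renamed to `genetic_id`, a consumer can map each participant to its identifier in the genetic database. The sketch below assumes tab-separated values, as BIDS TSV files are, and treats `n/a` as a missing value. The function name `genetic_ids` and the inline sample text are hypothetical.

```python
import csv
import io


def genetic_ids(tsv_text):
    """Map participant_id -> genetic_id from the text of a participants.tsv file.

    Rows with no genetic_id (column absent, empty, or 'n/a') are skipped.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mapping = {}
    for row in reader:
        value = row.get("genetic_id")
        if value not in (None, "", "n/a"):
            mapping[row["participant_id"]] = value
    return mapping


# Sample text mirroring the revised example, written with explicit tabs.
EXAMPLE_TSV = (
    "participant_id\tage\tsex\tgroup\tgenetic_id\tidh_mutation\n"
    "sub-control01\t34\tM\tcontrol\t124587\tyes\n"
    "sub-control02\t12\tF\tcontrol\t548936\tyes\n"
    "sub-patient01\t33\tF\tpatient\t489634\tno\n"
)
```

Identifiers are kept as strings rather than converted to integers, since the diff does not state that `genetic_id` values are always numeric.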

## genetic_info.json
## Genetic Information

Template:

This file is the descriptor of the genetic information available either in the participant tsv file and/or the genetic database described in the dataset_description.json. The `GeneticLevel` and `SampleOrigin` are the only two mandatory fields.
```Text
genetic_info.json
```

| Field name | Definition | Values |
| :----------------- | :------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
| AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr](https://www.ncbi.nlm.nih.gov/projects/gapsolr/facets.html) under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted from | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
| BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas](http://atlas.brain-map.org/atlas?atlas=265297125) possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
| CellType | OPTIONAL Describes the type of cell analyzed | Value should come from the [cell ontology](http://obofoundry.org/ontology/cl.html) |
The `genetic_info.json` file describes the genetic information available in the
`participants.tsv` file and/or the genetic database described in
`dataset_description.json`.
Datasets containing the `Genetics` field in `dataset_description.json` or the
`genetic_id` column in `participants.tsv` MUST include this file with the following
fields:

| Field name | Definition | Values |
| :----------------- | :-------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
| AnalyticalApproach | OPTIONAL Methodology or methodologies used to analyse the GeneticLevel | String or list of strings. Each Value must be taken from [gapsolr][] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
| BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas][allen] possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
| CellType | OPTIONAL Describes the type of cell analyzed | Value should come from the [cell ontology][ontology] |

`genetic_info.json` example:

```JSON
{
"GeneticLevel": "Genetic",
"AnalyticalApproach": "SNP Genotypes", "SampleOrigin": "brain",
"AnalyticalApproach": ["Whole Genome Sequencing", "SNP/CNV Genotypes"],
"SampleOrigin": "brain",
"TissueOrigin": "gray matter",
"CellType": "neuron",
"BrainLocation": "[-30 -15 10]"
}
```
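
Reviewer's illustration (not part of the changed file): the revised table keeps `GeneticLevel` and `SampleOrigin` mandatory and allows `AnalyticalApproach` to be a string or a list of strings. A minimal Python sketch of those checks follows. The allowed values are copied from the table; the function name `check_genetic_info` is hypothetical, and the sketch does not verify `AnalyticalApproach` values against gapsolr or `CellType` against the cell ontology.

```python
GENETIC_LEVELS = {
    "Genetic", "Genomic", "Epigenomic",
    "Transcriptomic", "Metabolomic", "Proteomic",
}
SAMPLE_ORIGINS = {
    "blood", "saliva", "brain", "csf", "breast milk",
    "bile", "amniotic fluid", "other biospecimen",
}
TISSUE_ORIGINS = {
    "gray matter", "white matter", "csf",
    "meninges", "macrovascular", "microvascular",
}


def _in_set(value, allowed):
    """True when value is a string listed in the allowed set."""
    return isinstance(value, str) and value in allowed


def check_genetic_info(info):
    """Return a list of problems in the parsed content of genetic_info.json."""
    problems = []
    # The two mandatory fields must be present with a listed value.
    if not _in_set(info.get("GeneticLevel"), GENETIC_LEVELS):
        problems.append("GeneticLevel is mandatory and must be a listed value")
    if not _in_set(info.get("SampleOrigin"), SAMPLE_ORIGINS):
        problems.append("SampleOrigin is mandatory and must be a listed value")
    # TissueOrigin is optional; check its value only when given.
    tissue = info.get("TissueOrigin")
    if tissue is not None and not _in_set(tissue, TISSUE_ORIGINS):
        problems.append("TissueOrigin must be a listed value")
    # AnalyticalApproach is optional: a string or a list of strings.
    approach = info.get("AnalyticalApproach")
    if approach is not None:
        values = [approach] if isinstance(approach, str) else approach
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            problems.append("AnalyticalApproach must be a string or list of strings")
    return problems
```

Both the old single-string form and the new list form of `AnalyticalApproach` pass this check, which matches the "String or list of strings" wording added in this change.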

[allen]: http://atlas.brain-map.org/atlas?atlas=265297125&plate=112360888&structure=4392&x=40348.15104166667&y=46928.75&zoom=-7&resolution=206.60&z=3
[ontology]: http://obofoundry.org/ontology/cl.html
[gapsolr]: https://www.ncbi.nlm.nih.gov/projects/gapsolr/facets.html