From 578d37c0bf5b1bfecb673a6751d1fd32ce821b08 Mon Sep 17 00:00:00 2001 From: "Christopher J. Markiewicz" Date: Wed, 15 Jan 2020 14:32:26 -0500 Subject: [PATCH 1/8] ENH: Reference links to make table smaller, less fragile --- .../08-genetic-descriptor.md | 20 +++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md index a04616e5fc..aaa463e5a6 100644 --- a/src/04-modality-specific-files/08-genetic-descriptor.md +++ b/src/04-modality-specific-files/08-genetic-descriptor.md @@ -59,14 +59,14 @@ sub-patient01 33 F patient 489634 no This file is the descriptor of the genetic information available either in the participant tsv file and/or the genetic database described in the dataset_description.json. The `GeneticLevel` and `SampleOrigin` are the only two mandatory fields. -| Field name | Definition | Values | -| :----------------- | :------------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` | -| AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr](https://www.ncbi.nlm.nih.gov/projects/gapsolr/facets.html) under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` | -| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted from | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other 
biospecimen` |
-| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
-| BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas](http://atlas.brain-map.org/atlas?atlas=265297125) possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
-| CellType | OPTIONAL Describes the type of cell analyzed | Value should come from the [cell ontology](http://obofoundry.org/ontology/cl.html) |
+| Field name | Definition | Values |
+| :----------------- | :-------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
+| AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
+| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
+| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
+| BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas] possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
+| CellType | OPTIONAL Describes the type of cell analyzed | Value
should come from the [cell ontology] |
 
 `genetic_info.json` example:
 
@@ -79,3 +79,7 @@ This file is the descriptor of the genetic information available either in the p
   "BrainLocation": "[-30 -15 10]"
 }
 ```
+
+[Allen Brain Atlas]: http://atlas.brain-map.org/atlas?atlas=265297125&plate=112360888&structure=4392&x=40348.15104166667&y=46928.75&zoom=-7&resolution=206.60&z=3
+[cell ontology]: http://obofoundry.org/ontology/cl.html
+[gapsolr]: https://www.ncbi.nlm.nih.gov/projects/gapsolr/facets.html

From aa012868c498c550f6bb125b66662b8d3a4479f4 Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz"
Date: Wed, 15 Jan 2020 14:33:01 -0500
Subject: [PATCH 2/8] ENH: Update wording, spacing

---
 .../08-genetic-descriptor.md | 26 ++++++++++++-------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index aaa463e5a6..fbb000011b 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -1,18 +1,24 @@
 # Genetic Descriptor
 
-Support for genetic descriptors was developed as a [BIDS Extension Proposal](https://github.com/bids-standard/bids-specification/blob/master/src/06-extensions.md#bids-extension-proposals).
-The extension was primarily developped by Cyril Pernet and Clara Moreau with contributions from Tom Nichols and Jessica Turner.
+Support for genetic descriptors was developed as a [BIDS Extension
+Proposal](../06-extensions.md#bids-extension-proposals).
+The extension was primarily developped by Cyril Pernet and Clara Moreau with
+contributions from Tom Nichols and Jessica Turner.
 
-The goal of the genetic descriptor is to link imaging and genetic data.
-This is necessary as genetic data are typically stored in dedicated repositories, separately from the imaging data.
-The descriptor provides basics information about:
-- where to find genetic information associated with the imaging data
-- what type of genetic information is available
+Genetic data are typically stored in dedicated repositories,
+separate from imaging data.
+A genetic descriptor links a BIDS dataset to associated genetic data,
+potentially in a separate repository,
+with details of where to find the genetic data and the type of data available.
 
-## dataset_description.json
+## Dataset Description
 
-In order to link a genetic database entry, the key `Genetics` MUST be present and
-the value is an object with the following fields:
+Genetic descriptors are encoded as an additional, OPTIONAL entry in the
+[`dataset_description.json`](../03-modality-agnostic-files.md#dataset_descriptionjson)
+file.
+
+Datasets linked to a genetic database entry include the following REQUIRED or OPTIONAL
+`dataset_description.json` keys (a dot in the key name denotes a key in a subdictionary):
 
 | Field name | Definition |
 |----------------------|--------------------------------------------------------------------------------|

From 8b385b33f21744af8c0f25400425e8e63372de78 Mon Sep 17 00:00:00 2001
From: "Christopher J. Markiewicz"
Date: Wed, 15 Jan 2020 14:33:30 -0500
Subject: [PATCH 3/8] ENH: Fill out and fix formatting for examples

---
 .../08-genetic-descriptor.md | 26 ++++++++++---------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index fbb000011b..159fffa509 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -26,18 +26,19 @@ Datasets linked to a genetic database entry include the following REQUIRED or OP
 | Genetics.Database | OPTIONAL. URI of database where the dataset is hosted. |
 | Genetics.Descriptors | OPTIONAL.
List of relevant descriptors (*e.g.*, journal articles) for dataset. |
 
-`dataset_description.json` example:
+Example:
+
 ```JSON
 {
-  "Name": "Human Connectom Project",
-  "BIDSVersion": "1.2.0",
+  "Name": "Human Connectome Project",
+  "BIDSVersion": "1.3.0",
   "License": "CC0",
   "Authors": ["1st author", "2nd author"],
-  "Funding": "list your funding sources",
+  "Funding": ["P41 EB015894/EB/NIBIB NIH HHS/United States"],
   "Genetics": {
-    "Dataset": "dataset",
-    "Database": "database",
-    "Descriptors": "descriptors"
+    "Dataset": "https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001364.v1.p1",
+    "Database": "https://www.ncbi.nlm.nih.gov/gap/",
+    "Descriptors": ["https://doi.org/10.1016/j.neuroimage.2013.05.041"]
   }
 }
 ```
@@ -55,10 +56,10 @@ in the `participants.tsv` file by adding optional columns.
 
 `participants.tsv` example:
 ```Text
-participant_id age sex group GeneticID IDH Mutation
-sub-control01 34 M control 124587 yes
-sub-control02 12 F control 548936 yes
-sub-patient01 33 F patient 489634 no
+participant_id age sex group GeneticID IDH Mutation
+sub-control01 34 M control 124587 yes
+sub-control02 12 F control 548936 yes
+sub-patient01 33 F patient 489634 no
 ```
 
 ## genetic_info.json
@@ -79,7 +80,8 @@ This file is the descriptor of the genetic information available either in the p
 ```JSON
 {
   "GeneticLevel": "Genetic",
-  "AnalyticalApproach": "SNP Genotypes", "SampleOrigin": "brain",
+  "AnalyticalApproach": "SNP Genotypes",
+  "SampleOrigin": "brain",
   "TissueOrigin": "gray matter",
   "CellType": "neuron",
   "BrainLocation": "[-30 -15 10]"

From 918f41fb9ffa702543320033a73ad166950f2236 Mon Sep 17 00:00:00 2001
From: "Christopher J.
Markiewicz"
Date: Wed, 15 Jan 2020 14:46:13 -0500
Subject: [PATCH 4/8] ENH: Update genetic information section

---
 .../08-genetic-descriptor.md | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index 159fffa509..07d51c1332 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -62,9 +62,20 @@ sub-control02 12 F control 548936 yes
 sub-patient01 33 F patient 489634 no
 ```
 
-## genetic_info.json
+## Genetic Information
 
-This file is the descriptor of the genetic information available either in the participant tsv file and/or the genetic database described in the dataset_description.json. The `GeneticLevel` and `SampleOrigin` are the only two mandatory fields.
+Template:
+
+```Text
+genetic_info.json
+```
+
+The `genetic_info.json` file describes the genetic information available in the
+`participants.tsv` file and/or the genetic database described in
+`dataset_description.json`.
+Datasets containing the `Genetics` field in `dataset_description.json` or the
+`GeneticID` column in `participants.tsv` MUST include this file with the following
+fields:
 
 | Field name | Definition | Values |
 | :----------------- | :-------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------|

From 3d4112ffd5eb0595d75a38cccb568edc585a607d Mon Sep 17 00:00:00 2001
From: "Christopher J.
Markiewicz"
Date: Wed, 15 Jan 2020 14:48:15 -0500
Subject: [PATCH 5/8] ENH: snake_case TSV columns for consistency

---
 src/04-modality-specific-files/08-genetic-descriptor.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index 07d51c1332..124d26227e 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -56,7 +56,7 @@ in the `participants.tsv` file by adding optional columns.
 
 `participants.tsv` example:
 ```Text
-participant_id age sex group GeneticID IDH Mutation
+participant_id age sex group genetic_id idh_mutation
 sub-control01 34 M control 124587 yes
 sub-control02 12 F control 548936 yes
 sub-patient01 33 F patient 489634 no
@@ -74,7 +74,7 @@ The `genetic_info.json` file describes the genetic information available in the
 `participants.tsv` file and/or the genetic database described in
 `dataset_description.json`.
 Datasets containing the `Genetics` field in `dataset_description.json` or the
-`GeneticID` column in `participants.tsv` MUST include this file with the following
+`genetic_id` column in `participants.tsv` MUST include this file with the following
 fields:
 
 | Field name | Definition | Values |

From 4f27cee286a9313854f0c929705bf83c24b40e92 Mon Sep 17 00:00:00 2001
From: "Christopher J.
Markiewicz"
Date: Tue, 21 Jan 2020 13:52:58 -0500
Subject: [PATCH 6/8] ENH: Add TOC entry

---
 mkdocs.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mkdocs.yml b/mkdocs.yml
index c51148dccb..3d18837b76 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -31,6 +31,7 @@ nav:
       - Task events: 04-modality-specific-files/05-task-events.md
       - Physiological and other continuous recordings: 04-modality-specific-files/06-physiological-and-other-continuous-recordings.md
       - Behavioral experiments (with no MRI): 04-modality-specific-files/07-behavioral-experiments.md
+      - Genetic Descriptor: 04-modality-specific-files/08-genetic-descriptor.md
     - Longitudinal and multi-site studies: 05-longitudinal-and-multi-site-studies.md
     - Extending the BIDS specification: 06-extensions.md
     - Appendix:

From 1b9acb943891ad2b1a474d7b9ca25b2068b4e45d Mon Sep 17 00:00:00 2001
From: Chris Markiewicz
Date: Tue, 28 Jan 2020 12:50:50 -0500
Subject: [PATCH 7/8] FIX: Referencce styling

---
 .../08-genetic-descriptor.md | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index 124d26227e..a3166d5e1a 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -77,14 +77,14 @@ Datasets containing the `Genetics` field in `dataset_description.json` or the
 `genetic_id` column in `participants.tsv` MUST include this file with the following
 fields:
 
-| Field name | Definition | Values |
-| :----------------- | :-------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
-|
AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
-| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
-| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
-| BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas] possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
-| CellType | OPTIONAL Describes the type of cell analyzed | Value should come from the [cell ontology] |
+| Field name | Definition | Values |
+| :----------------- | :-------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
+| AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr][] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
+| SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
+| TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
+|
BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas][allen] possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
+| CellType | OPTIONAL Describes the type of cell analyzed | Value should come from the [cell ontology][ontology] |
 
 `genetic_info.json` example:
 
@@ -99,6 +99,6 @@ fields:
 }
 ```
 
-[Allen Brain Atlas]: http://atlas.brain-map.org/atlas?atlas=265297125&plate=112360888&structure=4392&x=40348.15104166667&y=46928.75&zoom=-7&resolution=206.60&z=3
-[cell ontology]: http://obofoundry.org/ontology/cl.html
+[allen]: http://atlas.brain-map.org/atlas?atlas=265297125&plate=112360888&structure=4392&x=40348.15104166667&y=46928.75&zoom=-7&resolution=206.60&z=3
+[ontology]: http://obofoundry.org/ontology/cl.html
 [gapsolr]: https://www.ncbi.nlm.nih.gov/projects/gapsolr/facets.html

From 8401a14c9d3c29a39ac41278c0ee6d796018db6b Mon Sep 17 00:00:00 2001
From: Chris Markiewicz
Date: Fri, 31 Jan 2020 11:37:12 -0500
Subject: [PATCH 8/8] Permit AnalyticalApproach to be a string or list of strings.

With example provided by Cyril Pernet.
---
 src/04-modality-specific-files/08-genetic-descriptor.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/04-modality-specific-files/08-genetic-descriptor.md b/src/04-modality-specific-files/08-genetic-descriptor.md
index a3166d5e1a..f4bca62081 100644
--- a/src/04-modality-specific-files/08-genetic-descriptor.md
+++ b/src/04-modality-specific-files/08-genetic-descriptor.md
@@ -80,7 +80,7 @@ fields:
 | Field name | Definition | Values |
 | :----------------- | :-------------------------------------------------------------------------- | :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | GeneticLevel | MANDATORY Describes the level of analysis | `Genetic`, `Genomic`, `Epigenomic`, `Transcriptomic`, `Metabolomic`, or `Proteomic` |
-| AnalyticalApproach | OPTIONAL Methodology used to analyse the GeneticLevel | Value must be taken from [gapsolr][] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
+| AnalyticalApproach | OPTIONAL Methodology or methodologies used to analyse the GeneticLevel | String or list of strings.
Each Value must be taken from [gapsolr][] under /Study/Molecular Data Type, for instance `SNP Genotypes (Array)` or `Methylation (CpG)` |
 | SampleOrigin | MANDATORY Describes from which tissue the genetic information was extracted | `blood`, `saliva`, `brain`, `csf`, `breast milk`, `bile`, `amniotic fluid`, `other biospecimen` |
 | TissueOrigin | OPTIONAL Describes the type of tissue analyzed for SampleOrigin `brain` | `gray matter`, `white matter`, `csf`, `meninges`, `macrovascular` or `microvascular` |
 | BrainLocation | OPTIONAL Refers to the location in space of the TissueOrigin | `MNI coordinate` or a `label` taken from the [Allen Brain Atlas][allen] possibly `layer` to refer to layer-specific gene expression, which can also tie up with laminar fMRI |
@@ -91,7 +91,7 @@ fields:
 ```JSON
 {
   "GeneticLevel": "Genetic",
-  "AnalyticalApproach": "SNP Genotypes",
+  "AnalyticalApproach": ["Whole Genome Sequencing", "SNP/CNV Genotypes"],
   "SampleOrigin": "brain",
   "TissueOrigin": "gray matter",
   "CellType": "neuron",