Generating CDS template #409

Merged: 18 commits, Jun 3, 2024
9 changes: 8 additions & 1 deletion HTAN.model.csv
@@ -7,6 +7,13 @@ Patient,HTAN patient,,"Component, HTAN Participant ID",,FALSE,Individual Organis
File,A type of Information Content Entity specific to OS,,,,FALSE,Information Content Entity,,https://w3id.org/biolink/vocab/DataFile,
Filename,Name of a file,,,,TRUE,,,,regex search ^.+\/\S*$
File Format,"Format of a file (e.g. txt, csv, fastq, bam, etc.)","hdf5, bedgraph, idx, idat, bam, bai, excel, powerpoint, tif, tiff, OME-TIFF, png, doc, pdf, fasta, fastq, sam, vcf, bcf, maf, bed, chp, cel, sif, tsv, csv, txt, plink, bigwig, wiggle, gct, bgzip, zip, seg, html, mov, hyperlink, svs, md, flagstat, gtf, raw, msf, rmd, bed narrowPeak, bed broadPeak, bed gappedPeak, avi, pzfx, fig, xml, tar, R script, abf, bpm, dat, jpg, locs, Sentrix descriptor file, Python script, sav, gzip, sdf, RData, hic, ab1, 7z, gff3, json, sqlite, svg, sra, recal, tranches, mtx, tagAlign, dup, DICOM, czi, mex, cloupe, am, cell am, mpg, m, mzML,scn, dcc, rcc, pkc, sf, bedpe",,,TRUE,,,,
+CDS Sequencing Template,"CDS compatible template file, includes attributes for Genomic Reference, Library Layout, Data Type, Sequencing Platform, Library Selection Method",,"Component, Filename, File Format, HTAN Data File ID, HTAN Parent Biospecimen ID, CDS Genomic Reference, CDS Library Layout, CDS Data Type, CDS Sequencing Platform, CDS Library Selection Method",,TRUE,,,,
+CDS Genomic Reference,One or more characters used to identify the published NCBI genetic sequence that is used as a reference against which other sequences are compared.,,,,TRUE,Sequencing,,https://dataservice.datacommons.cancer.gov/#/resources,str
+CDS Library Layout,The read strategy or method that was used for sequencing and analysis of a nucleotide library.,"Paired End, Single Read",,,TRUE,Sequencing,,https://dataservice.datacommons.cancer.gov/#/resources,
+CDS Data Type,"Types of data associated with the content. Fill out Other Data Type Specified, if not on the list.","10x Visium Spatial Transcriptomics, Bulk Methylation-seq, Bulk RNA-seq, Bulk WES, Electron Microscopy, ExSeq, HI-C-seq, RPPA, Imaging, Mass Spectrometry, NanoString GeoMx DSP Spatial Transcriptomics, Other Assay, SRRS Imaging, Slide-seq, scATAC-seq, scDNA-seq, scRNA-seq, Accessory Manifest, Other Data Type Specified",,,TRUE,Publication,,https://dataservice.datacommons.cancer.gov/#/resources,list like
+CDS Sequencing Platform,The words used to describe the instrument used to carry out a high-throughput sequencing experiment.,"Illumina Next Seq 500, Illumina Next Seq 550, Illumina Next Seq 2500, Illumina NovaSeq 6000, Illumina MiSeq, 454 GS FLX Titanium, AB SOLiD 4, AB SOLiD 2, AB SOLiD 3, Complete Genomics, Illumina HiSeq X Ten, Illumina HiSeq X Five, Illumina Genome Analyzer II, Illumina Genome Analyzer IIx, Illumina HiSeq 2000, Illumina HiSeq 2500, Illumina HiSeq 4000, Illumina MiSeq, Illumina NextSeq, Ion Torrent PGM, Ion Torrent Proton, Ion Torrent S5, PacBio RS, NovaSeq 6000, NovaSeqS4, Ultima Genomics UG100, Oxford Nanopore minION, GridION, PromethION, PacBio Sequel2, Revio, Illumina NextSeq 1000, Illumina NextSeq 2000, Other, unknown, Not Reported",,,TRUE,Device,,https://dataservice.datacommons.cancer.gov/#/resources,
+CDS Library Selection Method,The type of systematic actions performed to select or enrich DNA fragments used in analysis by high-throughput sequencing.,"Random, rRNA Depletion, Other",,,TRUE,Sequencing,,https://dataservice.datacommons.cancer.gov/#/resources,
+CDS Other Data Type Specified,Other types of data associated with the content.,,CDS Data Type,,FALSE,Sequencing,,,
Checksum,MD5 checksum of the BAM file,,,,TRUE,Information Content Entity,,,
HTAN Data File ID,Self-identifier for this data file - HTAN ID of this file HTAN ID SOP (eg HTANx_yyy_zzz),,,,TRUE,File,,https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing,regex match ^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$ warning
HTAN Participant ID,HTAN ID associated with a patient based on HTAN ID SOP (eg HTANx_yyy ),,,,TRUE,Patient,,https://docs.google.com/document/d/1podtPP8L1UNvVxx9_c_szlDcU1f8n7bige6XA_GoRVM/edit?usp=sharing,regex match ^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))$ warning
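
For reviewers who want to sanity-check a manifest row against the new CDS attributes, here is a minimal sketch. It is not part of this PR and not how the validation tooling itself works: `check_row` is a hypothetical helper, the regex is copied from the HTAN Data File ID row above, and the valid values are copied from the CDS Library Layout and CDS Library Selection Method rows. The sketch reports every mismatch as a problem and ignores the trailing `warning` keyword on the regex rule.

```python
import re

# Pattern copied verbatim from the HTAN Data File ID row in the hunk above.
HTAN_DATA_FILE_ID = re.compile(
    r"^(HTA([1-9]|1[0-6]))_((EXT)?([0-9]\d*|0000))_([0-9]\d*|0000)$"
)

# Valid values copied from two of the CDS rows added in the hunk above.
VALID_VALUES = {
    "CDS Library Layout": {"Paired End", "Single Read"},
    "CDS Library Selection Method": {"Random", "rRNA Depletion", "Other"},
}


def check_row(row):
    """Return a list of problems found in one manifest row (hypothetical helper)."""
    problems = []

    # Check the file ID against the regex rule.
    file_id = row.get("HTAN Data File ID", "")
    if not HTAN_DATA_FILE_ID.match(file_id):
        problems.append(f"HTAN Data File ID {file_id!r} does not match the pattern")

    # Check each controlled-vocabulary attribute against its valid values.
    for attribute, allowed in VALID_VALUES.items():
        value = row.get(attribute, "")
        if value not in allowed:
            problems.append(f"{attribute} {value!r} is not a valid value")

    return problems


good_row = {
    "HTAN Data File ID": "HTA1_123_456",
    "CDS Library Layout": "Paired End",
    "CDS Library Selection Method": "Random",
}
bad_row = {
    "HTAN Data File ID": "HTA17_1_1",      # center number above 16
    "CDS Library Layout": "Paired-End",    # hyphenated, not in the list
    "CDS Library Selection Method": "Other",
}

print(check_row(good_row))
print(check_row(bad_row))
```

Valid values are matched exactly here, so spelling and capitalization in a manifest must agree with the model row.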
@@ -1032,7 +1039,7 @@ Barretts Esophagus Goblet Cells Present,Presence or absennce of Barretts esophag
Pancreatitis Onset Year,Date of onset of pancreatitis.,,,,FALSE,Follow Up,,,num
HTAN Parent Channel Metadata ID,HTAN ID for a level 3 channels table.,,,,TRUE, Imaging Level 4,,,
Single Nucleus Capture,Nuclei isolation method,"Plates, 10x, droplet",,,FALSE,scmC-seq Level 1,,,
-Microarray Platform ID,"The NCBI GEO Microarray Platform ID that links to the table containing the array definition",,,,TRUE,Microarray Level 1,,,regex match GPL\d+
+Microarray Platform ID,The NCBI GEO Microarray Platform ID that links to the table containing the array definition,,,,TRUE,Microarray Level 1,,,regex match GPL\d+
Microarray Molecule,Microarray is measuring this kind of molecule,"DNA, RNA",,,TRUE,Microarray Level 1,,,
Microarray Label,Microarray used this kind of label,,,,TRUE,Microarray Level 1,,,
Microarray Value Definition,What the provided value signifies,,,,TRUE,Microarray Level 1,,,
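
The two Microarray Platform ID rows in the hunk above differ only in the double quotes around the description. As a quick check that this is a formatting-only change, the sketch below parses both versions with Python's standard `csv` module and compares the resulting fields. The variable and function names are illustrative, and this assumes the model file is read by a standard CSV parser.

```python
import csv
import io

# Both versions of the row, copied from the hunk above.
old_line = r'Microarray Platform ID,"The NCBI GEO Microarray Platform ID that links to the table containing the array definition",,,,TRUE,Microarray Level 1,,,regex match GPL\d+'
new_line = r'Microarray Platform ID,The NCBI GEO Microarray Platform ID that links to the table containing the array definition,,,,TRUE,Microarray Level 1,,,regex match GPL\d+'


def parse(line):
    """Parse a single CSV line into a list of fields."""
    return next(csv.reader(io.StringIO(line)))


old_fields = parse(old_line)
new_fields = parse(new_line)

# The description contains no comma, so the quotes were optional.
print(old_fields == new_fields)
print(len(new_fields))
```

Quoting is only required when a field itself contains a comma, a quote, or a line break, which is why rows such as CDS Library Layout keep their quoted valid-values list.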