CIDR_SEQ_CAPTURE_JOINT_CALL

  • shell script wrappers to submit joint calling for various CIDR projects
    • CMG_VQSR_SUBMITTER.sh: for submitting the CMG grant for JHU.
      • VQSR tranche cut-offs are 99.9%
      • annotates loci with fewer than 10 variant chromosomes by adding a Samples tag to INFO that holds the SM tag.
      • adds the samples in the sample sheet to a pre-existing list of gVCF files so that new and old samples are called together.
      • extracts only the new samples in the sample sheet from the MS VCF file
      • runs annovar for each sample in the sample sheet
    • STD_VQSR_SUBMITTER.sh: for standard sized targeted resequencing projects.
      • examples are exome projects that have more than 20 samples.
      • generates an MS VCF after VQSR and another after genotype refinement
      • runs annovar on the MS VCF
      • VQSR tranche cut-offs are 99.0% for indels and 99.5% for SNPs
      • Performs GATK's CalculateGenotypePosteriors on the post VQSR file
    • HOLLAND_VQSR_SUBMITTER.sh: for submitting the Holland NIAID grant
      • VQSR tranche cut-offs are 99.9%
      • adds the samples in the sample sheet to a pre-existing list of gVCF files so that new and old samples are called together.
      • extracts only the new samples in the sample sheet from the MS VCF file
      • Performs GATK's CalculateGenotypePosteriors on the post VQSR file
    • HARD_FILTER_ALL_SUBMITTER.sh
      • Performs GATK best practices hard filters (when data size is too small for VQSR)
      • Performs GATK's CalculateGenotypePosteriors on the hard-filtered file
    • CMG_VQSR_SUBMITTER_GRCH38.sh: for submitting CMG grant for JHU on GRCh38
      • same details as CMG_VQSR_SUBMITTER.sh
    • HARD_FILTER_ALL_SUBMITTER_GRCH38.sh
      • same details as HARD_FILTER_ALL_SUBMITTER.sh
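
The script list above can be condensed into a small lookup. The following is a minimal sketch, not part of the repository: the function name `pick_submitter` and the project-type labels (`cmg`, `standard`, `holland`, `hard_filter`) are hypothetical, and only the script file names come from the list above. GRCh38 variants are listed only for the CMG and hard-filter scripts, so other combinations are rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a project type (and optional genome build)
# to one of the submitter scripts named in this README.
pick_submitter() {
    local kind="$1" build="${2:-default}"
    case "$kind:$build" in
        cmg:default)         echo "CMG_VQSR_SUBMITTER.sh" ;;
        cmg:grch38)          echo "CMG_VQSR_SUBMITTER_GRCH38.sh" ;;
        standard:default)    echo "STD_VQSR_SUBMITTER.sh" ;;
        holland:default)     echo "HOLLAND_VQSR_SUBMITTER.sh" ;;
        hard_filter:default) echo "HARD_FILTER_ALL_SUBMITTER.sh" ;;
        hard_filter:grch38)  echo "HARD_FILTER_ALL_SUBMITTER_GRCH38.sh" ;;
        *)
            # No script is listed for this combination.
            echo "no submitter listed for: $kind ($build)" >&2
            return 1
            ;;
    esac
}
```

For example, `pick_submitter cmg grch38` prints the GRCh38 CMG script name, while `pick_submitter holland grch38` fails because no such script is listed.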

Example Execution Commands

  • provide 3 arguments after the script name:
    1. The project folder name where you want the multi-sample VCF written to. Give just the project name, not the full path.
    2. The full path to the sample sheet
    3. The multi-sample file name prefix that you want.
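
The three arguments can be sanity-checked before calling a submitter. This is a minimal sketch under stated assumptions: `check_submitter_args` is a hypothetical helper that does not exist in the repository, and the rules it enforces are only those stated in the list above (a bare project name, a full path to the sample sheet, and a file name prefix).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the repository.
check_submitter_args() {
    if [ "$#" -ne 3 ]; then
        echo "expected 3 arguments: PROJECT_NAME /PATH/TO/SAMPLESHEET.csv MS_VCF_PREFIX" >&2
        return 1
    fi
    local project="$1" sample_sheet="$2" prefix="$3"

    # Argument 1 is a bare project folder name, never a path.
    case "$project" in
        */*) echo "project must be a name, not a path: $project" >&2; return 1 ;;
    esac

    # Argument 2 must be a full (absolute) path to an existing file.
    case "$sample_sheet" in
        /*) ;;
        *) echo "sample sheet must be a full path: $sample_sheet" >&2; return 1 ;;
    esac
    if [ ! -f "$sample_sheet" ]; then
        echo "sample sheet not found: $sample_sheet" >&2
        return 1
    fi

    # Argument 3 is the output file name prefix; it must not be empty.
    if [ -z "$prefix" ]; then
        echo "prefix must not be empty" >&2
        return 1
    fi
    echo "ok"
}
```

The function prints `ok` and returns 0 when all three arguments look plausible, and prints a message to stderr and returns 1 otherwise.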

/mnt/research/tools/LINUX/00_GIT_REPO_KURT/CIDR_SEQ_CAPTURE_JOINT_CALL/CMG_VQSR_SUBMITTER.sh PROJECT_FOLDER_NAME_WHERE_MS_VCF_IS_WRITTEN_TO /PATH/TO/SAMPLESHEET.csv MS_VCF_FILE_NAME_PREFIX | bash

  • Note the pipe to bash at the end of the command; it executes the qsub command generated by the submitter script
  • Choose the appropriate script for the project you are calling (the example above uses the one specific to the CMG grant)
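
Since the command is generated first and only executed by the trailing pipe, leaving the pipe off should show what would be submitted. The sketch below illustrates that pattern with a hypothetical stand-in function, `fake_submitter`, which prints harmless `echo` commands in place of qsub commands so it runs on a machine without a scheduler. The real scripts' output is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a submitter script: it only PRINTS commands.
# echo is used instead of qsub so the sketch runs anywhere.
fake_submitter() {
    local project="$1" sample_sheet="$2" prefix="$3"
    echo "echo submitting joint call for $project as $prefix"
    echo "echo sample sheet is $sample_sheet"
}

# Step 1: run without the pipe to review the generated commands.
fake_submitter MY_PROJECT /PATH/TO/SAMPLESHEET.csv MY_PREFIX

# Step 2: once the commands look right, add "| bash" to execute them.
fake_submitter MY_PROJECT /PATH/TO/SAMPLESHEET.csv MY_PREFIX | bash
```

Step 1 prints the two generated command lines as text; step 2 runs them.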
