CIDR exome and targeted resequencing joint calling/filtering pipeline

Kurt-Hetrick/CIDR_SEQ_CAPTURE_JOINT_CALL

CIDR_SEQ_CAPTURE_JOINT_CALL

  • shell script wrappers to submit joint calling for various CIDR projects
    • CMG_VQSR_SUBMITTER.sh: for submitting the CMG grant for JHU.
      • VQSR tranche cut-offs are 99.9%
      • annotates loci that have fewer than 10 variant chromosomes with a Samples tag in INFO, populated from the SM tag.
      • adds samples in sample sheet to pre-existing list of gvcf files to call new plus old samples together.
      • only extracts the new samples in the sample sheet from the MS VCF file
      • runs annovar for each sample in the sample sheet
    • STD_VQSR_SUBMITTER.sh: for standard sized targeted resequencing projects.
      • examples are exome projects that have more than 20 samples.
      • generates an MS VCF after VQSR and another after genotype refinement
      • runs annovar on the MS vcf
      • VQSR tranche cut-offs are 99.0% for indels and 99.5% for SNPs
      • Performs GATK's CalculateGenotypePosteriors on the post VQSR file
    • HOLLAND_VQSR_SUBMITTER.sh: for submitting the Holland NIAID grant
      • VQSR tranche cut-offs are 99.9%
      • adds samples in sample sheet to pre-existing list of gvcf files to call new plus old samples together.
      • only extracts the new samples in the sample sheet from the MS VCF file
      • Performs GATK's CalculateGenotypePosteriors on the post VQSR file
    • HARD_FILTER_ALL_SUBMITTER.sh
      • Performs GATK best practices hard filters (when data size is too small for VQSR)
      • Performs GATK's CalculateGenotypePosteriors on the hard-filtered file
    • CMG_VQSR_SUBMITTER_GRCH38.sh: for submitting CMG grant for JHU on GRCh38
      • same details as CMG_VQSR_SUBMITTER.sh
    • HARD_FILTER_ALL_SUBMITTER_GRCH38.sh
      • same details as HARD_FILTER_ALL_SUBMITTER.sh

Example Execution Commands

  • Provide three arguments after the script name:
    1. The name of the project folder where the multi-sample VCF will be written. Give just the project name, not the full path.
    2. The full path to the sample sheet
    3. The multi-sample file name prefix that you want.

/mnt/research/tools/LINUX/00_GIT_REPO_KURT/CIDR_SEQ_CAPTURE_JOINT_CALL/CMG_VQSR_SUBMITTER.sh PROJECT_FOLDER_NAME_WHERE_MS_VCF_IS_WRITTEN_TO /PATH/TO/SAMPLESHEET.csv MS_VCF_FILE_NAME_PREFIX | bash

  • Note the pipe to bash at the end of the command; it executes the qsub command generated by the submitter script.
  • Choose the appropriate script for the project you are calling (the example above uses the one specific to the CMG grant).
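
Because the submitter output is piped straight into bash, it can help to sanity-check the three arguments first. The function below is a minimal sketch of such a check and is not part of this repository: the name `check_submitter_args` and its rules are assumptions drawn only from the argument descriptions above (a folder name without a path, a full path to an existing sample sheet, a plain file name prefix).

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; NOT part of CIDR_SEQ_CAPTURE_JOINT_CALL.

# Validate the three submitter arguments before anything is piped to bash.
check_submitter_args() {
    if [ "$#" -ne 3 ]; then
        echo "usage: check_submitter_args PROJECT SAMPLE_SHEET PREFIX" >&2
        return 2
    fi
    local project="$1" sample_sheet="$2" prefix="$3"

    # Argument 1 is a folder name only, so it must be non-empty and slash-free.
    case "$project" in
        ""|*/*) echo "project must be a name, not a path: $project" >&2; return 1 ;;
    esac
    # Argument 2 must be a full (absolute) path to an existing file.
    case "$sample_sheet" in
        /*) ;;
        *) echo "sample sheet must be a full path: $sample_sheet" >&2; return 1 ;;
    esac
    if [ ! -f "$sample_sheet" ]; then
        echo "sample sheet not found: $sample_sheet" >&2
        return 1
    fi
    # Argument 3 is a file name prefix, so it must be non-empty and slash-free.
    case "$prefix" in
        ""|*/*) echo "invalid prefix: $prefix" >&2; return 1 ;;
    esac
    echo "ok"
}
```

Since the submitter script generates the qsub command and bash executes it, running the same command line without the trailing `| bash` should simply print what would be submitted, which is a convenient way to review it before rerunning with the pipe.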
