data driven cutoffs

Brent Pedersen edited this page Nov 11, 2020 · 5 revisions

slivar ddc helps choose cutoffs for filtering a VCF. It can evaluate individual FILTER values as well as INFO and FORMAT fields. It uses transmitted variants and Mendelian violations in trios as proxies for true and false positives: a good filter retains most transmitted variants and removes many violations.
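To make the scoring idea concrete, here is a minimal sketch of how a single cutoff could be judged by those two quantities. It is not slivar's implementation: the helper name `score_cutoff`, the two-column input (`value<TAB>class`, with class `transmitted` or `violation`), and the rule that variants with a value at or above the cutoff are kept are all assumptions made for illustration.

```shell
# score_cutoff CUTOFF  (hypothetical helper, not part of slivar)
# Reads "value<TAB>class" lines on stdin, where class is "transmitted"
# or "violation". Assumes variants with value >= CUTOFF pass the filter.
# Prints: fraction of transmitted variants kept, then fraction of
# violations removed, separated by a tab.
score_cutoff() {
    awk -F'\t' -v cutoff="$1" '
    $2 == "transmitted" { t++; if ($1 + 0 >= cutoff + 0) tk++ }
    $2 == "violation"   { v++; if ($1 + 0 <  cutoff + 0) vr++ }
    END {
        if (t == 0 || v == 0) {
            print "need both transmitted and violation rows" > "/dev/stderr"
            exit 1
        }
        printf "%.2f\t%.2f\n", tk / t, vr / v
    }'
}
```

Calling `score_cutoff 20 < values.tsv` (with `values.tsv` a hypothetical file in the format above) for several cutoffs gives a rough table; a useful cutoff keeps the first number high while also pushing the second one up.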

Run as:

slivar ddc \
    --chrom "chr15" \
    --info-fields 'LCR,BaseQRankSum,FS,VQSLOD' \
    --fmt-fields 'AB,GQ' $vcf $ped

where the INFO fields (--info-fields) must be Flag, Float, or Integer fields with Number=1 (or Number=0 for Flag). The $vcf must contain the trios specified in the pedigree/fam file given as $ped.
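As a quick pre-check, the field constraint can be tested against the VCF header before running. The sketch below is a hypothetical helper (`eligible_info_fields` is not part of slivar) that assumes standard `##INFO=<ID=...,Number=...,Type=...>` header lines and prints the IDs that meet the constraint.

```shell
# eligible_info_fields  (hypothetical helper, not part of slivar)
# Reads VCF header lines on stdin and prints the ID of every INFO field
# that is Float or Integer with Number=1, or Flag with Number=0.
eligible_info_fields() {
    awk '
    /^##INFO=</ {
        id = ""; num = ""; type = ""
        # Pull out the ID, Number and Type attributes of this header line.
        if (match($0, /ID=[^,>]+/))     id   = substr($0, RSTART + 3, RLENGTH - 3)
        if (match($0, /Number=[^,>]+/)) num  = substr($0, RSTART + 7, RLENGTH - 7)
        if (match($0, /Type=[^,>]+/))   type = substr($0, RSTART + 5, RLENGTH - 5)
        if ((num == "1" && (type == "Float" || type == "Integer")) ||
            (num == "0" && type == "Flag"))
            print id
    }'
}
```

Assuming bcftools is installed, `bcftools view -h $vcf | eligible_info_fields` lists the candidates to pass to --info-fields.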

NOTE: --chrom must be specified. For exomes, use --chrom "*" to use the entire file. For genomes, the entire file will be too much data to render in the HTML output.

The gif below shows usage:
