Skip to content

Latest commit

 

History

History
56 lines (40 loc) · 2.13 KB

README.md

File metadata and controls

56 lines (40 loc) · 2.13 KB

Comparing DToL alternate haplotypes

Scope

Working on a VCF (or BCF, gzipped or not), calculate SNP density in windows. Very much in progress. May be buggy.

I had a really (really) hard time thinking about how to split a VCF into windows (actually non-overlapping chunks), while passing the file only once. My approach definitely needs some quality control, but seems to match up to bedtools (albeit limited testing).

Current usage

You have to complile yourself. Download rust, clone this repo, and then run:

cargo build --release

Compiling may take a couple of minutes. This will then make the compiled binary in the target/release directory.

Run ./target/release/fasta_windows --help to display the help message in the terminal.

Haplotype windows 0.1.1
Max Brown <[email protected]>
Haplotype stats in windows on a VCF file.

USAGE:
    haplotype_windows [OPTIONS] --output <output> --vcf <vcf>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
    -o, --output <output>              Output filename for the CSVs (without extension).
    -p, --pass_only <pass_only>        Should calculations be output only on records with as PASS filter? Boolean, input
                                       true or false. [default: true]
    -f, --vcf <vcf>                    The input vcf (or bcf) file. Gzipped or not.
    -w, --window_size <window_size>    Integer size of window for statistics to be computed over. [default: 10000]

Outputs a CSV:

Currently spits out the numbers of SNPs, insertions, deletions, transitions, transversions, as well as the total number of variants per window.

e.g. ./target/release/haplotype-windows -f <vcf> -o test -p false

ID,window,no_snps,no_insertions,no_deletions,snp_density,transitions,transversions
SUPER_1,10000,34,3,5,42,18,16
SUPER_1,20000,28,2,3,33,17,11
SUPER_1,30000,51,2,1,54,26,25
SUPER_1,40000,175,9,12,196,91,84
SUPER_1,50000,206,22,18,246,116,90
SUPER_1,60000,131,15,13,159,59,72
SUPER_1,70000,62,7,6,75,28,34
SUPER_1,80000,195,20,11,226,109,86
SUPER_1,90000,176,19,8,203,102,74