The smallgenomeutilities are a collection of scripts for processing and manipulating NGS data of small viral genomes. They are written in Python 3 and have a small number of dependencies:
- biopython
- numpy
- progress
- pysam
- scikit-learn (imported as sklearn)
- matplotlib
The recommended way to install the smallgenomeutilities is using pip:
pip install smallgenomeutilities
Compute multidimensional scaling for visualizing distances among reconstructed haplotypes.
Convert QuasiRecomb output of a transmitter and recipient set of haplotypes to a combined set of haplotypes, where gaps have been filtered. Optionally translate to peptide sequence.
Perform a genomic liftover: transform an alignment in SAM or BAM format from one reference sequence to another. Can replace M (alignment match) CIGAR states with = (sequence match) and X (sequence mismatch).
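To illustrate what replacing M states by =/X means, here is a minimal sketch in plain Python. It is not the package's implementation: the function name `expand_match_states` and the representation of a CIGAR as a list of `(op, length)` tuples are assumptions made for this example.

```python
def expand_match_states(cigar, read_seq, ref_seq, ref_start=0):
    """Rewrite every M run as runs of '=' (match) and 'X' (mismatch).

    cigar is a list of (op, length) tuples, e.g. [("M", 4), ("I", 1)].
    Hypothetical helper for illustration only.
    """
    out = []
    read_pos, ref_pos = 0, ref_start

    def push(op, length):
        # Merge with the previous run when the operation repeats.
        if out and out[-1][0] == op:
            out[-1] = (op, out[-1][1] + length)
        else:
            out.append((op, length))

    for op, length in cigar:
        if op == "M":
            # Compare read and reference base by base.
            for _ in range(length):
                same = read_seq[read_pos] == ref_seq[ref_pos]
                push("=" if same else "X", 1)
                read_pos += 1
                ref_pos += 1
        else:
            push(op, length)
            if op in ("I", "S", "=", "X"):  # operations that consume the read
                read_pos += length
            if op in ("D", "N", "=", "X"):  # operations that consume the reference
                ref_pos += length
    return out


result = expand_match_states(
    [("M", 4), ("I", 1), ("M", 2)], read_seq="ACTTGAA", ref_seq="ACGTAA"
)
```

In this example the third read base differs from the reference, so the first M run is split into two matches, one mismatch and one match, while the insertion is passed through unchanged.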
Calculate average coverage for a target region on a different contig.
Calculate average coverage for a target region of an alignment.
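As a rough illustration of average coverage over a target region, the sketch below works on read intervals rather than on a BAM file. The function name `average_coverage` and the input format (half-open, 0-based `(start, end)` tuples) are assumptions for this example, not the script's interface.

```python
def average_coverage(read_intervals, region_start, region_end):
    """Mean depth over the half-open, 0-based region [region_start, region_end).

    read_intervals holds one half-open (start, end) tuple per aligned read.
    Hypothetical helper for illustration only.
    """
    if region_end <= region_start:
        raise ValueError("region must have positive length")
    covered_bases = 0
    for start, end in read_intervals:
        # Number of region positions this read covers (0 if no overlap).
        overlap = min(end, region_end) - max(start, region_start)
        if overlap > 0:
            covered_bases += overlap
    return covered_bases / (region_end - region_start)


reads = [(0, 10), (5, 15), (20, 30)]
mean_depth = average_coverage(reads, 5, 15)
```

Summing the per-read overlaps and dividing by the region length gives the same value as averaging the per-position depth, without building a depth array.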
Build consensus sequences including either the majority base or the ambiguous bases from an alignment (BAM) file.
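The difference between a majority consensus and an ambiguous consensus can be sketched as follows. This is an illustration under stated assumptions, not the script's code: the function name `consensus`, the per-position count dictionaries, and the `min_freq` threshold are all hypothetical. The IUPAC nucleotide codes themselves are standard.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they cover.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AC"): "M", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CG"): "S", frozenset("CT"): "Y", frozenset("GT"): "K",
    frozenset("ACG"): "V", frozenset("ACT"): "H", frozenset("AGT"): "D",
    frozenset("CGT"): "B", frozenset("ACGT"): "N",
}


def consensus(columns, ambiguous=False, min_freq=0.05):
    """Build a consensus from per-position base counts.

    columns is a list of dicts such as {"A": 6, "G": 4}.
    min_freq is a hypothetical threshold for including a base in an
    ambiguity code. Uncovered positions become "N".
    """
    seq = []
    for counts in columns:
        total = sum(counts.values())
        if total == 0:
            seq.append("N")
        elif ambiguous:
            bases = frozenset(b for b, n in counts.items() if n / total >= min_freq)
            seq.append(IUPAC.get(bases, "N"))
        else:
            # Majority base; ties resolve to the first base listed.
            seq.append(max(counts, key=counts.get))
    return "".join(seq)


columns = [{"A": 10}, {"A": 6, "G": 4}, {"C": 98, "T": 2}, {}]
majority = consensus(columns)
with_ambiguity = consensus(columns, ambiguous=True)
```

Here the second position is reported as A in the majority consensus but as R (A or G) in the ambiguous one, while the T at 2% in the third position falls below the assumed threshold and is ignored.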
Extract regions with sufficient coverage for running ShoRAH. Regions are returned as half-open intervals, [start, end), with 0-based indexing.
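The half-open, 0-based convention can be made concrete with a short sketch. The function name `covered_regions` and the `min_depth` parameter are assumptions for this example. The actual script reads its depth from alignments, whereas this sketch takes a plain list of per-position depths.

```python
def covered_regions(depth, min_depth):
    """Return half-open, 0-based (start, end) runs where depth >= min_depth.

    Hypothetical helper for illustration only.
    """
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d >= min_depth:
            if start is None:
                start = pos  # a new region opens here
        elif start is not None:
            regions.append((start, pos))  # pos itself is excluded
            start = None
    if start is not None:
        # Region runs to the end of the contig.
        regions.append((start, len(depth)))
    return regions


depth = [0, 5, 12, 15, 3, 20, 25]
regions = covered_regions(depth, min_depth=10)
```

Because the intervals are half-open, a region `(start, end)` can be used directly as a Python slice, `depth[start:end]`, and its length is simply `end - start`.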
Extract subsequences of an alignment, with the option of converting it to peptide sequences. Can filter on the basis of subsequence frequency or gap frequencies in subsequences.
Extract sequences of alignments into a FASTA file where the sequence id matches a given string.
Determine the genomic offsets on a target contig, given an initial contig and offsets. Can be used to map between reference genomes.
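One way to picture offset mapping between two contigs is to walk along a pairwise alignment of them. The sketch below assumes the two contigs are already available as gapped, equal-length strings. The function name `map_offset` and the choice to return `None` for positions deleted in the target are assumptions for this example and may differ from how the script behaves.

```python
def map_offset(aligned_src, aligned_dst, offset):
    """Map a 0-based offset on the source contig to the target contig.

    aligned_src and aligned_dst are rows of a pairwise alignment with "-"
    for gaps. Returns None when the source position has no counterpart in
    the target. Hypothetical helper for illustration only.
    """
    src_pos = dst_pos = 0
    for a, b in zip(aligned_src, aligned_dst):
        if a != "-" and src_pos == offset:
            return dst_pos if b != "-" else None
        # Advance each coordinate only over non-gap characters.
        if a != "-":
            src_pos += 1
        if b != "-":
            dst_pos += 1
    raise IndexError("offset lies beyond the source contig")


src = "ACG-TA"
dst = "A-GCTA"
mapped = [map_offset(src, dst, i) for i in range(5)]
```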
Extract frequencies of minority variants from multiple samples. The extraction can optionally be restricted to a region of interest.
Compare sequences from a multiple sequence alignment from transmitter and recipient samples in order to determine the optimal matching of transmitters to recipients.
Predict number of reads after quality preprocessing.
Given a multiple sequence alignment, remove loci with a gap fraction above a certain threshold.
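The gap filter can be sketched in a few lines of plain Python. The function name `remove_gappy_columns` and the parameter `max_gap_fraction` are assumptions for this example. A column is dropped only when its gap fraction is strictly above the threshold, matching the wording above.

```python
def remove_gappy_columns(sequences, max_gap_fraction):
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    sequences is a list of equal-length aligned strings with "-" for gaps.
    Hypothetical helper for illustration only.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i
        for i in range(length)
        if sum(s[i] == "-" for s in sequences) / len(sequences) <= max_gap_fraction
    ]
    return ["".join(s[i] for i in keep) for s in sequences]


alignment = ["AC-T", "A--T", "ACGT"]
filtered = remove_gappy_columns(alignment, max_gap_fraction=0.5)
```

In this example the third column has gaps in two of three sequences (about 0.67) and is removed, while the second column (about 0.33) is kept.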
- David Seifert
- Susana Posada Cespedes