Home
Genscale Team edited this page Jun 9, 2017
You can use the GATB-Core library to develop new NGS data analysis software.
GATB-Core natively provides the following high-performance and memory-efficient operations:
Reads handling:
- FASTA/FASTQ parsing and writing (plain text and gzipped files are supported)
- Parallel iteration of sequences
K-mer:
- K-mer counting
- Minimizer computation of k-mers, partitioning of datasets by minimizers
- Bloom filter of k-mers
- Hash table of k-mers
- Minimal perfect hash function of k-mers
- Arbitrarily large k-mer representations
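As an illustration of what minimizer computation means, the sketch below computes the minimizer of a k-mer as its smallest canonical m-mer. It is a standard-library sketch, not GATB-Core's implementation: the function names are hypothetical, k-mers are held as plain strings rather than a compact encoding, and plain lexicographic ordering is an assumption made for simplicity (a library may order m-mers differently).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Reverse complement of a DNA string over {A, C, G, T}.
std::string reverseComplement(const std::string& s) {
    std::string rc(s.rbegin(), s.rend());
    for (char& c : rc) {
        switch (c) {
            case 'A': c = 'T'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            case 'T': c = 'A'; break;
            default: break;  // other symbols (e.g. N) are left unchanged
        }
    }
    return rc;
}

// Canonical form: the lexicographically smaller of a word and its
// reverse complement, so both strands map to the same representative.
std::string canonical(const std::string& word) {
    std::string rc = reverseComplement(word);
    return std::min(word, rc);
}

// Minimizer (assumed ordering: lexicographic): the smallest canonical
// m-mer among all m-mers contained in the k-mer.
std::string minimizer(const std::string& kmer, std::size_t m) {
    assert(m > 0 && m <= kmer.size());
    std::string best = canonical(kmer.substr(0, m));
    for (std::size_t i = 1; i + m <= kmer.size(); ++i) {
        best = std::min(best, canonical(kmer.substr(i, m)));
    }
    return best;
}
```

Because every m-mer is canonicalized first, a k-mer and its reverse complement yield the same minimizer, which is what makes minimizers usable as partition keys: both strands of a k-mer land in the same partition.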
de Bruijn graph:
- Graph construction
- Graph traversal operations (contigs, unitigs)
- Graph simplifications for assembly (tip removal, bulge removal)
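The graph operations listed above can be illustrated with a deliberately naive sketch: nodes are the distinct k-mers of the reads, edges are implied by (k-1)-overlaps, and a path is extended to the right for as long as it does not branch. This is not how GATB-Core represents or traverses its graph. All names are hypothetical, reverse complements and abundance filtering are ignored, and a `std::set` of strings stands in for a memory-efficient structure.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Node set: every distinct k-mer found in the reads (forward strand only).
std::set<std::string> buildNodes(const std::vector<std::string>& reads,
                                 std::size_t k) {
    std::set<std::string> nodes;
    for (const std::string& r : reads)
        for (std::size_t i = 0; i + k <= r.size(); ++i)
            nodes.insert(r.substr(i, k));
    return nodes;
}

// Successors: nodes whose first k-1 characters equal the last k-1 of kmer.
std::vector<std::string> successors(const std::set<std::string>& nodes,
                                    const std::string& kmer) {
    std::vector<std::string> out;
    const std::string alphabet = "ACGT";
    for (char c : alphabet) {
        std::string next = kmer.substr(1) + c;
        if (nodes.count(next)) out.push_back(next);
    }
    return out;
}

// Predecessors: nodes whose last k-1 characters equal the first k-1 of kmer.
std::vector<std::string> predecessors(const std::set<std::string>& nodes,
                                      const std::string& kmer) {
    std::vector<std::string> out;
    const std::string alphabet = "ACGT";
    for (char c : alphabet) {
        std::string prev = std::string(1, c) + kmer.substr(0, kmer.size() - 1);
        if (nodes.count(prev)) out.push_back(prev);
    }
    return out;
}

// Extends a path to the right while the next node is the unique successor
// and has a unique predecessor; stops at branches, dead ends, and cycles.
std::string extendRight(const std::set<std::string>& nodes,
                        const std::string& start) {
    std::string path = start;
    std::string cur = start;
    std::set<std::string> seen{start};
    while (true) {
        std::vector<std::string> succ = successors(nodes, cur);
        if (succ.size() != 1) break;
        if (predecessors(nodes, succ[0]).size() != 1) break;
        if (!seen.insert(succ[0]).second) break;  // already visited: cycle
        cur = succ[0];
        path += cur.back();
    }
    return path;
}
```

For example, with the reads `ACGTT` and `ACGTG` and k = 3, extension from `ACG` stops at `ACGT`, because the node `CGT` has two successors (`GTG` and `GTT`).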
Other optimized data structures
In addition to the de Bruijn graph data structure, GATB-Core provides several others that can be of interest for general-purpose development. These are:
- Open-Addressing Hash Table
- Linked-List Hash Table
- Bloom Filters. There are several flavors: basic, cache-optimized, optimized for k-mer neighbors; accessible through BloomFactory.
- Minimal Perfect Hash Function (BBHash)
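For readers unfamiliar with Bloom filters, the following sketch shows the basic flavor in a few lines. It is not the class returned by BloomFactory: `SimpleBloom` is a hypothetical name, and the use of `std::hash` with double hashing to derive the bit positions is an assumption made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical basic Bloom filter over strings (illustration only).
class SimpleBloom {
public:
    SimpleBloom(std::size_t nbBits, std::size_t nbHashes)
        : bits_(nbBits, false), nbHashes_(nbHashes) {
        assert(nbBits > 0 && nbHashes > 0);
    }

    // Sets the nbHashes bit positions derived from the item.
    void insert(const std::string& item) {
        for (std::size_t i = 0; i < nbHashes_; ++i)
            bits_[index(item, i)] = true;
    }

    // Never returns false for an inserted item; may return true for an
    // item that was never inserted (false positive).
    bool contains(const std::string& item) const {
        for (std::size_t i = 0; i < nbHashes_; ++i)
            if (!bits_[index(item, i)]) return false;
        return true;
    }

private:
    // Double hashing: position i is (h1 + i * h2) modulo the bit count.
    std::size_t index(const std::string& item, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>{}(item);
        std::size_t h2 = std::hash<std::string>{}(item + '#') | 1;
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t nbHashes_;
};
```

The trade-off is the usual one: more bits per item and a well-chosen number of hash functions lower the false positive rate, at the cost of memory and lookup time. Cache-optimized variants typically confine all positions of an item to a small block of bits to reduce cache misses.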
The GATB-Core library is intended for developers with C++ programming skills.
We also provide a Python 3 wrapper to the GATB-Core C++ APIs: pyGATB.
Start your discovery of the library with:
In addition, feel free to contact the GATB-Core development team if you have any questions regarding the use of the library.