The HALC paper is accepted for publication in BMC Bioinformatics!
HALC is high-throughput software for error correction of long reads.
HALC is under the Artistic License 2.0.
-
System requirements
HALC runs on 32-bit or 64-bit machines with a Linux operating system. At least 4GB of system memory is recommended for correcting larger data sets.
-
Installation
The aligner BLASR and the error correction software LoRDEC (only for -ordinary mode) are required to run HALC.
- Compile the source files in the 'src' and 'thirdparty' folders to generate a 'bin' folder by running the Makefile:
make all
- Add BLASR, LoRDEC and the 'bin' folder to your $PATH, respectively:
export PATH=PATH2BLASR:$PATH
export PATH=PATH2LoRDEC:$PATH
export PATH=PATH2bin:$PATH
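After updating $PATH, it can help to confirm that the required tools are actually visible. The following is a minimal sketch, not part of HALC: the helper `find_missing_tools` is hypothetical, and the executable names `blasr` and `lordec-correct` are assumptions about how BLASR and LoRDEC are installed on your system.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them if your installation differs.
    required = ["blasr", "lordec-correct"]
    missing = find_missing_tools(required)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If a tool is reported as missing, re-check the corresponding export line.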
-
Inputs
- Long reads in FASTA format.
- Contigs assembled from the corresponding short reads in FASTA format.
- The initial short reads in FASTA format (only for -ordinary mode), obtained with:
cat left_reads.fa >short_reads.fa
cat right_reads.fa >>short_reads.fa
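The same concatenation of left and right reads can be done from Python, which may be convenient inside a larger pipeline. This is a minimal sketch under the assumption that the inputs are plain, uncompressed FASTA files; the function name `concatenate_reads` is hypothetical and not part of HALC.

```python
import shutil


def concatenate_reads(input_paths, output_path):
    """Write the contents of each input file, in order, to output_path.

    Any existing file at output_path is overwritten, matching the
    behaviour of '>' followed by '>>' with cat.
    """
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Example call, using the file names from this README:
# concatenate_reads(["left_reads.fa", "right_reads.fa"], "short_reads.fa")
```

Copying in binary mode keeps the bytes unchanged, so the result should match what the cat commands produce.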
-
Using HALC
runHALC.py long_reads.fa contigs.fa [-options|-options]
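When HALC is called from a script, the command line can be assembled programmatically. The sketch below is illustrative only: `build_halc_command` is a hypothetical helper, it uses only flags listed under Options in this README, and it assumes that ordinary mode always needs the short reads file while -ordinary and -repeat-free cannot be combined.

```python
def build_halc_command(long_reads, contigs, short_reads=None,
                       repeat_free=False, kmer=None, threads=None):
    """Assemble a runHALC.py argument list from options documented in this README."""
    if repeat_free and short_reads is not None:
        raise ValueError("-ordinary and -repeat-free cannot be used together")
    if not repeat_free and short_reads is None:
        # Assumption: ordinary mode requires the initial short reads.
        raise ValueError("ordinary mode requires the initial short reads")

    cmd = ["runHALC.py", long_reads, contigs]
    if repeat_free:
        cmd.append("-repeat-free")
    else:
        cmd += ["-ordinary", short_reads]
    if kmer is not None:
        cmd += ["-kmer", str(kmer)]
    if threads is not None:
        cmd += ["-threads", str(threads)]
    return cmd


# Example: ordinary mode with the file names used in this README.
# build_halc_command("long_reads.fa", "contigs.fa", short_reads="short_reads.fa")
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once runHALC.py is on your $PATH.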
Options (default value):
-o/-ordinary short_reads.fa (yes)
Ordinary mode, which uses repeats for correction. The error correction software LoRDEC and the initial short reads are required to refine the repeat-corrected regions. It is mutually exclusive with the -repeat-free option.
-r/-repeat-free (no)
Repeat-free mode, which does not use repeats for correction. It is mutually exclusive with the -ordinary option.
-b/-boundary n (4)
Maximum boundary difference to split the subcontigs.
-a/-accurate (yes)
Accurate construction of the contig graph.
-c/-coverage n (auto)
Expected coverage on contigs. If not specified, it is calculated automatically.
-w/-width n (4)
Maximum width of the dynamic programming table.
-k/-kmer n (25)
Kmer length for LoRDEC refinement.
-t/-threads n (auto)
Number of threads for one process to create. By default, it is set to the number of computing cores.
-l/-log (no)
System log to print.
-
Outputs
- Error corrected full long reads.
- Error corrected trimmed long reads.
- Error corrected split long reads.
HALC's Chinese name is 浩克.