About

A benchmark suite covering the major steps in short- and long-read genome sequence analysis pipelines, such as basecalling, sequence mapping, de novo assembly, variant calling and polishing.

Download

Latest source code

git clone --recursive https://github.com/arun-sub/genomicsbench.git

Input datasets

wget https://genomicsbench.eecs.umich.edu/input-datasets.tar.gz
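Once the archive has downloaded, it needs to be unpacked before the run scripts can be pointed at it. A minimal sketch follows; the helper name `extract_datasets` and the destination folder are illustrative assumptions, and the archive's internal layout is not documented here.

```shell
# Hypothetical helper (not part of GenomicsBench): unpack a .tar.gz archive
# into a destination folder, creating the folder if needed.
extract_datasets() {
  archive="$1"
  dest="$2"
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
}

# Example (destination path is a placeholder):
# extract_datasets input-datasets.tar.gz "$HOME/genomicsbench-data"
```

The destination folder is what would later be passed to the run scripts as the input dataset folder, possibly with an extra subfolder depending on how the archive is organized.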

Prerequisites

RHEL/Fedora system prerequisites

sudo yum -y install $(cat rhel.prerequisites)

Debian system prerequisites

sudo apt-get install $(cat debian.prerequisites)

Python setup (optional: only needed for GPU benchmarks)

To run the Python-based benchmarks (nn-base, nn-variant and abea), follow the steps below:

Download and install Miniconda.
Then set up a conda environment:

# make sure channels are added in conda
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge

# create conda environment named "genomicsbench"
conda create -n genomicsbench -c bioconda clair python==3.6.8
conda activate genomicsbench
conda install deepdish

pip install --upgrade pip
pip install -r requirements.txt
pypy3 -m ensurepip
pypy3 -m pip install --no-cache-dir intervaltree==3.0.2

Compile

Note that the benchmarks have only been tested with gcc/g++-9, which some of the kernels depend on. If multiple gcc/g++ versions are installed, use update-alternatives to select gcc/g++-9.
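As a rough sketch of that configuration on a Debian-style system: the snippet below first reports which gcc major version is active, then defines a helper that registers and selects gcc/g++-9. The function names, the `/usr/bin/gcc-9` and `/usr/bin/g++-9` paths, and the priority value `90` are assumptions; RHEL/Fedora systems may use a different mechanism.

```shell
# Print the major version of the gcc currently on PATH, or "none" if absent.
gcc_major() {
  if command -v gcc >/dev/null 2>&1; then
    gcc -dumpversion | cut -d. -f1
  else
    echo none
  fi
}

# Hypothetical helper: register gcc-9 (with g++-9 as a slave link) and select it.
# Assumes the compilers are installed at /usr/bin/gcc-9 and /usr/bin/g++-9.
select_gcc9() {
  sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 90 \
    --slave /usr/bin/g++ g++ /usr/bin/g++-9
  sudo update-alternatives --set gcc /usr/bin/gcc-9
}

echo "active gcc major version: $(gcc_major)"
# Run select_gcc9 manually only if the version reported above is not 9.
```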

CPU benchmarks
- The MKLROOT and MKL_IOMPS_DIR variables must be set in the Makefile to run grm. If you don't want to run grm, comment out the grm-related commands in the Makefile.
- The VTUNE_HOME variable must be set if you want to run any VTune-based analyses.

make -j<num_threads>

GPU benchmarks
- In the Makefile, set CUDA_LIB to /usr/local/cuda or to the path of your local CUDA installation.
- Also ensure that the PATH and LD_LIBRARY_PATH environment variables include the paths to the CUDA binaries and libraries.

make -j<num_threads> gpu
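The PATH and LD_LIBRARY_PATH requirement can be met before invoking make with something like the sketch below. It assumes CUDA lives under /usr/local/cuda with binaries in `bin` and libraries in `lib64`; the `CUDA_HOME` variable is only a convenience used here, not something the Makefile is known to read.

```shell
# Assumed install prefix; override by exporting CUDA_HOME beforehand.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the CUDA binaries and libraries to the search paths.
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Quick sanity check: report whether nvcc is now reachable.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found under $CUDA_HOME/bin; check the CUDA path" >&2
fi
```

Adding these lines to a shell startup file keeps the setting across sessions.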

Running

CPU benchmarks

cd scripts
chmod +x ./run_cpu.sh
./run_cpu.sh <path to input dataset folder> <input size to run: small | large>

GPU benchmarks

cd scripts
chmod +x ./run_gpu.sh
./run_gpu.sh <path to input dataset folder> <input size to run: small | large>
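Both run scripts take the same two arguments, so a small wrapper can catch a mistyped size or a missing dataset folder before a long run starts. This is only a sketch: `run_bench` is a hypothetical helper, not part of GenomicsBench, and the example path is a placeholder.

```shell
# Hypothetical helper: validate the arguments, then launch the given run script.
# Usage: run_bench <script> <path to input dataset folder> <small|large>
run_bench() {
  rb_script="$1"
  rb_data="$2"
  rb_size="$3"
  case "$rb_size" in
    small|large) ;;
    *) echo "input size must be 'small' or 'large'" >&2; return 2 ;;
  esac
  if [ ! -d "$rb_data" ]; then
    echo "dataset folder not found: $rb_data" >&2
    return 1
  fi
  "$rb_script" "$rb_data" "$rb_size"
}

# Example, from inside the scripts folder (dataset path is a placeholder):
# run_bench ./run_cpu.sh "$HOME/genomicsbench-data" small
```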

Citation

If you use GenomicsBench or find GenomicsBench useful, please cite this work:

Arun Subramaniyan, Yufeng Gu, Timothy Dunn, Somnath Paul, Md. Vasimuddin, Sanchit Misra, David Blaauw, Satish Narayanasamy, Reetuparna Das. GenomicsBench: A Benchmark Suite for Genomics. In IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2021 (to appear).

@inproceedings{genomicsbench,
    title={GenomicsBench: A Benchmark Suite for Genomics},
    author={Subramaniyan, Arun and Gu, Yufeng and Dunn, Timothy and Paul, Somnath and Vasimuddin, Md. and Misra, Sanchit and Blaauw, David and Narayanasamy, Satish and Das, Reetuparna},
    booktitle={Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)},
    year={2021}
}

Issues and bug reporting

GenomicsBench is under active development, and we appreciate any feedback and suggestions from the community. Feel free to raise an issue or submit a pull request on GitHub. For assistance in using GenomicsBench, please contact Arun Subramaniyan (arunsub@umich.edu), Yufeng Gu (yufenggu@umich.edu) or Timothy Dunn (timdunn@umich.edu).

Licensing

Each benchmark is individually licensed according to the tool it is extracted from.

Acknowledgement

This work was supported in part by Precision Health at the University of Michigan, by the Kahn foundation, by the NSF under the CAREER-1652294 award and the Applications Driving Architectures (ADA) Research Center, a JUMP Center co-sponsored by SRC and DARPA.
