Tumor Mutational Burden (TMB) Project

Overview

Tumor Mutational Burden (TMB) is the total number of mutations (changes) found in the DNA of cancer cells. Measured TMB varies across sequencing platforms and across cancer types. It is therefore important to harmonize variant selection methods and to choose reasonable thresholds for the alternative allele count and the read depth. An accurate estimate of the total number of mutations in the targeted regions may help doctors plan the best treatment for each individual patient.

We present a Nextflow workflow that calculates tumor mutational burden based on the criteria suggested by the TMB Harmonization Consortium.
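
As a rough illustration of how read depth and alternative allele thresholds feed into a TMB value, the sketch below counts the variants that pass both thresholds and divides by the size of the targeted region in megabases (TMB is commonly reported as mutations per megabase). The tab-separated input layout, the function name tmb_per_mb, and the threshold values in the usage note are assumptions made for this example; they are not taken from the pipeline's calculate_tmb.R or from the consortium's recommendations.

```shell
#!/bin/sh
# Hypothetical input: a tab-separated file with one header line and the columns
# chrom, pos, depth, alt_count (an assumed layout, not the pipeline's format).
# usage: tmb_per_mb VARIANTS_TSV TARGET_SIZE_BP MIN_DEPTH MIN_ALT_COUNT
tmb_per_mb() {
    awk -F '\t' -v size_bp="$2" -v min_dp="$3" -v min_alt="$4" '
        # skip the header and count variants that meet both thresholds
        NR > 1 && $3 >= min_dp && $4 >= min_alt { n++ }
        # report mutations per megabase of targeted sequence
        END { printf "%.2f\n", n / (size_bp / 1000000) }
    ' "$1"
}
```

For instance, `tmb_per_mb variants.tsv 1500000 25 3` would report the burden for a 1.5 Mb target using illustrative thresholds of depth 25 and alternative allele count 3.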

Containers

The containers directory contains instructions and recipes for building Singularity containers used in the pipeline. Singularity containers are used on the Yale McCleary HPC cluster. The current pipeline configuration for McCleary uses .simg files stored in a shared location on the file system.

Directory structure

TMB-estimation/
├── bin
│   └── calculate_tmb.R
├── modules
│   ├── bcftools
│   │   └── norm
│   │       ├── main.nf
│   │       └── meta.yml
│   ├── bwa
│   │   ├── index
│   │   │   ├── main.nf
│   │   │   └── meta.yml
│   │   └── mem
│   │       ├── main.nf
│   │       └── meta.yml
│   ...
│   ├── vcf2maf
│   │   ├── main.nf
│   │   └── meta.yml
│   └── vep
│       ├── main.nf
│       └── meta.yml
├── subworkflows
│   ├── fastq/trim/fastqc
│   │   ├── main.nf
│   │   └── meta.yml
│   ├── index genome
│   │   ├── main.nf
│   │   ├── meta.yml
│   │   └── nextflow.config
│   ├── alignment
│   │   ├── main.nf
│   │   ├── meta.yml
│   │   └── nextflow.config
│   ├── mutect2/strelka2
│   │   ├── main.nf
│   │   ├── meta.yml
│   │   └── nextflow.config
│   └── tmb calibration
│       ├── main.nf
│       ├── meta.yml
│       └── nextflow.config
├── nextflow.config
├── samples.json
├── README.md
└── main.nf

Set up and run a workflow

This repository should first be cloned from GitHub:

git clone https://github.com/

You will need to load a Java runtime and install a copy of Nextflow in your work environment on McCleary before running the workflow.

module load ANTLR/2.7.7-GCCcore-12.2.0-Java-11
curl -s https://get.nextflow.io | bash
chmod +x nextflow
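
Before moving on, it may be worth confirming that both prerequisites are actually available in the current shell. The following is a minimal sketch; the helper name have_cmd is hypothetical, and the check assumes the nextflow launcher was downloaded into the current directory as in the commands above.

```shell
#!/bin/sh
# Succeed if the named command can be found on PATH (helper name is hypothetical).
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Java comes from the loaded module; the launcher sits in the current directory.
have_cmd java || echo "java not found; load the Java module first" >&2
[ -x ./nextflow ] || echo "./nextflow is missing or not executable" >&2
```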

Each record in the input sample JSON file should have at least the following fields: specimen_id, patient_id, tissue (tumor source site), purity (estimated percentage of tumor cells in the tissue sample), and read1 and read2 (paired-end raw reads).

[
    {
        "specimen_id": "CPCT1",
        "patient_id":"CPCT1",
        "tissue": "other",
        "purity": 100,
        "read1":"~/workspace/dsl2/data/CPCT12345678R/CPCT1/CPCT12345678R_AHHKYHDSXX_S13_L001_R1_001.fastq.gz",
        "read2":"~/workspace/dsl2/data/CPCT12345678R/CPCT1/CPCT12345678R_AHHKYHDSXX_S13_L001_R2_001.fastq.gz"
    }
]
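
A quick way to catch a forgotten field before launching the workflow is sketched below. It is a plain text match rather than a JSON parser, so it only detects a key that is absent from the whole file; the function name check_sample_fields is an assumption for this example and is not part of the repository.

```shell
#!/bin/sh
# Crude check that every field listed above appears somewhere in the file.
# It does not validate JSON syntax and does not inspect each record separately.
# usage: check_sample_fields SAMPLES_JSON   (function name is hypothetical)
check_sample_fields() (
    status=0
    for key in specimen_id patient_id tissue purity read1 read2; do
        # look for the quoted key followed by a colon
        if ! grep -q "\"$key\"[[:space:]]*:" "$1"; then
            echo "missing field: $key" >&2
            status=1
        fi
    done
    exit "$status"
)
```

Calling `check_sample_fields samples.json` returns a non-zero status and names the absent keys on standard error when something is missing.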

Finally, you can use the following commands to start an interactive session on the McCleary cluster and launch the workflow in the background. It is recommended that you export the TMPDIR variable so that it points somewhere other than /tmp, the default on McCleary. You can also use the -w option to place the Nextflow work directory, which holds the large number of intermediate files produced by each process, outside your home directory.

ssh <netid>@<McCleary login host>
salloc -p ycga -c 1 -t 00-3:00 --mem=8000

module load ANTLR/2.7.7-GCCcore-12.2.0-Java-11

nextflow run main.nf -profile hg19 \
--output_dir /SolidTumor/v2_0_validation \
--input_json samples.json \
-w /SolidTumor/v2_0_validation/tmp \
-bg
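
The TMPDIR recommendation above could be applied along these lines before launching Nextflow. The SCRATCH_ROOT variable and the nf_scratch directory name are placeholders invented for this sketch; substitute a location on McCleary where you have enough space.

```shell
#!/bin/sh
# SCRATCH_ROOT is a hypothetical variable; the fallback under the current
# directory exists only so that the sketch runs anywhere.
scratch_root="${SCRATCH_ROOT:-$PWD/nf_scratch}"

# Keep temporary files and the Nextflow work directory in separate folders.
mkdir -p "$scratch_root/tmp" "$scratch_root/work"

# Exported variables are inherited by processes started from this shell.
export TMPDIR="$scratch_root/tmp"

echo "TMPDIR=$TMPDIR"
echo "work directory for -w: $scratch_root/work"
```

The work directory can then be passed to the run command as `-w "$scratch_root/work"`.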

For multiple sample processing on HPC, you will need to use Slurm job arrays to submit the workflow, which avoids hitting the hourly job submission limit:

#!/bin/bash
#SBATCH -J CPCT1
#SBATCH --array=1-2%3
#SBATCH --output=CPCT1_%A_%a.out
#SBATCH --error=CPCT1_%A_%a.err
#SBATCH -p ycga
#SBATCH -t 8:0:0
#SBATCH -N 1
#SBATCH -c 6
#SBATCH --mem=32G

module load ANTLR/2.7.7-GCCcore-12.2.0-Java-11
source ~/.bashrc
nextflow run somatic_variant.nf --output_dir /SolidTumor/v2_0_validation --input_json samples.json \
-w /SolidTumor/v2_0_validation/tmp -profile hg19 -resume
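
As written, every array task in the script above runs the same command on the same samples.json. One possible way to give each task its own input, assuming you keep one JSON file per sample in a directory, is to select the file by SLURM_ARRAY_TASK_ID. The per_sample_json directory, the pick_sample_json function, and the per-task work directory are hypothetical names for this sketch and are not part of the repository.

```shell
#!/bin/sh
# Print the N-th JSON file (1-based, in glob order) found in a directory.
# usage: pick_sample_json DIR TASK_ID
pick_sample_json() {
    printf '%s\n' "$1"/*.json | sed -n "${2}p"
}

# Inside the sbatch script one might then write (Slurm sets SLURM_ARRAY_TASK_ID):
#   input_json=$(pick_sample_json per_sample_json "$SLURM_ARRAY_TASK_ID")
#   nextflow run somatic_variant.nf --output_dir /SolidTumor/v2_0_validation \
#       --input_json "$input_json" \
#       -w "/SolidTumor/v2_0_validation/tmp/task_${SLURM_ARRAY_TASK_ID}" \
#       -profile hg19 -resume
```

With this layout the upper bound of --array should match the number of JSON files in the directory.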

khzhu/caliberTMB
