# CARLISLE

The CARLISLE pipeline was developed in support of the NIH laboratories of Dr. Vassiliki Saloura and Dr. Javed Khan. It has been developed and tested solely on the NIH HPC Biowulf cluster.

## Contributions

The following members contributed to the development of the CARLISLE pipeline:
VK, SS, SK, and HC contributed to generating the source code, and all members contributed to the main concepts and analysis.

## 1. Getting Started

The CARLISLE GitHub repository is stored locally and will be used for project deployment. Multiple projects can be deployed from this one location simultaneously without conflict.

### 1.1 Introduction

The CARLISLE pipeline begins with raw FASTQ files and performs trimming followed by alignment using BOWTIE2. Data is then normalized either through the use of a user-specified species spike-in control (e.g., E. coli) or through the determined library size. Peaks are then called using MACS2, SEACR, and GoPEAKS, with various options selected by the user. Peaks are then annotated and summarized into reports. If designated, differential analysis is performed using DESEQ2. QC reports are also generated for each project using FASTQC and MULTIQC. Annotations are added using HOMER and ROSE. GSEA enrichment analysis predictions are added using CHIPENRICH.
The following are sub-commands used within CARLISLE:
CARLISLE has several dependencies listed below. These dependencies can be installed by a sysadmin. All dependencies will be automatically loaded if running from Biowulf.
CARLISLE has been tested exclusively on the Biowulf HPC. Log in to the cluster's head node and move into the pipeline location.
```bash
# ssh into cluster's head node
ssh -Y $USER@biowulf.nih.gov
```

### 1.4 Load an interactive session

An interactive session should be started before performing any of the pipeline sub-commands, even if the pipeline is to be executed on the cluster.
```bash
# Grab an interactive node
sinteractive --time=12:00:00 --mem=8gb --cpus-per-task=4 --pty bash
```

## 4. Expected Outputs

The following directories are created under the WORKDIR/results directory:
Note that some outputs are only generated when the corresponding options are enabled:

- go_enrichment results are included only when `run_go_enrichment` is set to `true` in the config file.
- rose results are included only when `run_rose` is set to `true` in the config file.
```
├── alignment_stats
├── bam
├── bedgraph
├── bigwig
├── fragments
├── peaks
│   ├── 0.05
│   │   ├── contrasts
│   │   │   ├── contrast_id1.dedup_status
│   │   │   └── contrast_id2.dedup_status
│   │   ├── gopeaks
│   │   │   ├── annotation
│   │   │   │   ├── go_enrichment
│   │   │   │   │   ├── contrast_id1.dedup_status.go_enrichment_tables
│   │   │   │   │   └── contrast_id2.dedup_status.go_enrichment_html_report
│   │   │   │   ├── homer
│   │   │   │   │   ├── replicate_id1_vs_control_id.dedup_status.gopeaks_broad.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   ├── replicate_id1_vs_control_id.dedup_status.gopeaks_narrow.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   ├── replicate_id2_vs_control_id.dedup_status.gopeaks_broad.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   └── replicate_id2_vs_control_id.dedup_status.gopeaks_narrow.motifs
│   │   │   │   │       ├── homerResults
│   │   │   │   │       └── knownResults
│   │   │   │   └── rose
│   │   │   │       ├── replicate_id1_vs_control_id.dedup_status.gopeaks_broad.12500
│   │   │   │       ├── replicate_id1_vs_control_id.dedup_status.gopeaks_narrow.12500
│   │   │   │       ├── replicate_id2_vs_control_id.dedup_status.dedup.gopeaks_broad.12500
│   │   │   │       └── replicate_id2_vs_control_id.dedup_status.dedup.gopeaks_narrow.12500
│   │   │   └── peak_output
│   │   ├── macs2
│   │   │   ├── annotation
│   │   │   │   ├── go_enrichment
│   │   │   │   │   ├── contrast_id1.dedup_status.go_enrichment_tables
│   │   │   │   │   └── contrast_id2.dedup_status.go_enrichment_html_report
│   │   │   │   ├── homer
│   │   │   │   │   ├── replicate_id1_vs_control_id.dedup_status.macs2_narrow.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   ├── replicate_id1_vs_control_id.dedup_status.macs2_broad.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   ├── replicate_id2_vs_control_id.dedup_status.macs2_narrow.motifs
│   │   │   │   │   │   ├── homerResults
│   │   │   │   │   │   └── knownResults
│   │   │   │   │   └── replicate_id2_vs_control_id.dedup_status.macs2_broad.motifs
│   │   │   │   │       ├── homerResults
│   │   │   │   │       └── knownResults
│   │   │   │   └── rose
│   │   │   │       ├── replicate_id1_vs_control_id.dedup_status.macs2_broad.12500
│   │   │   │       ├── replicate_id1_vs_control_id.dedup_status.macs2_narrow.12500
│   │   │   │       ├── replicate_id2_vs_control_id.dedup_status.macs2_broad.12500
│   │   │   │       └── replicate_id2_vs_control_id.dedup_status.macs2_narrow.12500
│   │   │   └── peak_output
│   │   └── seacr
│   │       ├── annotation
│   │       │   ├── go_enrichment
│   │       │   │   ├── contrast_id1.dedup_status.go_enrichment_tables
│   │       │   │   └── contrast_id2.dedup_status.go_enrichment_html_report
│   │       │   ├── homer
│   │       │   │   ├── replicate_id1_vs_control_id.dedup_status.seacr_non_relaxed.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id1_vs_control_id.dedup_status.seacr_non_stringent.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id1_vs_control_id.dedup_status.seacr_norm_relaxed.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id1_vs_control_id.dedup_status.seacr_norm_stringent.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id2_vs_control_id.dedup_status.seacr_non_relaxed.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id2_vs_control_id.dedup_status.seacr_non_stringent.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   ├── replicate_id2_vs_control_id.dedup_status.seacr_norm_relaxed.motifs
│   │       │   │   │   ├── homerResults
│   │       │   │   │   └── knownResults
│   │       │   │   └── replicate_id2_vs_control_id.dedup_status.seacr_norm_stringent.motifs
│   │       │   │       ├── homerResults
│   │       │   │       └── knownResults
│   │       │   └── rose
│   │       │       ├── replicate_id1_vs_control_id.dedup_status.seacr_non_relaxed.12500
│   │       │       ├── replicate_id1_vs_control_id.dedup_status.seacr_non_stringent.12500
│   │       │       ├── replicate_id1_vs_control_id.dedup_status.seacr_norm_relaxed.12500
│   │       │       ├── replicate_id1_vs_control_id.dedup_status.seacr_norm_stringent.12500
│   │       │       ├── replicate_id2_vs_control_id.dedup_status.seacr_non_relaxed.12500
│   │       │       ├── replicate_id2_vs_control_id.dedup_status.seacr_non_stringent.12500
│   │       │       ├── replicate_id2_vs_control_id.dedup_status.seacr_norm_relaxed.12500
│   │       │       └── replicate_id2_vs_control_id.dedup_status.seacr_norm_stringent.12500
│   │       └── peak_output
└── qc
    ├── fastqc_raw
    └── fqscreen_raw
```

## 2. Preparing Files

The pipeline is controlled through editing configuration and manifest files. Defaults are found in the /WORKDIR/config and /WORKDIR/manifest directories, after initialization.

### 2.1 Configs

The configuration files control parameters and software of the pipeline. These files are listed below:
#### 2.1.1 Cluster Config

The cluster configuration file dictates the resources to be used during submission to the Biowulf HPC. There are two different ways to control these parameters: first, by editing the default settings, and second, by creating or editing rule-specific settings. These parameters should be edited with caution, and only after significant testing.

#### 2.1.2 Tools Config

The tools configuration file dictates the version of each software or program that is being used in the pipeline.

#### 2.1.3 Config YAML

There are several groups of parameters that the user can edit to control the various aspects of the pipeline. These are:
The pipeline allows for the use of a species-specific spike-in control, or normalization via library size. The parameter `spikein_genome` should be set to the species term used in `spikein_reference`.
For example, for an E. coli (`ecoli`) spike-in:
```yaml
run_contrasts: true
norm_method: "spikein"
spikein_genome: "ecoli"
spikein_reference:
  ecoli:
    fa: "PIPELINE_HOME/resources/spikein/Ecoli_GCF_000005845.2_ASM584v2_genomic.fna"
```
For example, for a Drosophila (`drosophila`) spike-in:
```yaml
run_contrasts: true
norm_method: "spikein"
spikein_genome: "drosophila"
spikein_reference:
  drosophila:
    fa: "/fdb/igenomes/Drosophila_melanogaster/UCSC/dm6/Sequence/WholeGenomeFasta/genome.fa"
```
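Before launching, it can be worth confirming that the spike-in FASTA referenced by the config actually resolves on your system. A minimal, optional check using the paths from the two examples above (PIPELINE_HOME stands in for the pipeline installation directory):

```bash
# Confirm the spike-in reference FASTA paths resolve
# (PIPELINE_HOME is a placeholder for the pipeline installation directory)
ls -lh PIPELINE_HOME/resources/spikein/Ecoli_GCF_000005845.2_ASM584v2_genomic.fna
ls -lh /fdb/igenomes/Drosophila_melanogaster/UCSC/dm6/Sequence/WholeGenomeFasta/genome.fa
```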
If it is determined that the amount of spike-in is not sufficient for the run, a library-size normalization can be performed instead:

1. Complete a CARLISLE run with spike-in set to "Y". This will allow for the complete assessment of the spike-in.
2. Run initial QC analysis on the output data.
3. Add the alignment_stats dir to the configuration file.
4. Re-run the CARLISLE pipeline.

##### 2.1.3.1.2 Duplication Status

Users can select deduplicated peaks (dedup) or non-deduplicated peaks (no_dedup) through the `dupstatus` parameter.

```yaml
dupstatus: "dedup, no_dedup"
```

##### 2.1.3.1.3 Peak Caller

Three peak callers are available for deployment within the pipeline, with different settings available for each caller.
```yaml
peaktype: "macs2_narrow, macs2_broad"
```

```yaml
peaktype: "seacr_stringent, seacr_relaxed"
```

```yaml
peaktype: "gopeaks_narrow, gopeaks_broad"
```
A complete list of the available peak-calling parameters and the recommended list of parameters is provided below:

| Peak Caller | Narrow | Broad | Normalized, Stringent | Normalized, Relaxed | Non-Normalized, Stringent | Non-Normalized, Relaxed |
| --- | --- | --- | --- | --- | --- | --- |
| MACS2 | AVAIL | AVAIL | NA | NA | NA | NA |
| SEACR | NA | NA | AVAIL w/o SPIKEIN | AVAIL w/o SPIKEIN | AVAIL w/ SPIKEIN | AVAIL w/ SPIKEIN |
| GoPeaks | AVAIL | AVAIL | NA | NA | NA | NA |

```yaml
# Recommended list
### peaktype: "macs2_narrow, macs2_broad, gopeaks_narrow, gopeaks_broad"

# Available list
### peaktype: "macs2_narrow, macs2_broad, seacr_norm_stringent, seacr_norm_relaxed, seacr_non_stringent, seacr_non_relaxed, gopeaks_narrow, gopeaks_broad"
```

###### 2.1.3.1.3.1 MACS2 additional option

MACS2 can be run with or without a control; adding a control will increase peak specificity. Selecting "Y" for the `macs2_control` parameter will run MACS2 with the paired control sample provided in the sample manifest.
Quality thresholds can be controlled through the `quality_thresholds` parameter. This must be a list of comma-separated numeric values; a minimum of one value is required.
```yaml
# default values
quality_thresholds: "0.1, 0.05, 0.01"
```
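The configured thresholds correspond to the per-threshold subdirectories under results/peaks shown in the output tree above (e.g., 0.05). After a run, the per-threshold peak calls can be listed directly; for example:

```bash
# One subdirectory per configured quality threshold
ls /path/to/output/dir/results/peaks/

# Peak calls for one caller at the 0.05 threshold (directory names per the output tree)
ls /path/to/output/dir/results/peaks/0.05/macs2/peak_output/
```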
Additional reference files may be added to the pipeline if other species are to be used.
The absolute file paths which must be included are:
The following information must be included:
### 2.2 Manifests

There are two manifests: one that is required for all pipeline runs and one that is only required when running a differential analysis. These files describe information on the samples and the desired contrasts. The paths of these files are defined in the snakemake_config.yaml file. These files are:
#### 2.2.1 Sample Manifest

This manifest includes sample-level information. It includes the following column headers:
An example sampleManifest file is shown below:
| sampleName | replicateNumber | isControl | controlName | controlReplicateNumber | path_to_R1 | path_to_R2 |
| --- | --- | --- | --- | --- | --- | --- |
| 53_H3K4me3 | 1 | N | HN6_IgG_rabbit_negative_control | 1 | PIPELINE_HOME/.test/53_H3K4me3_1.R1.fastq.gz | PIPELINE_HOME/.test/53_H3K4me3_1.R2.fastq.gz |
| 53_H3K4me3 | 2 | N | HN6_IgG_rabbit_negative_control | 1 | PIPELINE_HOME/.test/53_H3K4me3_2.R1.fastq.gz | PIPELINE_HOME/.test/53_H3K4me3_2.R2.fastq.gz |
| HN6_H3K4me3 | 1 | N | HN6_IgG_rabbit_negative_control | 1 | PIPELINE_HOME/.test/HN6_H3K4me3_1.R1.fastq.gz | PIPELINE_HOME/.test/HN6_H3K4me3_1.R2.fastq.gz |
| HN6_H3K4me3 | 2 | N | HN6_IgG_rabbit_negative_control | 1 | PIPELINE_HOME/.test/HN6_H3K4me3_2.R1.fastq.gz | PIPELINE_HOME/.test/HN6_H3K4me3_2.R2.fastq.gz |
| HN6_IgG_rabbit_negative_control | 1 | Y | - | - | PIPELINE_HOME/.test/HN6_IgG_rabbit_negative_control_1.R1.fastq.gz | PIPELINE_HOME/.test/HN6_IgG_rabbit_negative_control_1.R2.fastq.gz |

#### 2.2.2 Contrast Manifest (OPTIONAL)

This manifest includes the sample information used to perform differential comparisons.
An example contrast file:
| condition1 | condition2 |
| --- | --- |
| MOC1_siSmyd3_2m_25_HCHO | MOC1_siNC_2m_25_HCHO |

Note: you must have more than one sample per condition in order to perform differential analysis with DESeq2.
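Before launching a run, it can save time to confirm that every FASTQ path in the sample manifest exists. A hedged sketch, assuming a tab-delimited sample manifest with a header row and with path_to_R1/path_to_R2 in the sixth and seventh columns; the manifest path below is hypothetical, so substitute the file defined in your snakemake_config.yaml:

```bash
# Hypothetical manifest location; substitute the sample manifest path from snakemake_config.yaml
manifest=/path/to/output/dir/manifest/samples.tsv

# Report any R1/R2 FASTQ paths that do not exist (assumes tab-delimited columns 6 and 7)
tail -n +2 "$manifest" | awk -F'\t' '{print $6; print $7}' | while read -r fq; do
  [ -e "$fq" ] || echo "MISSING: $fq"
done
```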
"},{"location":"user-guide/run/","title":"3. Running the Pipeline","text":""},{"location":"user-guide/run/#31-pipeline-overview","title":"3.1 Pipeline Overview","text":"The Snakemake workflow has a multiple options:
```
Usage: bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle -m/--runmode=<RUNMODE> -w/--workdir=<WORKDIR>
1. RUNMODE: [Type: String] Valid options:
    *) init     : initialize workdir
    *) run      : run with slurm
    *) reset    : DELETE workdir dir and re-init it
    *) dryrun   : dry run snakemake to generate DAG
    *) unlock   : unlock workdir if locked by snakemake
    *) runlocal : run without submitting to sbatch
    *) runtest  : run on cluster with included test dataset
2. WORKDIR: [Type: String]: Absolute or relative path to the output folder with write permissions.
```

### 3.2 Commands explained

The following explains each of the command options:
To run any of these commands, follow the syntax:
```bash
bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=COMMAND --workdir=/path/to/output/dir
```

### 3.3 Typical Workflow

A typical command workflow, running on the cluster, is as follows:
```bash
bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=init --workdir=/path/to/output/dir

bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=dryrun --workdir=/path/to/output/dir

bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=run --workdir=/path/to/output/dir
```

## 5. Pipeline Tutorial

Welcome to the CARLISLE Pipeline Tutorial!

### 5.1 Getting Started

Review the information on the Getting Started page for a complete overview of the pipeline. The tutorial below will use test data available on NIH Biowulf HPC only. All example code assumes you are running v1.0 of the pipeline, using test data available on GitHub.
A. Change working directory to the CARLISLE repository
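For example, assuming the install path used by the commands elsewhere in this guide (adjust to wherever your copy of the repository lives):

```bash
# Move into the CARLISLE repository (path assumed; substitute your local install location)
cd ./data/CCBR_Pipeliner/Pipelines/CARLISLE
```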
B. Initialize Pipeline
```bash
bash ./path/to/dir/carlisle --runmode=init --workdir=/path/to/output/dir
```

### 5.2 Submit the test data

Test data is included in the .test directory as well as the config directory.
A. Run the test command to prepare the data, perform a dry run, and submit to the cluster:
```bash
bash ./path/to/dir/carlisle --runmode=runtest --workdir=/path/to/output/dir
```
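At this point the working directory should contain the defaults copied during initialization; a quick, optional check (directory names per the Preparing Files section of this guide):

```bash
# Confirm the config and manifest defaults are in place in the working directory
ls /path/to/output/dir/config /path/to/output/dir/manifest
```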
The expected output of `runtest` is as follows:

```
Job stats:
job                                 count    min threads    max threads
--------------------------------  -------  -------------  -------------
DESeq                                  24              1              1
align                                   9             56             56
alignstats                              9              2              2
all                                     1              1              1
bam2bg                                  9              4              4
create_contrast_data_files             24              1              1
create_contrast_peakcaller_files       12              1              1
create_reference                        1             32             32
create_replicate_sample_table           1              1              1
diffbb                                 24              1              1
filter                                 18              2              2
findMotif                              96              6              6
gather_alignstats                       1              1              1
go_enrichment                          12              1              1
gopeaks_broad                          16              2              2
gopeaks_narrow                         16              2              2
macs2_broad                            16              2              2
macs2_narrow                           16              2              2
make_counts_matrix                     24              1              1
multiqc                                 2              1              1
qc_fastqc                               9              1              1
rose                                   96              2              2
seacr_relaxed                          16              2              2
seacr_stringent                        16              2              2
spikein_assessment                      1              1              1
trim                                    9             56             56
total                                 478              1             56
```
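While the test jobs run, progress can be monitored with standard SLURM tooling, and the pipeline status can be re-checked at any time with another dry run:

```bash
# Monitor the submitted jobs (standard SLURM command)
squeue -u "$USER"

# Re-check which steps remain without submitting anything new
bash ./path/to/dir/carlisle --runmode=dryrun --workdir=/path/to/output/dir
```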
Review the expected outputs on the Output page. If there are errors, review and perform the steps described on the Troubleshooting page as needed.

## Troubleshooting

Recommended steps to troubleshoot the pipeline.

### 1.1 Email

Check your email for a message regarding pipeline failure. You will receive an email from slurm@biowulf.nih.gov with the subject: Slurm Job_id=[#] Name=CARLISLE Failed, Run time [time], FAILED, ExitCode 1.

### 1.2 Review the log files

Review the logs in two ways:

1. The master SLURM output file, found in `/path/to/results/dir/` and titled `slurm-[jobid].out`. Reviewing this file will tell you which rule errored and, for any local SLURM jobs, provide error details.
2. The per-rule log files, found in `/path/to/results/dir/logs/`. Each rule will include a `.err` and `.out` file, with the following formatting: `{rulename}.{masterjobID}.{individualruleID}.{wildcards from the rule}.{out or err}`
After addressing the issue, unlock the output directory, perform another dry-run and check the status of the pipeline, then resubmit to the cluster.
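Before resubmitting, the rule-level logs can be searched to pinpoint the failing rule, and the corresponding job's resource usage can be reviewed. A hedged sketch using standard shell and SLURM accounting commands; the paths mirror the log locations described in section 1.2 and `<jobid>` is a placeholder:

```bash
# Find rule logs that mention an error (case-insensitive), then inspect the master SLURM log
grep -ril "error" /path/to/results/dir/logs/ | head
tail -n 50 /path/to/results/dir/slurm-*.out

# Optionally check whether the failed job ran out of memory or time
# (<jobid> is a placeholder for the SLURM job ID reported in the logs/email)
sacct -j <jobid> --format=JobID,JobName%30,AllocCPUS,MaxRSS,Elapsed,State
```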
```bash
# unlock dir
bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=unlock --workdir=/path/to/output/dir

# perform dry-run
bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=dryrun --workdir=/path/to/output/dir

# submit to cluster
bash ./data/CCBR_Pipeliner/Pipelines/CARLISLE/carlisle --runmode=run --workdir=/path/to/output/dir
```

### 1.4 Contact information

If, after troubleshooting, the error cannot be resolved, or if a bug is found, please create an issue and send an email to Samantha Chill.