Merge pull request #290 from JoseEspinosa/morefixes
Change chromap arguments and more fixes
JoseEspinosa authored Aug 9, 2022
2 parents 9582d07 + b87af07 commit 9e39ea2
Showing 9 changed files with 174 additions and 72 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -76,7 +76,7 @@ jobs:
matrix:
aligner:
- "bowtie2"
# - "chromap"
- "chromap"
- "star"
steps:
- name: Check out pipeline code
11 changes: 8 additions & 3 deletions CHANGELOG.md
@@ -3,10 +3,12 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unpublished Version / DEV]
## [1.2.2] - 2022-08-22

### Enhancements & fixes

- Pipeline has been re-implemented in [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html)
- All software containers are now exclusively obtained from [Biocontainers](https://biocontainers.pro/#/registry)
- Updated pipeline template to [nf-core/tools 2.4.1](https://github.com/nf-core/tools/releases/tag/2.4.1)
- [[#128](https://github.com/nf-core/chipseq/issues/128)] - Filter files with no peaks to avoid errors in downstream processes
- [[#220](https://github.com/nf-core/chipseq/issues/220)] - Fix `phantompeakqualtools` protection stack overflow error
@@ -20,9 +22,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[228](https://github.com/nf-core/chipseq/issues/228)] - Update blacklist bed files.
- [nf-core/tools#1415](https://github.com/nf-core/tools/issues/1415) - Make `--outdir` a mandatory parameter
- [[282](https://github.com/nf-core/chipseq/issues/282)] - Fix `genome.fa` publication for IGV.
- [[280](https://github.com/nf-core/chipseq/issues/280)] - Update `macs_gsize` in `igenomes.config`, create a new `--read_length` parameter and implement the logic to calculate `--macs_gsize` when the parameter is missing.
- Eliminate `if`s conditions from `deseq2_qc` and `macs2_consensus` {local module and use `ext.when` instead.
- [[280](https://github.com/nf-core/chipseq/issues/280)] - Update `macs_gsize` in `igenomes.config`, create a new `--read_length` parameter and implement the logic to calculate `--macs_gsize` when the parameter is missing
- Eliminate `if`s conditions from `deseq2_qc` and `macs2_consensus` (local module and use `ext.when` instead)
- Remove `deseq2` differential binding analysis of consensus peaks.
- Filter paired-end files produced by `chromap` due to [this](https://github.com/nf-core/chipseq/issues/291) issue
- Remove <ANTIBODY> from the macs2 consensus publish directory since it can not be referred as input from the IGV process (meta.id not resolved at execution time)
- Add bytesize link to readme.

### Parameters

24 changes: 16 additions & 8 deletions README.md
@@ -23,18 +23,28 @@ On release, automated continuous integration tests run the pipeline on a [full-s

The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from [nf-core/modules](https://github.com/nf-core/modules) in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

## Online videos

A short talk about the history, current status and functionality on offer in this pipeline was given by Jose Espinosa-Carrasco ([@joseespinosa](https://github.com/joseespinosa)) on [26th July 2022](https://nf-co.re/events/2022/bytesize-chipseq) as part of the nf-core/bytesize series.

You can find numerous talks on the [nf-core events page](https://nf-co.re/events) from various topics including writing pipelines/modules in Nextflow DSL2, using nf-core tooling, running nf-core pipelines as well as more generic content like contributing to Github. Please check them out!

## Pipeline summary

1. Raw read QC ([`FastQC`](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/))
2. Adapter trimming ([`Trim Galore!`](https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/))
3. Alignment ([`BWA`](https://sourceforge.net/projects/bio-bwa/files/))
3. Choice of multiple aligners
1.([`BWA`](https://sourceforge.net/projects/bio-bwa/files/))
2.([`Chromap`](https://github.com/haowenz/chromap)). **For paired-end reads only working until mapping steps, see [here](https://github.com/nf-core/chipseq/issues/291)**
3.([`Bowtie2`](http://bowtie-bio.sourceforge.net/bowtie2/index.shtml))
4.([`STAR`](https://github.com/alexdobin/STAR))
4. Mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
5. Merge alignments from multiple libraries of the same sample ([`picard`](https://broadinstitute.github.io/picard/))
1. Re-mark duplicates ([`picard`](https://broadinstitute.github.io/picard/))
2. Filtering to remove:
- reads mapping to blacklisted regions ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/), [`BEDTools`](https://github.com/arq5x/bedtools2/))
- reads that are marked as duplicates ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that arent marked as primary alignments ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that are not marked as primary alignments ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that are unmapped ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads that map to multiple locations ([`SAMtools`](https://sourceforge.net/projects/samtools/files/samtools/))
- reads containing > 4 mismatches ([`BAMTools`](https://github.com/pezmaster31/bamtools))
@@ -47,11 +57,11 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
5. Generate gene-body meta-profile from bigWig files ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotProfile.html))
6. Calculate genome-wide IP enrichment relative to control ([`deepTools`](https://deeptools.readthedocs.io/en/develop/content/tools/plotFingerprint.html))
7. Calculate strand cross-correlation peak and ChIP-seq quality measures including NSC and RSC ([`phantompeakqualtools`](https://github.com/kundajelab/phantompeakqualtools))
8. Call broad/narrow peaks ([`MACS2`](https://github.com/taoliu/MACS))
8. Call broad/narrow peaks ([`MACS2`](https://github.com/macs3-project/MACS))
9. Annotate peaks relative to gene features ([`HOMER`](http://homer.ucsd.edu/homer/download.html))
10. Create consensus peakset across all samples and create tabular file to aid in the filtering of the data ([`BEDTools`](https://github.com/arq5x/bedtools2/))
11. Count reads in consensus peaks ([`featureCounts`](http://bioinf.wehi.edu.au/featureCounts/))
12. Differential binding analysis, PCA and clustering ([`R`](https://www.r-project.org/), [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html))
12. PCA and clustering ([`R`](https://www.r-project.org/), [`DESeq2`](https://bioconductor.org/packages/release/bioc/html/DESeq2.html))
6. Create IGV session file containing bigWig tracks, peaks and differential sites for data visualisation ([`IGV`](https://software.broadinstitute.org/software/igv/)).
7. Present QC for raw read, alignment, peak-calling and differential binding results ([`MultiQC`](http://multiqc.info/), [`R`](https://www.r-project.org/))

@@ -63,7 +73,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

3. Download the pipeline and test it on a minimal dataset with a single command:

```console
```bash
nextflow run nf-core/chipseq -profile test,YOURPROFILE --outdir <OUTDIR>
```

@@ -76,9 +86,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

4. Start running your own analysis!

<!-- TODO nf-core: Update the example "typical command" below used to run the pipeline -->

```console
```bash
nextflow run nf-core/chipseq --input samplesheet.csv --outdir <OUTDIR> --genome GRCh37 -profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```

75 changes: 46 additions & 29 deletions conf/modules.config
@@ -54,9 +54,15 @@ process {

withName: 'UNTAR_.*' {
ext.args2 = '--no-same-owner'
publishDir = [
path: { "${params.outdir}/genome/index" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
enabled: params.save_reference
]
}

withName: 'UNTAR_.*|BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE' {
withName: 'BWA_INDEX|BOWTIE2_BUILD|STAR_GENOMEGENERATE' {
publishDir = [
path: { "${params.outdir}/genome/index" },
mode: params.publish_dir_mode,
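
Note on the `saveAs` closure that recurs in every `publishDir` block of this diff: it maps `versions.yml` to `null` and leaves every other file name unchanged. In Nextflow, a `saveAs` closure that returns `null` causes that file not to be published. The following is a minimal plain-Groovy sketch of that filtering, run outside Nextflow; the file names and the `produced`/`published` variables are illustrative assumptions, not taken from the pipeline.

```groovy
// Stand-alone copy of the closure used in the publishDir blocks of this diff.
def saveAs = { filename -> filename.equals('versions.yml') ? null : filename }

// Hypothetical outputs of an index-building process (names are made up).
def produced = ['versions.yml', 'genome.fa.bwt', 'genome.fa.sa']

// Apply the closure to each name and drop the null results,
// which mimics "a null return means the file is not published".
def published = produced.collect { saveAs(it) }.findAll { it != null }

println published
```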
@@ -253,7 +259,7 @@ if (params.aligner == 'chromap') {
]
}
withName: CHROMAP_CHROMAP {
ext.args = '--preset chip --SAM'
ext.args = '-l 2000 --low-mem --SAM'
ext.prefix = { "${meta.id}.Lb" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/library" },
@@ -466,7 +472,7 @@ if (!params.skip_plot_profile) {
ext.args = 'scale-regions --regionBodyLength 1000 --beforeRegionStartLength 3000 --afterRegionStartLength 3000 --skipZeros --smartLabels'
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -475,7 +481,7 @@ if (!params.skip_plot_profile) {
withName: 'DEEPTOOLS_PLOTPROFILE' {
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -484,7 +490,7 @@ if (!params.skip_plot_profile) {
withName: 'DEEPTOOLS_PLOTHEATMAP' {
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotProfile" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotProfile" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -508,7 +514,7 @@ if (!params.skip_plot_fingerprint) {
].join(' ').trim() }
ext.prefix = { "${meta.id}.mLb.clN" }
publishDir = [
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deeptools/plotFingerprint" },
path: { "${params.outdir}/${params.aligner}/mergedLibrary/deepTools/plotFingerprint" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -520,16 +526,17 @@ process {
withName: 'MACS2_CALLPEAK' {
ext.args = [
'--keep-dup all',
params.narrow_peak ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
params.save_macs_pileup ? '--bdg --SPMR' : '',
params.macs_fdr ? "--qvalue ${params.macs_fdr}" : '',
params.macs_pvalue ? "--pvalue ${params.macs_pvalue}" : ''
params.narrow_peak ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
params.save_macs_pileup ? '--bdg --SPMR' : '',
params.macs_fdr ? "--qvalue ${params.macs_fdr}" : '',
params.macs_pvalue ? "--pvalue ${params.macs_pvalue}" : '',
params.aligner == "chromap" ? "--format BAM" : ''
].join(' ').trim()
publishDir = [
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
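
Note on the `MACS2_CALLPEAK` hunk above: `ext.args` is assembled from a list of optional fragments, and this commit appends `--format BAM` when the aligner is `chromap`. The following plain-Groovy sketch shows how such a list collapses into one argument string. The `params` map is a hypothetical stand-in for the pipeline parameters and its values are made up for illustration.

```groovy
// Hypothetical parameter values; in the pipeline these come from Nextflow's `params`.
def params = [
    narrow_peak     : true,
    broad_cutoff    : 0.1,
    save_macs_pileup: false,
    macs_fdr        : 0.05,
    macs_pvalue     : null,
    aligner         : 'chromap'
]

// Same construction as the new side of the hunk: each ternary yields
// either an option string or an empty string.
def macsArgs = [
    '--keep-dup all',
    params.narrow_peak        ? '' : "--broad --broad-cutoff ${params.broad_cutoff}",
    params.save_macs_pileup   ? '--bdg --SPMR' : '',
    params.macs_fdr           ? "--qvalue ${params.macs_fdr}" : '',
    params.macs_pvalue        ? "--pvalue ${params.macs_pvalue}" : '',
    params.aligner == 'chromap' ? '--format BAM' : ''
].join(' ').trim()

// Empty fragments leave repeated spaces inside the string; splitting on
// whitespace shows the effective argument list.
println macsArgs.tokenize()
```

`trim()` only strips leading and trailing whitespace, so the interior of the joined string can contain runs of spaces where fragments were empty. A shell splits arguments on whitespace, so this should not change the resulting command line.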
@@ -542,7 +549,7 @@ process {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/qc'
].join('') },
].join('') },
enabled: false
]
}
@@ -569,11 +576,25 @@ if (!params.skip_peak_annotation) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: 'ANNOTATE_BOOLEAN_PEAKS' {
ext.prefix = { "${meta.id}_peaks" }
publishDir = [
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]

}
}

if (!params.skip_peak_qc) {
@@ -598,7 +619,7 @@ if (!params.skip_peak_annotation) {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/qc'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -616,9 +637,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -630,9 +650,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -648,9 +667,8 @@ if (!params.skip_consensus_peaks) {
path: { [
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}"
].join('') },
'/consensus'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -673,9 +691,8 @@ if (!params.skip_consensus_peaks) {
"${params.outdir}/${params.aligner}/mergedLibrary/macs2",
params.narrow_peak? '/narrowPeak' : '/broadPeak',
'/consensus',
"/${meta.id}",
'/deseq2'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
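
Note on the consensus hunks above: each drops the `"/${meta.id}"` segment, so consensus outputs are published directly under `.../consensus` (and `.../consensus/deseq2`) instead of a per-antibody subdirectory. The following plain-Groovy sketch shows the path the new list produces. The `consensusDir` closure and the parameter values are hypothetical helpers for illustration, not part of the pipeline.

```groovy
// Hypothetical helper reproducing the new path construction from the diff.
def consensusDir = { Map params ->
    [
        "${params.outdir}/${params.aligner}/mergedLibrary/macs2",
        params.narrow_peak ? '/narrowPeak' : '/broadPeak',
        '/consensus'
    ].join('')
}

// Made-up parameter values for a broad-peak run aligned with BWA.
def broadPath  = consensusDir([outdir: 'results', aligner: 'bwa', narrow_peak: false])
def narrowPath = consensusDir([outdir: 'results', aligner: 'bwa', narrow_peak: true])

println broadPath
println narrowPath
```

Because the segments are joined with an empty separator, each fragment after the first has to carry its own leading `/`, as in the diff.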
@@ -691,7 +708,7 @@ if (!params.skip_igv) {
path: { [
"${params.outdir}/igv",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
@@ -705,9 +722,9 @@ if (!params.skip_multiqc) {
ext.args = params.multiqc_title ? "--title \"$params.multiqc_title\"" : ''
publishDir = [
path: { [
"${params.outdir}/multiqc",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
"${params.outdir}/multiqc",
params.narrow_peak? '/narrowPeak' : '/broadPeak'
].join('') },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]