Fix #1018 #1037

Merged 1 commit on May 31, 2023

3 changes: 2 additions & 1 deletion .github/workflows/ci.yml
@@ -239,7 +239,8 @@ jobs:
  strategy:
  matrix:
  parameters:
- - "--skip_qc --skip_alignment"
+ - "--skip_qc"
+ - "--skip_alignment --skip_pseudo_alignment"
  - "--salmon_index false --transcript_fasta false"
  steps:
  - name: Check out pipeline code
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -8,9 +8,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Enhancements & fixes

- [[#1011](https://github.com/nf-core/rnaseq/issues/1011)] - FastQ files from UMI-tools not being passed to fastp
- [[#1018](https://github.com/nf-core/rnaseq/issues/1018)] - Ability to skip both alignment and pseudo-alignment to only run pre-processing QC steps.
- [PR #1016](https://github.com/nf-core/rnaseq/pull/1016) - Updated pipeline template to [nf-core/tools 2.8](https://github.com/nf-core/tools/releases/tag/2.8)
- [PR #1025](https://github.com/nf-core/fetchngs/pull/1025) - Add `public_aws_ecr.config` to source mulled containers when using `public.ecr.aws` Docker Biocontainer registry

### Parameters

| Old parameter | New parameter |
| ------------- | ------------------------- |
| | `--skip_pseudo_alignment` |

> **NB:** Parameter has been **updated** if both old and new parameter information is present.
> **NB:** Parameter has been **added** if just the new parameter information is present.
> **NB:** Parameter has been **removed** if new parameter information isn't present.

### Software dependencies

| Dependency | Old version | New version |
2 changes: 1 addition & 1 deletion conf/modules.config
@@ -1137,7 +1137,7 @@ if (!params.skip_multiqc) {
// Salmon pseudo-alignment options
//

- if (params.pseudo_aligner == 'salmon') {
+ if (!params.skip_pseudo_alignment && params.pseudo_aligner == 'salmon') {
process {
withName: '.*:QUANTIFY_SALMON:SALMON_QUANT' {
ext.args = params.extra_salmon_quant_args ?: ''
4 changes: 3 additions & 1 deletion docs/usage.md
@@ -71,6 +71,8 @@ When running Salmon in mapping-based mode via `--pseudo_aligner salmon` the enti

Two additional parameters `--extra_star_align_args` and `--extra_salmon_quant_args` were added in v3.10 of the pipeline that allow you to append any custom parameters to the STAR align and Salmon quant commands, respectively. Note, the `--seqBias` and `--gcBias` are not provided to Salmon quant by default so you can provide these via `--extra_salmon_quant_args '--seqBias --gcBias'` if required.

> **NB:** You can use `--skip_alignment --skip_pseudo_alignment` if you only want to run the pre-processing QC steps in the pipeline like FastQ, trimming etc. This will skip alignment, pseudo-alignment and any post-alignment processing steps.

## Quantification options

The current options align with STAR and quantify using either Salmon (`--aligner star_salmon`) / RSEM (`--aligner star_rsem`). You also have the option to pseudo-align and quantify your data with Salmon by providing the `--pseudo_aligner salmon` parameter.
@@ -133,7 +135,7 @@ If unique molecular identifiers were used to prepare the library, add the follow

Please refer to the [nf-core website](https://nf-co.re/usage/reference_genomes) for general usage docs and guidelines regarding reference genomes.

- The minimum reference genome requirements for this pipeline are a FASTA and GTF file, all other files required to run the pipeline can be generated from these files. However, it is more storage and compute friendly if you are able to re-use reference genome files as efficiently as possible. It is recommended to use the `--save_reference` parameter if you are using the pipeline to build new indices (e.g. custom genomes that are unavailable on [AWS iGenomes](https://nf-co.re/usage/reference_genomes#custom-genomes)) so that you can save them somewhere locally. The index building step can be quite a time-consuming process and it permits their reuse for future runs of the pipeline to save disk space. You can then either provide the appropriate reference genome files on the command-line via the appropriate parameters (e.g. `--star_index '/path/to/STAR/index/'`) or via a custom config file.
+ The minimum reference genome requirements for this pipeline are a FASTA and GTF file, all other files required to run the pipeline can be generated from these files. However, it is more storage and compute friendly if you are able to re-use reference genome files as efficiently as possible. It is recommended to use the `--save_reference` parameter if you are using the pipeline to build new indices (e.g. custom genomes that are unavailable on [AWS iGenomes](https://nf-co.re/usage/reference_genomes#custom-genomes)) so that you can save them somewhere locally. The index building step can be quite a time-consuming process and it permits their reuse for future runs of the pipeline to save disk space. You can then either provide the appropriate reference genome files on the command-line via the appropriate parameters (e.g. `--star_index '/path/to/STAR/index/'`) or via a custom config file. Another option is to run the pipeline once with `--save_reference --skip_alignment --skip_pseudo_alignment` to generate and save all of the required reference files and indices to the results directory. You can then move the reference files in `<RESULTS_DIR>/genome/` to a more permanent location and use these paths to override the relevant parameters in the pipeline e.g. `--star_index`.

- If `--genome` is provided then the FASTA and GTF files (and existing indices) will be automatically obtained from AWS-iGenomes unless these have already been downloaded locally in the path specified by `--igenomes_base`.
- If `--gff` is provided as input then this will be converted to a GTF file, or the latter will be used if both are provided.
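
As a sketch of the two-step reuse workflow that the added paragraph describes, the fragments below express the same flags as Nextflow config `params` blocks instead of command-line options. The file names `build_refs.config` and `reuse_refs.config` are hypothetical, the STAR index path is the placeholder from the text above, and only parameters named in this diff are used. Each file would be passed to its own run with Nextflow's `-c` option.

```groovy
// build_refs.config (hypothetical file name)
// First run: save the generated reference files and skip both quantification routes.
params {
    save_reference        = true
    skip_alignment        = true
    skip_pseudo_alignment = true
}
```

```groovy
// reuse_refs.config (hypothetical file name)
// Later runs: point at the index after moving it out of <RESULTS_DIR>/genome/.
params {
    star_index = '/path/to/STAR/index/'
}
```

Keeping the two blocks in separate files matters here: merged into one config, the skip flags would also apply to the later runs.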
5 changes: 1 addition & 4 deletions lib/WorkflowRnaseq.groovy
@@ -65,13 +65,10 @@ class WorkflowRnaseq {
Nextflow.error("Invalid option: '${params.aligner}'. Valid options for '--aligner': ${valid_params['aligners'].join(', ')}.")
}
} else {
- if (!params.pseudo_aligner) {
- Nextflow.error("--skip_alignment specified without --pseudo_aligner...please specify e.g. --pseudo_aligner ${valid_params['pseudoaligners'][0]}.")
- }
skipAlignmentWarn(log)
}

- if (params.pseudo_aligner) {
+ if (!params.skip_pseudo_alignment) {
MatthiasZepper marked this conversation as resolved.
if (!valid_params['pseudoaligners'].contains(params.pseudo_aligner)) {
Nextflow.error("Invalid option: '${params.pseudo_aligner}'. Valid options for '--pseudo_aligner': ${valid_params['pseudoaligners'].join(', ')}.")
} else {
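
The effect of this hunk can be tried outside the pipeline. The following is a minimal sketch in plain Groovy, not the code in `WorkflowRnaseq.groovy`: `checkAlignerParams` is a hypothetical helper that collects messages in a list instead of calling `Nextflow.error`, the option lists contain only values that appear in this diff, and the pseudo-aligner guard assumes the explicit form proposed in the review thread on `workflows/rnaseq.nf`.

```groovy
// Hypothetical stand-alone version of the aligner checks (not the pipeline's code).
// Messages are collected in a list instead of being passed to Nextflow.error.
List<String> checkAlignerParams(Map params, Map validParams) {
    List<String> problems = []

    // --aligner is only validated when alignment is going to run.
    if (!params.skip_alignment && !validParams.aligners.contains(params.aligner)) {
        problems << "Invalid option: '${params.aligner}'. Valid options for '--aligner': ${validParams.aligners.join(', ')}.".toString()
    }

    // Assumption: --pseudo_aligner is only validated when it was set and not skipped.
    if (params.pseudo_aligner && !params.skip_pseudo_alignment &&
            !validParams.pseudoaligners.contains(params.pseudo_aligner)) {
        problems << "Invalid option: '${params.pseudo_aligner}'. Valid options for '--pseudo_aligner': ${validParams.pseudoaligners.join(', ')}.".toString()
    }
    return problems
}

// Illustrative option lists, limited to values that appear in this diff.
def valid = [aligners: ['star_salmon', 'star_rsem'], pseudoaligners: ['salmon']]

// Pre-processing only: both skip flags set and no pseudo-aligner, which this PR now permits.
def preprocessingOnly = [skip_alignment: true, skip_pseudo_alignment: true,
                         aligner: 'star_salmon', pseudo_aligner: null]
assert checkAlignerParams(preprocessingOnly, valid).isEmpty()

// A misspelt pseudo-aligner is reported while pseudo-alignment is enabled...
def typo = [skip_alignment: true, skip_pseudo_alignment: false, pseudo_aligner: 'salmno']
assert checkAlignerParams(typo, valid).size() == 1

// ...and ignored once --skip_pseudo_alignment is set.
assert checkAlignerParams(typo + [skip_pseudo_alignment: true], valid).isEmpty()
```

Returning messages instead of raising an error is only a convenience for running the sketch as a script; the pipeline itself stops at the first invalid option.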
1 change: 1 addition & 0 deletions nextflow.config
@@ -71,6 +71,7 @@ params {
save_align_intermeds = false
skip_markduplicates = false
skip_alignment = false
skip_pseudo_alignment = false

// QC
skip_qc = false
5 changes: 5 additions & 0 deletions nextflow_schema.json
@@ -446,6 +446,11 @@
"type": "boolean",
"fa_icon": "fas fa-fast-forward",
"description": "Skip all of the alignment-based processes within the pipeline."
},
"skip_pseudo_alignment": {
"type": "boolean",
"fa_icon": "fas fa-fast-forward",
"description": "Skip all of the pseudo-alignment-based processes within the pipeline."
}
}
},
8 changes: 4 additions & 4 deletions workflows/rnaseq.nf
@@ -43,9 +43,9 @@ if (!params.skip_bbsplit && !params.bbsplit_index && params.bbsplit_fasta_list)

// Check alignment parameters
def prepareToolIndices = []
- if (!params.skip_bbsplit) { prepareToolIndices << 'bbsplit' }
- if (!params.skip_alignment) { prepareToolIndices << params.aligner }
- if (params.pseudo_aligner) { prepareToolIndices << params.pseudo_aligner }
+ if (!params.skip_bbsplit) { prepareToolIndices << 'bbsplit' }
+ if (!params.skip_alignment) { prepareToolIndices << params.aligner }
+ if (!params.skip_pseudo_alignment) { prepareToolIndices << params.pseudo_aligner }
Member

I think you should have put `params.pseudo_aligner && !params.skip_pseudo_alignment` here, because `params.pseudo_aligner` is `null` by default?

So it will add `null` to `prepareToolIndices` most of the time. This may or may not cause issues, depending on how `prepareToolIndices` is used later, but I think it is preferable to avoid that in the first place?

Member Author

We are also doing parameter validation for `--pseudo_aligner` and `--aligner` via the Nextflow schema, which should fail before hitting this logic; that is why I hadn't included it. But we can be explicit 👍🏽

Member Author

Done 993fff6

Member

> We are also doing parameter validation for `--pseudo_aligner` and `--aligner` via the Nextflow schema, which should fail before hitting this logic; that is why I hadn't included it. But we can be explicit 👍🏽

Admittedly, those peculiarities escaped me. I just assumed that `!params.skip_pseudo_alignment` would also evaluate to true when no pseudo-alignment is performed at all (because the user performs a regular run with alignment).
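
The behaviour discussed in this thread can be reproduced in a few lines of plain Groovy. This is a sketch under stated assumptions, not pipeline code: `toolIndices` is a hypothetical helper mirroring the three `prepareToolIndices` lines, `defaults` is an ordinary map standing in for Nextflow's `params`, and the default values are assumed from the reviewer's statement that `pseudo_aligner` is `null` unless set.

```groovy
// Hypothetical stand-in for the three prepareToolIndices lines in workflows/rnaseq.nf.
// `explicit` selects the guard suggested in the review; otherwise the guard is as shown in the diff.
List toolIndices(Map params, boolean explicit) {
    def indices = []
    if (!params.skip_bbsplit)   { indices << 'bbsplit' }
    if (!params.skip_alignment) { indices << params.aligner }

    boolean runPseudo = !params.skip_pseudo_alignment
    if (explicit) {
        // Groovy truth: a null pseudo_aligner counts as false.
        runPseudo = runPseudo && params.pseudo_aligner
    }
    if (runPseudo) { indices << params.pseudo_aligner }
    return indices
}

// Assumed defaults for a regular alignment run with no pseudo-aligner requested.
def defaults = [
    skip_bbsplit         : true,
    skip_alignment       : false,
    skip_pseudo_alignment: false,
    aligner              : 'star_salmon',
    pseudo_aligner       : null
]

// With only the skip flag checked, null is appended to the list.
assert toolIndices(defaults, false) == ['star_salmon', null]

// With the explicit guard, the list holds only tools that were requested.
assert toolIndices(defaults, true) == ['star_salmon']

// Once a pseudo-aligner is set, both guards agree.
def withSalmon = defaults + [pseudo_aligner: 'salmon']
assert toolIndices(withSalmon, false) == toolIndices(withSalmon, true)
```

Whether the stray `null` causes a problem depends on how the list is consumed later, which this page does not show; the explicit guard simply removes the question.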


// Get RSeqC modules to run
def rseqc_modules = params.rseqc_modules ? params.rseqc_modules.split(',').collect{ it.trim().toLowerCase() } : []
@@ -799,7 +799,7 @@ workflow RNASEQ {
ch_salmon_multiqc = Channel.empty()
ch_pseudoaligner_pca_multiqc = Channel.empty()
ch_pseudoaligner_clustering_multiqc = Channel.empty()
- if (params.pseudo_aligner == 'salmon') {
+ if (!params.skip_pseudo_alignment && params.pseudo_aligner == 'salmon') {
QUANTIFY_SALMON (
ch_filtered_reads,
PREPARE_GENOME.out.salmon_index,