Add gprofiler #199

Merged
merged 54 commits into from
Dec 21, 2023
Changes from 18 commits
8f537bf
Added round param for all modules (currently active for proteus and g…
WackerO Oct 31, 2023
5c203cc
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Oct 31, 2023
1b891c5
Param rename, reordered Citations, changed gprofiler2 report heading,…
WackerO Nov 7, 2023
dbb7355
Added docs entry for gprofiler
WackerO Nov 7, 2023
a0797b7
Corrected some entries in the schema
WackerO Nov 8, 2023
0af6ce1
integrated module changes
WackerO Nov 13, 2023
f7e2f7e
Newly installed gprofiler2 module, enabled it in full test
WackerO Nov 16, 2023
454c23c
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Nov 16, 2023
c7b93fc
added gprofiler to git
WackerO Nov 16, 2023
ac07d6e
prettier
WackerO Nov 16, 2023
52a5d08
Added gpro to test.config
WackerO Nov 16, 2023
d72f277
changing the workflow to make gprofiler run multiple times if several…
WackerO Nov 17, 2023
c289494
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Nov 21, 2023
a0f7ab1
GOST module update
WackerO Nov 23, 2023
786ec25
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Nov 27, 2023
f1205c5
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Nov 27, 2023
0df5a89
Removed write statements, updated metro map
WackerO Nov 27, 2023
68695b7
Corrected metro maps, uncommented error case in workflow, fixed spell…
WackerO Nov 27, 2023
ffbae15
Renamed round_digits to report_round_digits, excluded gsea from defau…
WackerO Nov 30, 2023
57f83b8
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Nov 30, 2023
ced9196
fixed schema, added icons to proteus and gprofiler2
WackerO Nov 30, 2023
04c9084
Combined gprofiler with gsea section in report results; added short g…
WackerO Dec 5, 2023
1685196
Merge branch 'dev' of https://github.com/nf-core/differentialabundanc…
WackerO Dec 5, 2023
ddeed39
Changed param name in two configs
WackerO Dec 5, 2023
5250c22
Fixed some old params
WackerO Dec 5, 2023
8519c11
added difftable filtering module, added short docu section for gsea, …
WackerO Dec 8, 2023
a1fcc1c
Param renames, adapted workflow to do the correct checks; fixed a bug…
WackerO Dec 8, 2023
a9a3558
Added gprofiler organism, updated metro maps
WackerO Dec 8, 2023
0fc06ec
Uploaded slimmer SVG, fixed mistake in ifelse
WackerO Dec 8, 2023
f08dae5
Updated gprofiler
WackerO Dec 8, 2023
24e7497
Reordered report param
WackerO Dec 8, 2023
806580c
removed empty line
WackerO Dec 8, 2023
c7c55ca
adapted gprofiler2 report section to gsea, changed order of docu, add…
WackerO Dec 13, 2023
513e00b
Removed prints
WackerO Dec 13, 2023
4989dc7
Renamed gene_sets param, updated gprofiler
WackerO Dec 13, 2023
afe5f5f
Added final newline
WackerO Dec 13, 2023
f971092
Removed redundant code lines
WackerO Dec 14, 2023
299d5ab
Update modules/local/filter_difftable.nf
WackerO Dec 14, 2023
7d5d6a0
Merge branch 'add_gpro' of https://github.com/WackerO/differentialabu…
WackerO Dec 14, 2023
57bb2c5
Fixed filter module, added gprofiler background docu
WackerO Dec 14, 2023
8bae50e
Removed token from full test
WackerO Dec 14, 2023
76a628a
Updated gprofiler docu
WackerO Dec 14, 2023
f026a95
Updated output docs
WackerO Dec 15, 2023
05e98ba
Added param for selectively running gprofiler2 without GMT
WackerO Dec 18, 2023
a89bc93
Streamline gene set conditionals, don't overwrite global gene sets ch…
pinin4fjords Dec 18, 2023
b2af649
removed gprofiler mode
WackerO Dec 19, 2023
5601543
Merge branch 'add_gpro' of https://github.com/WackerO/differentialabu…
WackerO Dec 19, 2023
e1cc292
Update workflows/differentialabundance.nf
WackerO Dec 20, 2023
fe404cc
Update docs/usage.md
WackerO Dec 20, 2023
d4b6370
Added .first() to gene sets for gprofiler, made error message more sp…
WackerO Dec 20, 2023
2273996
prettier
WackerO Dec 20, 2023
4ec05c1
lint
WackerO Dec 20, 2023
18f5f44
Linting
WackerO Dec 20, 2023
68d203f
prettier
WackerO Dec 20, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Added`

- [[#203](https://github.com/nf-core/differentialabundance/pull/203)] - Transcript lengths for DESeq2 ([@pinin4fjords](https://github.com/pinin4fjords), review by [@maxulysse](https://github.com/maxulysse))
- [[#199](https://github.com/nf-core/differentialabundance/pull/199)] - Add gprofiler2 module ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#193](https://github.com/nf-core/differentialabundance/pull/193)] - Add DESeq2 text to report ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#192](https://github.com/nf-core/differentialabundance/pull/192)] - Add scree plot in report ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
- [[#189](https://github.com/nf-core/differentialabundance/pull/189)] - Add DE models to report ([@WackerO](https://github.com/WackerO), review by [@pinin4fjords](https://github.com/pinin4fjords))
8 changes: 6 additions & 2 deletions CITATIONS.md
@@ -24,13 +24,17 @@

> Love MI, Huber W, Anders S (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15(12):550. PubMed PMID: 25516281; PubMed Central PMCID: PMC4302049.

- [GEOQuery](https://pubmed.ncbi.nlm.nih.gov/17496320/)

> Davis S, Meltzer PS. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics. 2007;23(14):1846-1847.

- [ggplot2](https://cran.r-project.org/web/packages/ggplot2/index.html)

> H. Wickham (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York.

- [gprofiler2](https://cran.r-project.org/web/packages/gprofiler2/index.html)

> Kolberg L, Raudvere U, Kuzmin I, Vilo J, Peterson H (2020). “gprofiler2 – an R package for gene list functional enrichment analysis and namespace conversion toolset g:Profiler.” F1000Research, 9 (ELIXIR)(709). R package version 0.2.2.

- [Limma](https://pubmed.ncbi.nlm.nih.gov/25605792/)

87 changes: 86 additions & 1 deletion assets/differentialabundance_report.Rmd
@@ -47,7 +47,6 @@ params:
proteus_plotsd_method: NULL
proteus_plotmv_loess: NULL
proteus_palette_name: NULL
- proteus_round_digits: NULL
affy_cel_files_archive: NULL
affy_file_name_col: NULL
affy_background: NULL
@@ -137,6 +136,22 @@ params:
gsea_zip_report: NULL
gsea_chip_file: NULL
gsea_gene_sets: NULL
gprofiler2_run: false
gprofiler2_organism: NULL
gprofiler2_significant: NULL
gprofiler2_measure_underrepresentation: NULL
gprofiler2_correction_method: NULL
gprofiler2_sources: NULL
gprofiler2_evcodes: NULL
gprofiler2_max_qval: NULL
gprofiler2_gmt_file: NULL
gprofiler2_gost_token: NULL
gprofiler2_background_file: NULL
gprofiler2_background_column: NULL
gprofiler2_domain_scope: NULL
gprofiler2_min_diff: NULL
gprofiler2_palette_name: NULL
round_digits: NULL
---

<!-- Load libraries -->
@@ -149,6 +164,34 @@ library(plotly)
library(DT)
```

<!-- Define some functions -->

```{r, include=FALSE}
round_dataframe_columns <- function(df, columns = NULL, digits = -1) {
    if (digits == -1) {
        return(df) # if -1, return df without rounding
    }

    df <- data.frame(df, check.names = FALSE) # make data.frame from vector as otherwise, the format will get messed up
    if (is.null(columns)) {
        columns <- colnames(df)[(unlist(lapply(df, is.numeric), use.names=F))] # extract only numeric columns for rounding
    }

    df[,columns] <- round(
        data.frame(df[, columns], check.names = FALSE),
        digits = digits
    )

    # Convert columns back to numeric

    for (c in columns) {
        df[[c]][grep("^ *NA$", df[[c]])] <- NA
        df[[c]] <- as.numeric(df[[c]])
    }
    df
}
```
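
As a rough illustration of how the `round_dataframe_columns` helper added above behaves, the sketch below repeats the function so it runs on its own and applies it to a small table. The table `toy` and its values are invented for this example and are not taken from the pull request; only the function body comes from the diff.

```r
# Copy of the helper from the diff, repeated so this sketch is self-contained
round_dataframe_columns <- function(df, columns = NULL, digits = -1) {
    if (digits == -1) {
        return(df) # -1 means "do not round"
    }
    df <- data.frame(df, check.names = FALSE)
    if (is.null(columns)) {
        # Default to all numeric columns
        columns <- colnames(df)[unlist(lapply(df, is.numeric), use.names = FALSE)]
    }
    df[, columns] <- round(
        data.frame(df[, columns], check.names = FALSE),
        digits = digits
    )
    for (c in columns) {
        df[[c]][grep("^ *NA$", df[[c]])] <- NA
        df[[c]] <- as.numeric(df[[c]])
    }
    df
}

# Hypothetical enrichment table (values invented for illustration)
toy <- data.frame(
    "Pathway name" = c("Pathway A", "Pathway B"),
    "Differential fraction" = c(0.123456, 0.5),
    "Adjusted p value" = c(0.0123456, 0.00049),
    check.names = FALSE
)

# Only the numeric columns are rounded; the name column is left alone
rounded <- round_dataframe_columns(toy, digits = 3)

# With digits = -1 the input is returned untouched
unchanged <- round_dataframe_columns(toy, digits = -1)

print(rounded)
```

Note that very small adjusted p-values round to 0 at a low digit count, which is worth keeping in mind when choosing the `round_digits` parameter for the report.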

```{r include = FALSE}
# Load the datatables js
datatable(NULL)
@@ -871,6 +914,42 @@ if (any(unlist(params[paste0(possible_gene_set_methods, '_run')]))){
}
```

```{r, echo=FALSE, results='asis', eval=params$gprofiler2_run}
# Collect all gprofiler2 outputs from the report input directory
enrichment_files <- grep("gprofiler2", list.files(params$input_dir), value=T, fixed=T)
cat(paste0("\n### Pathway enrichment analysis {.tabset}"))
if (length(grep(".html", enrichment_files, fixed=T))) {
    cat(paste0("\nThis section contains the results of the pathway analysis, which was done with the R package gprofiler2. The plots show the -log10 adjusted p-values of each pathway that was found to be enriched for differential genes (determined, where necessary, by converting feature IDs to gene IDs); where possible, pathways are grouped by their source database. The tables below give additional information for each pathway; the differential fraction is the number of differential genes in a pathway divided by that pathway's size, i.e. the number of genes annotated for the pathway.",
    ifelse(params$gprofiler2_significant, paste0(" Enrichment was only considered if significant, i.e. adjusted p-value <= ", params$gprofiler2_max_qval, "."), " Enrichment was also considered if not significant."), "\n"))

    for (html in rev(grep(".html", enrichment_files, value=T, fixed=T))) {
        # Derive the contrast name from the plot file name
        contrast <- unlist(strsplit(html, "gprofiler2.", fixed=T))[2]
        contrast <- unlist(strsplit(contrast, ".gostplot", fixed=T))[1]
        if (! file.exists(file.path(params$input_dir, html))){
            stop(paste("gprofiler2 gost plot", file.path(params$input_dir, html), "does not exist"))
        }

        cat(paste0("\n#### ", contrast, "\n"))
        cat(paste0('<embed type="text/html" src="', file.path(params$input_dir, html), '" width="800" height="400">'))

        table <- paste0("gprofiler2.", contrast, ".all_enriched_pathways.tsv")

        if (! file.exists(file.path(params$input_dir, table))){
            stop(paste("gprofiler2 enrichment table", file.path(params$input_dir, table), "does not exist"))
        }
        all_enriched <- read.table(file.path(params$input_dir, table), header=T, sep="\t", quote="\"")
        all_enriched <- data.frame("Pathway name" = all_enriched$term_name, "Pathway code" = all_enriched$term_id,
                                    "Differential features" = all_enriched$intersection_size, "Pathway size" = all_enriched$term_size,
                                    "Differential fraction" = (all_enriched$intersection_size/all_enriched$term_size),
                                    "Adjusted p value" = all_enriched$p_value, check.names = FALSE)
        all_enriched <- round_dataframe_columns(all_enriched, digits=params$round_digits)
        print(htmltools::tagList(datatable(all_enriched, caption = paste0('Enriched pathways in ', contrast, ' (check ', table, ' for more detail)'), rownames = FALSE)))
        cat("\n")
    }
} else {
    cat(paste0("\nPathway analysis was done with the R package gprofiler2. No enriched pathways were found."))
}
```
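
The report chunk above recovers the contrast name from each plot file name and then computes the differential fraction per pathway. A minimal sketch of those two steps follows; the file name `gprofiler2.treatment_vs_control.gostplot.html` and the counts are hypothetical and chosen only to show the string handling and the arithmetic.

```r
# Hypothetical plot file name following the pattern the chunk expects
html <- "gprofiler2.treatment_vs_control.gostplot.html"

# Same two splits as in the report chunk: drop the prefix, then the suffix
contrast <- unlist(strsplit(html, "gprofiler2.", fixed = TRUE))[2]
contrast <- unlist(strsplit(contrast, ".gostplot", fixed = TRUE))[1]

# The matching table name is rebuilt from the contrast
table <- paste0("gprofiler2.", contrast, ".all_enriched_pathways.tsv")

# Differential fraction: differential genes in the pathway / pathway size
# (invented counts: 5 differential genes in a pathway of 50 genes)
intersection_size <- 5
term_size <- 50
differential_fraction <- intersection_size / term_size

# The plots use -log10 of the adjusted p-value, so smaller p-values plot higher
neg_log10_p <- -log10(0.01)

print(contrast)
print(table)
print(differential_fraction)
print(neg_log10_p)
```

Because both splits use `fixed = TRUE`, the dots in the prefix and suffix are matched literally rather than as regular-expression wildcards.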

# Methods

```{r, echo=FALSE, results='asis', eval=params$study_type == 'maxquant'}
@@ -937,6 +1016,12 @@ if (any(unlist(params[paste0(possible_gene_set_methods, '_run')]))){
}
```

```{r, echo=FALSE, results='asis', eval=params$gprofiler2_run}
cat("\n### Pathway enrichment analysis\n")
cat("\n#### gprofiler2\n")
make_params_table("gprofiler2", 'gprofiler2_', remove_pattern = TRUE)
```

# Appendices

## All parameters
42 changes: 41 additions & 1 deletion conf/modules.config
@@ -133,7 +133,7 @@ process {
"--plotsd_method $params.proteus_plotsd_method",
"--plotmv_loess $params.proteus_plotmv_loess",
"--palette_name $params.proteus_palette_name",
- "--round_digits $params.proteus_round_digits"
+ "--round_digits $params.round_digits"
].join(' ').trim() }
}

@@ -316,6 +316,46 @@ process {
].join(' ').trim() }
}

    withName: GOST {
        publishDir = [
            [
                path: { "${params.outdir}/tables/gprofiler2/${meta.id}/" },
                mode: params.publish_dir_mode,
                pattern: '*.tsv'
            ],
            [
                path: { "${params.outdir}/plots/gprofiler2/${meta.id}/" },
                mode: params.publish_dir_mode,
                pattern: '*.{png,html}'
            ],
            [
                path: { "${params.outdir}/other/gprofiler2/${meta.id}/" },
                mode: params.publish_dir_mode,
                pattern: '*.{rds,gmt}'
            ],
            [
                path: { "${params.outdir}/other/gprofiler2/" },
                mode: params.publish_dir_mode,
                pattern: '*.{rds,sessionInfo.log}'
            ]
        ]
        ext.args = { [
            "--significant \"${params.gprofiler2_significant}\"",
            "--measure_underrepresentation \"${params.gprofiler2_measure_underrepresentation}\"",
            "--correction_method \"${params.gprofiler2_correction_method}\"",
            "--evcodes \"${params.gprofiler2_evcodes}\"",
            "--pval_threshold \"${params.gprofiler2_max_qval}\"",
            "--domain_scope ${params.gprofiler2_domain_scope}",
            "--min_diff \"${params.gprofiler2_min_diff}\"",
            "--round_digits ${params.round_digits}",
            "--palette_name \"${params.gprofiler2_palette_name}\"",
            ((meta.blocking == null) ? '' : "--blocking_variables $meta.blocking"),
            ((params.differential_feature_id_column == null) ? '' : "--de_id_column \"${params.differential_feature_id_column}\""),
            ((params.gprofiler2_gost_token == null) ? '' : "--gost_token \"${params.gprofiler2_gost_token}\""),
            ((params.gprofiler2_background_column == null) ? '' : "--background_column \"${params.gprofiler2_background_column}\"")
        ].join(' ').trim() }
    }
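
The `ext.args` closure above always emits the mandatory flags and adds an optional flag only when its parameter is set, then joins everything with spaces. The sketch below mirrors that pattern in R purely for illustration; it is not part of the pipeline. The `params` list, its values, and the helper `optional_arg` are assumptions made for this example.

```r
# Hypothetical stand-in for the pipeline's params (values invented)
params <- list(
    gprofiler2_significant = TRUE,
    gprofiler2_max_qval = 0.05,
    gprofiler2_gost_token = NULL,            # optional parameter left unset
    gprofiler2_background_column = "gene_id" # optional parameter that is set
)

# Hypothetical helper: emit the flag only when a value is present,
# mirroring the (x == null) ? '' : "--flag ..." ternaries in the config
optional_arg <- function(flag, value) {
    if (is.null(value)) "" else paste0(flag, ' "', value, '"')
}

args <- c(
    paste0('--significant "', params$gprofiler2_significant, '"'),
    paste0('--pval_threshold "', params$gprofiler2_max_qval, '"'),
    optional_arg("--gost_token", params$gprofiler2_gost_token),
    optional_arg("--background_column", params$gprofiler2_background_column)
)

# R counterpart of Groovy's .join(' ').trim()
arg_string <- trimws(paste(args, collapse = " "))
print(arg_string)
```

An unset option leaves an empty string in the list, so the joined result can contain a doubled space; that is harmless for command-line parsing. R renders the logical as `TRUE`, whereas Groovy string interpolation of a boolean yields `true`.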

withName: PLOT_EXPLORATORY {
publishDir = [
path: { "${params.outdir}/plots/exploratory" },
5 changes: 5 additions & 0 deletions conf/test.config
@@ -48,4 +48,9 @@ params {
// Activate GSEA
gsea_run = true
gsea_gene_sets = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/gene_set_analysis/mh.all.v2022.1.Mm.symbols.gmt'

// Activate gprofiler2
gprofiler2_run = true
gprofiler2_organism = 'mmusculus'
gprofiler2_sources = 'KEGG,REAC'
}
5 changes: 5 additions & 0 deletions conf/test_full.config
@@ -35,4 +35,9 @@ params {
// Activate GSEA
gsea_run = true
gsea_gene_sets = 'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/mus_musculus/gene_set_analysis/mh.all.v2022.1.Mm.symbols.gmt'

// Activate gprofiler2
gprofiler2_run = true
gprofiler2_organism = 'mmusculus'
gprofiler2_sources = 'KEGG,REAC'
}
Binary file modified docs/images/workflow.png