Sarek bcftools normalization #1682

Open
wants to merge 36 commits into
base: dev
4772da1
First modification to contribute to the bcftools/norm module in Sarek
JC-Delmas Apr 25, 2024
451aaec
Changes in the GERMLINE_VCFS_NORM process
JC-Delmas Apr 25, 2024
d97726b
Add fasta argument to POST_VARIANTCALLING process.
JC-Delmas Apr 25, 2024
e034ff0
add fasta input as argument
JC-Delmas Apr 25, 2024
8469832
remove vcfs in the GERMLINE_VCFS_NORM process, replaced by germline_v…
JC-Delmas Apr 25, 2024
e885888
First modification to contribute to the bcftools/norm module in Sarek
JC-Delmas Apr 25, 2024
9e94a05
Changes in the GERMLINE_VCFS_NORM process
JC-Delmas Apr 25, 2024
2bdba7e
Add fasta argument to POST_VARIANTCALLING process.
JC-Delmas Apr 25, 2024
1214f10
add fasta input as argument
JC-Delmas Apr 25, 2024
b7ba4f2
remove vcfs in the GERMLINE_VCFS_NORM process, replaced by germline_v…
JC-Delmas Apr 25, 2024
34bf47b
Update workflows/sarek/main.nf
JC-Delmas Apr 25, 2024
6dff9af
Resolved merge conflict by keeping changes from branch 34bf47baa9d61f…
JC-Delmas Apr 30, 2024
d289261
Refactor normalization and concatenation of VCF files
JC-Delmas Apr 30, 2024
c78af62
Modify and adjust two scripts to add normalization and integrate FAST…
JC-Delmas May 16, 2024
d646ec3
Added normalization for all vcfs
Patricie34 Oct 9, 2024
8fb64b2
Fixed linting issues and updated schema parameters
Patricie34 Oct 11, 2024
92094af
Update conf/modules/post_variant_calling.config
Patricie34 Oct 11, 2024
fbbfe1b
edit of normalization steps
Patricie34 Oct 11, 2024
24791dc
Fixed linting issues
Patricie34 Oct 15, 2024
50f1b4b
Merge remote-tracking branch 'upstream/dev' into sarek_bcftools_norm
Patricie34 Oct 15, 2024
fb4bb1e
Sync with dev_branch
Patricie34 Oct 15, 2024
a80cf11
Updated CHANGELOG.md
Patricie34 Oct 15, 2024
b0f6c12
Update conf/modules/post_variant_calling.config
Patricie34 Oct 16, 2024
3bcc27b
Update nextflow.config
Patricie34 Oct 16, 2024
f3c6ac6
Changed module.config
Patricie34 Oct 16, 2024
f9c815d
Changelog.md updated
Patricie34 Oct 16, 2024
f60d60d
Fixed params.normalize
Patricie34 Oct 16, 2024
c0a6ffc
Update CHANGELOG.md
Patricie34 Oct 18, 2024
188cf86
pytesttags.yml changed
Patricie34 Oct 18, 2024
1fe12e3
edited test_normalize_vcfs.yml
Patricie34 Oct 18, 2024
f9e5204
Separated vcf_normalization
Patricie34 Oct 22, 2024
7c96c98
Merge branch 'dev' into sarek_bcftools_norm
maxulysse Nov 4, 2024
b5909f2
module.config edited
Patricie34 Nov 5, 2024
ea7d25a
extra file removed
Patricie34 Nov 5, 2024
391f1ea
post_variantcalling edited
Patricie34 Nov 5, 2024
0bdb5d4
added annotation for vcfs_normalized
Patricie34 Nov 5, 2024
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -13,8 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - [1642](https://github.com/nf-core/sarek/pull/1642) - Back to dev
 - [1653](https://github.com/nf-core/sarek/pull/1653) - Updates `sarek_subway` files with `lofreq`
 - [1660](https://github.com/nf-core/sarek/pull/1642) - Add `--length_required` for minimal reads length with `FASTP`
-- [1663](https://github.com/nf-core/sarek/pull/1663) - Massive conda modules update
 - [1664](https://github.com/nf-core/sarek/pull/1664) - Check if flowcell ID matches for read pair
+- [1663](https://github.com/nf-core/sarek/pull/1663) - Massive conda modules update
+- [1682](https://github.com/nf-core/sarek/pull/1682) - Add `bcftools_norm` in `POST_VARIANTCALLING` for normalization of all vcf files or for concatenated germline vcfs

 ### Changed

35 changes: 24 additions & 11 deletions conf/modules/post_variant_calling.config
@@ -16,7 +16,7 @@

 process {

-    withName: 'GERMLINE_VCFS_CONCAT'{
+    withName: 'GERMLINE_VCFS_CONCAT' {
         ext.args = { "-a" }
         ext.when = { params.concatenate_vcfs }
         publishDir = [
@@ -25,26 +25,39 @@ process {
         ]
     }

-    withName: 'GERMLINE_VCFS_CONCAT_SORT'{
-        ext.prefix = { "${meta.id}.germline" }
-        ext.when = { params.concatenate_vcfs }
+    withName: 'GERMLINE_VCFS_CONCAT_SORT|VCFS_NORM_SORT' {
+        ext.prefix = { "${meta.id}.${processName == 'GERMLINE_VCFS_CONCAT_SORT' ? 'germline' : 'norm'}" }
+        ext.when = { params.concatenate_vcfs || params.normalize_vcfs }
         publishDir = [
             mode: params.publish_dir_mode,
-            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
+            path: { "${params.outdir}/variant_calling/${processName == 'GERMLINE_VCFS_CONCAT_SORT' ? 'concat' : 'normalized'}/${meta.id}/" }
         ]
     }

+    withName: 'VCFS_NORM' {
+        ext.args = { [
+            '--multiallelics - both', //split multiallelic sites into biallelic records and both SNPs and indels should be merged separately into two records
+            '--rm-dup all' //output only the first instance of a record which is present multiple times
+        ].join(' ') }
+        ext.when = { params.normalize_vcfs }
+        publishDir = [
+            mode: params.publish_dir_mode,
+            path: { "${params.outdir}/variant_calling/normalized/${meta.id}/" }
+        ]
+    }
+
     withName: 'TABIX_EXT_VCF' {
-        ext.prefix = { "${input.baseName}" }
-        ext.when = { params.concatenate_vcfs }
+        ext.prefix = { "${input.baseName}" }
+        ext.when = { params.concatenate_vcfs || params.normalize_vcfs }
     }

-    withName: 'TABIX_GERMLINE_VCFS_CONCAT_SORT'{
-        ext.prefix = { "${meta.id}.germline" }
-        ext.when = { params.concatenate_vcfs }
+    withName: 'TABIX_GERMLINE_VCFS_CONCAT_SORT|TABIX_VCFS_NORM_SORT' {
+        ext.prefix = { "${meta.id}.${processName == 'TABIX_GERMLINE_VCFS_CONCAT_SORT' ? 'germline' : 'norm'}" }
+        ext.when = { params.concatenate_vcfs || params.normalize_vcfs }
         publishDir = [
             mode: params.publish_dir_mode,
-            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
+            path: { "${params.outdir}/variant_calling/${processName == 'TABIX_GERMLINE_VCFS_CONCAT_SORT' ? 'concat' : 'normalized'}/${meta.id}/" }
         ]
     }
 }
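
The combined `withName` selectors in this config pick a file prefix and a publish directory from the process name, and gate execution on either of two parameters. As a rough sketch of that selection logic outside Nextflow, the Python below mirrors the ternaries. `OutputSpec` and `resolve_outputs` are hypothetical names used only for illustration, and the sketch says nothing about how Nextflow itself resolves `processName` inside a config closure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OutputSpec:
    prefix: str       # stands in for ext.prefix
    publish_dir: str  # stands in for publishDir.path
    enabled: bool     # stands in for ext.when


def resolve_outputs(process_name, sample_id, outdir,
                    concatenate_vcfs=False, normalize_vcfs=False):
    """Mirror the ternaries of the combined selectors (illustrative only)."""
    # Drop an optional TABIX_ prefix so both selector pairs share one rule.
    base = process_name
    if base.startswith("TABIX_"):
        base = base[len("TABIX_"):]

    # Anything other than the concat-sort process falls through to 'norm'.
    is_concat = base == "GERMLINE_VCFS_CONCAT_SORT"
    suffix = "germline" if is_concat else "norm"
    subdir = "concat" if is_concat else "normalized"

    return OutputSpec(
        prefix=f"{sample_id}.{suffix}",
        publish_dir=f"{outdir}/variant_calling/{subdir}/{sample_id}/",
        enabled=concatenate_vcfs or normalize_vcfs,
    )


if __name__ == "__main__":
    # 'sample1' and 'results' are placeholder values.
    spec = resolve_outputs("VCFS_NORM_SORT", "sample1", "results",
                           normalize_vcfs=True)
    print(spec.prefix)
    print(spec.publish_dir)
```

In this sketch either flag sets `enabled`, which matches the `||` in the shared `ext.when`; which process actually receives input still depends on the channels the workflow wires up.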

5 changes: 5 additions & 0 deletions modules.json
@@ -26,6 +26,11 @@
         "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
         "installed_by": ["bam_ngscheckmate"]
     },
+    "bcftools/norm": {
+        "branch": "master",
+        "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
+        "installed_by": ["modules"]
+    },
     "bcftools/sort": {
         "branch": "master",
         "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
5 changes: 5 additions & 0 deletions modules/nf-core/bcftools/norm/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

70 changes: 70 additions & 0 deletions modules/nf-core/bcftools/norm/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
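
The module source is not rendered here, so as a rough illustration of what the two options set for `VCFS_NORM` in `conf/modules/post_variant_calling.config` ask for, the sketch below splits multiallelic records into biallelic ones and then keeps only the first instance of each record. It is a simplified stand-in, not how bcftools implements it: `Record`, `split_multiallelic`, and `drop_duplicates` are hypothetical names, genotype and INFO fields are ignored, no left-alignment or trimming against the FASTA is done, and duplicates are matched on identical CHROM, POS, REF, and ALT, which is an assumption and may be stricter than what `--rm-dup all` does. It also assumes the config string `'--multiallelics - both'` is meant as the single-token value `-both` that bcftools documents for `--multiallelics`.

```python
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str  # comma-separated ALT alleles, as in the VCF ALT column


def split_multiallelic(records: Iterable[Record]) -> Iterator[Record]:
    """Emit one biallelic record per ALT allele (sketch of the split mode)."""
    for rec in records:
        for allele in rec.alt.split(","):
            yield rec._replace(alt=allele)


def drop_duplicates(records: Iterable[Record]) -> Iterator[Record]:
    """Keep only the first instance of each record, preserving input order."""
    seen = set()
    for rec in records:
        if rec in seen:
            continue
        seen.add(rec)
        yield rec


if __name__ == "__main__":
    # Made-up records for illustration; not taken from any pipeline output.
    calls = [
        Record("chr1", 100, "A", "C,G"),  # multiallelic site
        Record("chr1", 100, "A", "C"),    # repeats the first split allele
        Record("chr1", 250, "GT", "G"),   # deletion, already biallelic
    ]
    for rec in drop_duplicates(split_multiallelic(calls)):
        print(rec.chrom, rec.pos, rec.ref, rec.alt)
```

Splitting before de-duplicating means a record that only repeats one allele of a multiallelic site is still recognised as a duplicate, which is the reason the two steps are chained in that order in this sketch.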

85 changes: 85 additions & 0 deletions modules/nf-core/bcftools/norm/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
