Necessity to Sort Bam #1415

Open
eprdz opened this issue Nov 8, 2024 · 1 comment
eprdz commented Nov 8, 2024

Hello,
I am using the Whole Genome Germline Single Sample workflow for large WGS experiments. I noticed that the task that sorts the BAM after MarkDuplicates consumes 60% to 80% of the workflow's execution time. I was wondering whether this step is strictly necessary, and whether it could be sped up, for example by using samtools sort instead of Picard SortSam.

Thank you in advance.

@jessicaway
Member

Hi @eprdz,

I believe the sorting is needed for downstream steps, but @kachulis may be able to comment.

As for sorting tools, yes, samtools sort is in fact faster than SortSam (especially when run in parallel). We hope to get to optimizing our WGS pipeline soon; however, that work is likely on the order of months rather than weeks for our team. Feel free to fork the repo and make the changes you need in the meantime.
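
For anyone trying this in a fork, here is a minimal sketch of how a multithreaded samtools sort command might be assembled. It is not taken from the workflow itself: the helper name `build_samtools_sort_cmd`, the file names, and the thread and memory values are all hypothetical. Only the `-@` (threads), `-m` (memory per thread), and `-o` (output file) options of `samtools sort` are used. Whether the result is a drop-in replacement for the pipeline's SortSam step has not been verified here.

```python
import shlex


def build_samtools_sort_cmd(input_bam, output_bam, threads=4, mem_per_thread="2G"):
    """Build a `samtools sort` command line (coordinate order is the default).

    All argument values are illustrative assumptions; tune them to the
    CPU count and memory of the machine that runs the task.
    """
    return [
        "samtools", "sort",
        "-@", str(threads),      # number of sorting/compression threads
        "-m", mem_per_thread,    # approximate memory per thread
        "-o", output_bam,        # write the sorted BAM to this path
        input_bam,
    ]


# Hypothetical file names, for illustration only.
cmd = build_samtools_sort_cmd("sample.markdup.bam", "sample.sorted.bam", threads=8)
print(shlex.join(cmd))
```

The list returned by the helper can be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed. If downstream steps expect a BAM index or an MD5 file alongside the sorted BAM, those would have to be produced separately.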
