Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deletions being incorporated into consensus genome when not expected #83

Open
charlesfoster opened this issue Jan 20, 2021 · 3 comments

Comments

@charlesfoster
Copy link

Describe the bug
I'm working with SARS-CoV-2. I only want to incorporate indels into a variant call file with ivar variants and consensus genome generated with ivar consensus when a certain frequency of reads support the indel. I'm able to achieve this outcome withivar variants using the -t flag.

For example, consider a situation where 24% of the reads support a 10-bp deletion, and the remaining 76% of the reads support the reference bases. If I use the following command:

samtools mpileup -A -d 0 -B -Q 0 --reference NC_045512.fasta sample_1.primertrim.sorted.bam | ivar variants -p sample_1.ivar -q 20 -t 0.25 -r NC_045512.fasta -g reference_features.gff3

Then the resulting sample_1.ivar.tsv file does not include a 10-bp deletion, as desired, because of the -t 0.25 flag. However, the same does not hold true when calling the consensus genome. If I use the following command:

samtools mpileup -aa -A -d 60000 -B -Q 0 --referenceNC_045512.fasta sample_1.primertrim.sorted.bam | ivar consensus -p sample_1.consensus -n N -t 0.25

Then the resulting sample_1.consensus.fa file does include the deletion, even though I would like it not to. Is this related to #79 ?

Expected behavior
I guess the expected behaviour would be to include the reference bases if enough reads support them, or to just include Ns for the stretch of the putative deletion.

Desktop (please complete the following information):

  • OS: pop!_os (equivalent to Ubuntu 20.04 LTS)
  • Version: iVar version 1.3

Additional context
Apologies if this isn't a bug and is just me misinterpreting the settings!

Cheers,
Charles

@IsabelFE
Copy link

IsabelFE commented Jun 7, 2021

Hi @charlesfoster,

I am having some similar issues. I described them in #85.

Would you mind sharing what settings did you end up using for mpileup and ivar consensus/variants?
Also, why are you using different mpileup settings for ivar consensus vs. ivar variants?
I see that you are using --referenceNC_045512.fasta, but also -B. Aren't those supposed to cancel each other? Or I am understanding -B wrong?

Thanks

@charlesfoster
Copy link
Author

Hi @IsabelFE,

Different settings/what we're using
I inherited the pipeline we're using here, and it seems like the mpileup commands were just chosen based on the usage command given when running each tool & written in the documentation (https://andersen-lab.github.io/ivar/html/manualpage.html). Not sure why they were different previously. The samtools mpileup settings we're using at the moment for variant calling and consensus building, based on the documentation:

For ivar variants:
samtools mpileup -aa -A -d 0 -B -Q 0 --reference [<reference-fasta] <input.bam>

For ivar consensus:
samtools mpileup -aa -A -d 0 -Q 0 <input.bam>

Note: we still see issues with indel calling, as per the purpose of this Issue. I end up having to check the veracity of indels using IGV. Less than ideal.

Using -B with --reference
Sadly I don't know enough about mpileup to know whether they truly do 'cancel out'/ are incompatible.

Other stuff
There's a good chat around BAQ and variant calling with amplicon data here: https://www.biostars.org/p/9466154/#9466582. I'm trying to figure out what I might change in our pipeline based on the discussion.

Charles

@IsabelFE
Copy link

IsabelFE commented Jun 8, 2021

Thanks @charlesfoster,

We also have issues with indel calling and I agree that double checking them in IGV is not ideal.

I would really appreciate if anyone else have recommendations for ideal settings for mpileup and ivar variants/consensus for SARS-CoV-2 amplicon sequencing.

Isabel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants