As a sanity test before I incorporate it into my pipeline, I aligned a collection of viral genomes (~10K+ bases each) against themselves. To my surprise, 35% of the sequences did not have a perfect match.
For example, with the attached file below, running fastANI -q vir.fa -r vir.fa -o /dev/stdout gave:
vir.fa vir.fa 100 3 4
I am seeing 100% base identity but 3 out of 4 chunks matched. Is that correct? Does that mean 100% * 3 / 4 = 75% match? How can I distinguish this case from a genome that's actually 25% shorter but matches 100%? Maybe I am misinterpreting the results?
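To make the arithmetic in my question concrete, here is a minimal Python sketch of how I am reading that output line. It assumes the whitespace-separated columns are query, reference, ANI, matched fragments, and total query fragments; the function name `matched_fraction` is made up for illustration and is not part of fastANI. Whether multiplying ANI by the matched fraction is a meaningful number is exactly what I am unsure about.

```python
def matched_fraction(line):
    """Parse one fastANI output line.

    Assumes whitespace-separated columns:
    query, reference, ANI, matched fragments, total query fragments.
    Paths containing spaces would break this simple split.
    """
    fields = line.split()
    ani = float(fields[2])
    matched = int(fields[3])
    total = int(fields[4])
    return ani, matched, total, matched / total


ani, matched, total, frac = matched_fraction("vir.fa vir.fa 100 3 4")
# ANI is computed over matched fragments only; the product below is my
# own guess at a "whole genome" figure, not something fastANI reports.
print(f"ANI={ani}% fragments={matched}/{total} fraction={frac:.2f} product={ani * frac:.1f}%")
```

Keeping the ANI and the fraction as separate numbers at least avoids collapsing "100% identity over 3 of 4 fragments" into a single figure that could be confused with a 75% identity.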
This topic is interesting for me too.
I see nearly the same situation with bacterial genomes, especially when the value of fraglen is changed from the default (3000) to 1020:
ANC_3681.fasta ANC_3681.fasta 99.9992 3432 3467 for fraglen=1020
ANC_3681.fasta ANC_3681.fasta 100 1169 1177 for fraglen=3000
Hello, thanks for making this tool!
I hope my question is clear :)
vir.fa.gz