Update to designations of lineage B.1.575 and B.1.575.1 #64
Comments
Now tagged as release: https://github.com/cov-lineages/pango-designation/releases/tag/v1.1.23
@aineniamh Thanks for your work on this. I spot-checked a few of the sequences I flagged manually and they are still B.1 in the new release lineages.csv. Having investigated the first three (below), I think the issue is that they have gaps (including our sequence) over "orf1ab:T265I". Because of this, they were not included in your reassignment pull above, even though they have all of the other mutations. My suggestion would be to remove samples like this (those originally reassigned from B.1.617 to B.1 but not included in the above) from the training data, as I think they will just create trouble for both lineages.
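The missing-data check described in this comment can be sketched as follows. This is a minimal illustration, not the procedure used by the maintainers: it assumes the sequences are already aligned to reference coordinates, and the function names, the set of "missing" characters, and the toy sequences are all hypothetical. The nucleotide position underlying "orf1ab:T265I" is deliberately left as a parameter.

```python
from typing import Dict, List, Tuple

# Characters treated as "no data" at a site (an assumption for this sketch).
MISSING_CHARS = set("N-?")


def site_is_missing(seq: str, pos: int) -> bool:
    """Return True if the 1-based position `pos` has no usable base call."""
    if pos < 1 or pos > len(seq):
        return True
    return seq[pos - 1].upper() in MISSING_CHARS


def split_by_coverage(
    aligned: Dict[str, str], pos: int
) -> Tuple[List[str], List[str]]:
    """Split sequence names into (covered, missing) at a given site."""
    covered: List[str] = []
    missing: List[str] = []
    for name, seq in aligned.items():
        (missing if site_is_missing(seq, pos) else covered).append(name)
    return covered, missing


# Toy alignment; real input would be reference-aligned genomes.
toy = {"s1": "ACGT", "s2": "ACNT", "s3": "AC-T"}
covered, missing = split_by_coverage(toy, pos=3)
print("covered:", covered)
print("missing:", missing)
```

Sequences that land in the `missing` list are the ones that could be flagged for removal from the designation set rather than left under a parent lineage.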
@aineniamh I know you closed this, but I believe the reassignment of samples from #49 to B.1 is still causing issues. I ran our custom Oregon build and included many of the samples that are still assigned to B.1 (that is, not reassigned above). You can see from these screenshots that genotypes at identical places in the tree are getting misassigned.
Hi @oroak, thanks for providing the trees and IDs. I've now removed the following from designations:
I'm not certain what these sequences will be assigned once the model is retrained, but it sounds like they should be B.1.575, and it would be straightforward to flag a misassignment as being due to missing data.
Lineage description
USA and Aruba lineage with spike mutations "S:P681H", "S:S494P" and "S:T716I".
From all SARS-CoV-2 genomes on GISAID as of 2021-04-25, any sequences with at least 8 of the SNPs below were aligned with a selection of B.1 genomes to place them within B.1 diversity. These sequences include previously designated B.1.575 sequences and a couple of sequences that had previously been designated as B.1.1.335.
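The "at least 8 of the SNPs" selection step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used for this designation: the SNP list in `EXAMPLE_SNPS` is a hypothetical placeholder (the actual SNP list is not reproduced here), the sequences are assumed to be aligned to reference coordinates, and the function names are invented for the example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical placeholder SNPs as (1-based position, alternate base).
# These are NOT the B.1.575 defining SNPs.
EXAMPLE_SNPS: List[Tuple[int, str]] = [(2, "T"), (4, "A"), (6, "G")]


def count_matching_snps(seq: str, snps: Sequence[Tuple[int, str]]) -> int:
    """Count how many of the given SNPs are present in an aligned sequence."""
    hits = 0
    for pos, alt in snps:
        if 1 <= pos <= len(seq) and seq[pos - 1].upper() == alt.upper():
            hits += 1
    return hits


def select_candidates(
    aligned: Dict[str, str],
    snps: Sequence[Tuple[int, str]],
    min_hits: int = 8,
) -> List[str]:
    """Return names of sequences carrying at least `min_hits` of the SNPs."""
    return [
        name
        for name, seq in aligned.items()
        if count_matching_snps(seq, snps) >= min_hits
    ]


# Toy data with a lowered threshold, since only three placeholder SNPs exist.
toy = {"a": "ATCAGG", "b": "ATCCGC", "c": "ANCAGG"}
print(select_candidates(toy, EXAMPLE_SNPS, min_hits=2))
```

A threshold below the full SNP count tolerates a few missing or ambiguous sites, which is why a sequence such as `c` (with an `N` at one SNP position) can still be selected in the toy example.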
The above phylogeny shows B.1.575 in red and B.1.575.1 in blue, both as monophyletic clades. The annotated tree is available here: [sequences.aln.fasta.tree.zip](https://github.com/cov-lineages/pango-designation/files/6389986/sequences.aln.fasta.tree.zip)
New designations:
Now total designations:
Changes enacted in commit:
8a5d954
B.1.575.new_lineages.csv