When creating the new pango tree based on recent designations (https://nextstrain.org/nextclade/sars-cov-2/2022-04-23?c=gt-N_203), I noticed that the only defining mutations for BA.2.3.3 on top of BA.2.3 seem to be reversions of N:203/204 back to the root (3 nucleotide mutations in total).
To investigate, I downloaded the sequences in this branch of the UShER tree from GISAID and had a look at them in Nextclade.
It turns out that almost all of the ~300 sequences have no coverage from position 29000 to the 3' end, so more than half of the N gene is missing. Only ~10 sequences contain that part (most of them from Germany).
Unfortunately, it appears that UShER is happy to cluster the ~290 sequences missing half of N there, assuming a reversion despite there being no evidence for one. It's really unlikely for 3 nucleotides to mutate back to wild type in one go. This is also unlikely to have arisen through recombination: if recombination had removed the N:203/204 mutations, the donor would have to come from outside BA.1.1, which really leaves only Delta, and a Delta donor would also bring in a mutation at N:377.
To be sure this reversion is real, we would want some clean sequences that are not missing the 3' end and that ideally cluster together with extra private mutations. One reversion can happen, but 3 in a row, all the way back to wild type? Very unlikely.
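A rough sketch of how the "clean" sequences could be separated from the ones with no coverage at the sites in question. It assumes the sequences are already aligned to the reference so that string index + 1 equals the reference position. The function names `has_coverage` and `split_by_coverage` and the toy data are hypothetical, made up for illustration. For the real check one would pass the nucleotide positions of the N:203/204 changes (presumably 28881-28883 in Wuhan-Hu-1 coordinates, which is an assumption to verify against the reference annotation).

```python
from typing import Dict, Iterable, List, Tuple

# Characters treated as "no base call" (assumption: N, gap, or unknown).
NO_CALL = {"N", "-", "?"}


def has_coverage(aligned_seq: str, positions: Iterable[int]) -> bool:
    """Return True if every 1-based reference position has a called base."""
    for pos in positions:
        # A sequence shorter than the position cannot cover it.
        if pos > len(aligned_seq):
            return False
        if aligned_seq[pos - 1].upper() in NO_CALL:
            return False
    return True


def split_by_coverage(
    aligned: Dict[str, str], positions: Iterable[int]
) -> Tuple[List[str], List[str]]:
    """Split sequence names into (covered, missing) at the given positions."""
    wanted = list(positions)
    covered: List[str] = []
    missing: List[str] = []
    for name, seq in aligned.items():
        (covered if has_coverage(seq, wanted) else missing).append(name)
    return covered, missing


# Toy example with made-up 8-base "genomes"; sites of interest are 5-7.
toy = {
    "a": "ACGTACGT",  # fully sequenced
    "b": "ACGTNNNN",  # 3' end masked with N
    "c": "ACGT",      # 3' end absent entirely
}
covered, missing = split_by_coverage(toy, [5, 6, 7])
print(covered, missing)
```

Only sequences that land in `covered` can say anything about whether the reversion is real; the ones in `missing` are compatible with either state at those sites.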
@chrisruis @AngieHinrichs