BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (22 seq, Jan 27) #1526

Closed
ryhisner opened this issue Jan 6, 2023 · 7 comments
Labels: BA.5; designated; recommended (Recommended for designation by pango team member); Saltation (Appears on long branch length with no intermediates)

ryhisner commented Jan 6, 2023

Description

Sub-lineage of: BF.5
Earliest sequence: 2022-10-24 — EPI_ISL_15673587
Most recent sequence: 2022-12-23 — EPI_ISL_16374103
Countries circulating: Japan (All), 14 from Mie Prefecture, 1 from Saitama
Number of Sequences: 15
GISAID Query: Nucleotide Search: T1965C, T2158C, G3719A, A7863G
CovSpectrum Query: [8-of: T1965C, T2158C, G3719A, A7863G, C9442T, T11045C, G16360A, A20463G, C21178T, T22711C, C23170T, C25702T, T28157C, A28877T, G28878C]
Substitutions on top of BF.5:
Spike: None
ORF1a: I567T, G1152S, K2533R, I2873V, T4217I
ORF1b: V965I, H2571Y
ORF3a: P104S
Nucleotide: T1965C, T2158C, G3719A, A7863G, A8882G, C9442T, T11045C, C12915T, G16360A, C16954T, A20463G, A20472G, C21178T, T22090C, T22711C, C23170T, C25702T, T28157C, A28877T, G28878C, C29077T
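
For anyone who wants to apply the same "8-of" logic outside CovSpectrum, here is a minimal Python sketch of what such a query does: a sequence counts as a hit if it carries at least 8 of the 15 listed mutations, which tolerates dropouts at some sites. The function name `matches_n_of` and the two sample mutation lists are hypothetical, made up for illustration; only the query list comes from this post.

```python
# Mutation list copied from the CovSpectrum query above.
QUERY_MUTATIONS = [
    "T1965C", "T2158C", "G3719A", "A7863G", "C9442T", "T11045C", "G16360A",
    "A20463G", "C21178T", "T22711C", "C23170T", "C25702T", "T28157C",
    "A28877T", "G28878C",
]


def matches_n_of(sample_mutations, query=QUERY_MUTATIONS, n=8):
    """Return True if the sample carries at least n of the query mutations."""
    return len(set(sample_mutations) & set(query)) >= n


# Hypothetical samples (not real sequences): one with 9 of the 15 mutations
# called, and one unrelated genome that happens to share only 2 of them.
partial_sample = QUERY_MUTATIONS[:9] + ["C241T"]
unrelated_sample = ["C241T", "T1965C", "C25702T"]

print(matches_n_of(partial_sample))
print(matches_n_of(unrelated_sample))
```

The threshold of 8 is the one used in the query; raising it makes the search stricter but risks missing sequences with poor coverage.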

USHER Tree
*Note: There are multiple ways to define the long/saltation branch for this cluster. I’ve appended my prolix notes on this subject at the bottom of this post.
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/Molnupiravir%20-%20BF.5%20-%2015%20seq%2C%20Japan%20-%20subtreeAuspice1_genome_1e7d2_801770.json
image

Evidence
This BF.5 cluster consists of 15 sequences from Japan, 14 of them from Mie Prefecture and one from Saitama. The long branch has 23 private nucleotide mutations, 21 of which are transitions. But the two transversions are peculiar: they're A28877T and G28878C, which create the N-gene sgmRNA TRS extended homology. I've never seen those two nucleotide mutations acquired stepwise (they ALWAYS occur on the same branch), so I think they probably were acquired through some sort of recombination, though I'm not certain.

One aspect that distinguishes this cluster from previous molnupiravir-redolent clusters is the extended time period over which this lineage has circulated. The first sequence was collected on October 24, 2022, and the most recent 60 days later, on December 23, 2022. And while most of the cases are from the same region (Mie Prefecture), one of the most recent came from Saitama, which does not border Mie.
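
The 60-day figure can be checked with a couple of lines of Python, using only the two collection dates given in the Description:

```python
from datetime import date

# Collection dates of the earliest and most recent sequences, from the Description.
earliest = date(2022, 10, 24)
latest = date(2022, 12, 23)

# Subtracting two dates gives a timedelta; .days is the span in whole days.
span_days = (latest - earliest).days
print(span_days)
```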

My usual molnupiravir-candidate analysis for this cluster is pictured below.

image

The case for molnupiravir is less airtight here than for most of these sequences due to the lower proportion of G->A and C->T mutations, but I still think MOV is pretty clearly the cause for a few reasons:

#1. The rapid accumulation of mutations in a very short time period. With one exception, I've never seen such a rapid accumulation of mutations in any non-MOV sequence, and I’ve pored over thousands of such sequences.

#2. The extraordinary dominance of transitions, including an unusually high T->C proportion. While G->A and C->T mutations are most characteristic of molnupiravir treatment, T->C mutations were also notably increased in the clinical trial data.

image

#3. The extremely low proportion of non-synonymous mutations (34.8%), less than half the proportion typically seen in non-MOV saltation branches like this. Even if I were to count the two extended homology nucleotide mutations as non-synonymous (as they are individually, though together they are synonymous), the non-synonymous percentage is still well below 50%, which is unheard of in chronic-infection saltations.

#4. The very low percentage of AA mutations in spike: 0% in this case. Non-MOV long branches (all or nearly all of which derive from chronic infections) typically have somewhere around half of their AA mutations in spike.
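
The percentages in #3 and #4 follow from the counts in this post. Here is a rough Python sketch of the arithmetic, assuming 23 nucleotide mutations on the long branch (from the Evidence section) and the 8 AA substitutions listed in the Description (5 in ORF1a, 2 in ORF1b, 1 in ORF3a, none in spike). The variable names are mine.

```python
# Counts taken from this post; the dictionary layout is just for illustration.
branch_mutations = 23
aa_substitutions = {"ORF1a": 5, "ORF1b": 2, "ORF3a": 1, "S": 0}

nonsyn = sum(aa_substitutions.values())
nonsyn_pct = 100 * nonsyn / branch_mutations

# Alternative: also count the two TRS extended-homology mutations
# (A28877T, G28878C) as non-synonymous, as discussed in #3.
nonsyn_pct_with_trs = 100 * (nonsyn + 2) / branch_mutations

# Share of AA substitutions that fall in spike, as discussed in #4.
spike_share_pct = 100 * aa_substitutions["S"] / nonsyn

print(f"{nonsyn_pct:.1f}% non-synonymous")
print(f"{nonsyn_pct_with_trs:.1f}% if the TRS pair is counted")
print(f"{spike_share_pct:.0f}% of AA substitutions in spike")
```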

As this lineage appears to still be circulating, I feel it should be monitored in the coming weeks.

Genomes

Genomes EPI_ISL_15673587, EPI_ISL_16210278, EPI_ISL_16210314-16210315, EPI_ISL_16210321, EPI_ISL_16210344, EPI_ISL_16210352, EPI_ISL_16210356, EPI_ISL_16279922, EPI_ISL_16374061-16374064, EPI_ISL_16374077, EPI_ISL_16374103
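
The list above uses shorthand ranges such as EPI_ISL_16210314-16210315. A small Python sketch for expanding them into individual accession IDs is below. The helper `expand_accessions` is hypothetical and not part of any GISAID tooling, and it assumes every number inside a range is a sequence in this cluster, which is how the ranges are used in this post.

```python
def expand_accessions(text, prefix="EPI_ISL_"):
    """Expand a comma-separated accession list, including 'start-end' ranges."""
    ids = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        # Drop the prefix so only the numeric part (or numeric range) remains.
        body = token[len(prefix):] if token.startswith(prefix) else token
        if "-" in body:
            start, end = body.split("-")
            ids.extend(f"{prefix}{n}" for n in range(int(start), int(end) + 1))
        else:
            ids.append(f"{prefix}{int(body)}")
    return ids


# Accession list copied from the Genomes section of this post.
GENOMES = (
    "EPI_ISL_15673587, EPI_ISL_16210278, EPI_ISL_16210314-16210315, "
    "EPI_ISL_16210321, EPI_ISL_16210344, EPI_ISL_16210352, EPI_ISL_16210356, "
    "EPI_ISL_16279922, EPI_ISL_16374061-16374064, EPI_ISL_16374077, "
    "EPI_ISL_16374103"
)

accessions = expand_accessions(GENOMES)
print(len(accessions))
```

The expanded count should agree with the "Number of Sequences" field in the Description.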

*I’ve included the two mutations preceding the long branch on the Usher tree because there are nine equally parsimonious placements for this cluster, and in order to place this branch where it is, Usher had to posit that it acquired the ORF8:S67F mutation and then reverted it. Thus I feel pretty sure this isn’t the correct placement and that the nucleotide mutations A8882G and C16954T are therefore very unlikely to actually have been inherited by both this cluster and the two sequences from New Jersey, USA, as the tree above claims.

There’s also some ambiguity about what ought to be included at the end of the branch. The earliest sequence (by almost a month) is on the bottom branch, but this branch only includes two of the 15 sequences. The top branch has one sequence that interrupts the long branch’s path to the other 12 sequences, and it’s not clear whether it belongs where it is on the tree or belongs with the others but merely lacks coverage in the areas of the two mutations on the branch connecting it to the 12 others. (None of these sequences indicate where there is missing coverage.) In the analysis above, I’ve selected the top branch and have assumed the intervening sequence is real, meaning I excluded the two mutations (both transitions) leading to the largest cluster of 12.

In any case, the choice of how to define the saltation branch here only marginally affects the molnupiravir analysis; all of the mutations on these branches that may or may not belong in the saltation branch are transitions. The only decision that has some impact on the MOV statistics is whether or not to include the two transversion mutations that make up the N-gene sgmRNA TRS extended homology. I’ve included those transversions in the analysis, but I believe these two consecutive nucleotide mutations are typically acquired via recombination rather than stepwise mutation. One could make the case that they therefore should be excluded from the MOV analysis. If they are excluded, then 100% of the nucleotide mutations on the long branch are transitions. However, I’ve decided to include them in the analysis above because I’m not entirely certain this TRS sgmRNA extended homology motif is always acquired via recombination (though of the hundreds of times I’ve seen it on Usher trees, it’s always been acquired in one leap, never stepwise).
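
To make the transition/transversion point concrete, here is a minimal Python sketch that classifies each mutation in the Description's nucleotide list. That list includes the two mutations preceding the long branch, so it may not match the exact branch definition used in the analysis image. The function name `is_transition` is mine.

```python
from collections import Counter

# Nucleotide list copied from the Description of this post.
MUTATIONS = [
    "T1965C", "T2158C", "G3719A", "A7863G", "A8882G", "C9442T", "T11045C",
    "C12915T", "G16360A", "C16954T", "A20463G", "A20472G", "C21178T",
    "T22090C", "T22711C", "C23170T", "C25702T", "T28157C", "A28877T",
    "G28878C", "C29077T",
]

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(mutation):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {mutation[0], mutation[-1]}
    return pair <= PURINES or pair <= PYRIMIDINES


transversions = [m for m in MUTATIONS if not is_transition(m)]

# Drop the TRS extended-homology pair and recompute the transition share.
without_trs = [m for m in MUTATIONS if m not in ("A28877T", "G28878C")]
transition_share = sum(is_transition(m) for m in without_trs) / len(without_trs)

# Tally by substitution class, e.g. "T>C" or "C>T".
classes = Counter(f"{m[0]}>{m[-1]}" for m in MUTATIONS)

print(transversions)
print(transition_share)
print(classes.most_common())
```

With the TRS pair removed, every remaining mutation in the list is a transition, whichever of the branch definitions discussed above is used.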

@thomasppeacock added the BA.5 and Saltation (Appears on long branch length with no intermediates) labels Jan 6, 2023
@FedeGueli
Contributor

Great catch @ryhisner. This one, with a relatively low count of AA mutations and likely extended circulation over months (and during a wave), worries me.

@corneliusroemer added the monitor (currently too small, watch for future developments) label Jan 9, 2023
@thomasppeacock added the recommended (Recommended for designation by pango team member) label and removed the monitor label Jan 14, 2023
@thomasppeacock

Looks like this is still growing. Given that and the saltation, I think it would be useful to assign it for easier tracking.

@ryhisner
Author

Five new sequences added today from Japan, all with three additional AA mutations: ORF3a:T32I, ORF7a:A8V, and the highly convergent ORF1b:G662S. There are still zero spike mutations in this lineage, which I think is unprecedented for a branch anywhere close to this length.
image

@ryhisner changed the title from "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (15 seq, Jan 6)" to "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (20 seq, Jan 14)" Jan 17, 2023
@ryhisner
Author

ryhisner commented Jan 18, 2023

Twelve of the 133 sequences—9%—uploaded on or after December 12 from Mie Prefecture have been from this lineage, and 6/22 sequences (27%) uploaded on or after December 29. Today there was just one sequence uploaded from Mie, and it was from this lineage. EPI_ISL_16545014

@ryhisner changed the title from "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (20 seq, Jan 14)" to "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (21 seq, Jan 18)" Jan 18, 2023
@ryhisner
Author

The last two uploads from Mie Prefecture haven't included any sequences from this lineage, but yesterday the first sequence from Shiga, Japan, was uploaded. This is the third prefecture this lineage has been sequenced in. EPI_ISL_16675393

@ryhisner changed the title from "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (21 seq, Jan 18)" to "BF.5 Sublineage with >20 private mutations, possible molnupiravir origin, Japan (22 seq, Jan 27)" Jan 27, 2023
@InfrPopGen self-assigned this Feb 8, 2023
@InfrPopGen added a commit that referenced this issue Feb 8, 2023: "Added new lineage BF.5.3 from #1526 with 22 new sequence designations, and 0 updated"
@InfrPopGen added this to the BF.5.3 milestone Feb 8, 2023
@InfrPopGen
Contributor

Thanks for submitting. We've added lineage BF.5.3 with 22 newly designated sequences, and 0 updated. Defining mutations T1965C (ORF1a:I567T), G3719A (ORF1a:G1152S), A7863G (ORF1a:K2533R), G16360A (ORF1b:V965I), C21178T (ORF1b:H2571Y), T2158C, T11045C, C29077T (following C12915T (ORF1a:T4217I), C25702T (ORF3a:P104S), C9442T, A20463G, A20472G, T22090C, T22711C, T28157C)!
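
The amino-acid labels attached to the defining mutations can be sanity-checked from the nucleotide positions alone. The Python sketch below assumes Wuhan-Hu-1 reference coordinates (ORF1a starting at 266, ORF3a at 25393) and the convention that numbers ORF1b from 13468. Those start positions are not stated anywhere in this thread, and the helper names are hypothetical.

```python
# Assumed 1-based gene start coordinates on the Wuhan-Hu-1 reference.
# ORF1b is numbered from 13468, the convention under which G16360A is codon 965.
ORF_STARTS = {"ORF1a": 266, "ORF1b": 13468, "ORF3a": 25393}


def parse_position(mutation):
    """Extract the genomic position from a label such as 'C21178T'."""
    return int(mutation[1:-1])


def codon_number(position, gene):
    """Return the 1-based codon index of a genomic position within a gene."""
    offset = position - ORF_STARTS[gene]
    if offset < 0:
        raise ValueError(f"position {position} lies upstream of {gene}")
    return offset // 3 + 1


# Nucleotide mutations that carry an amino-acid label in the comment above.
ANNOTATED = [
    ("T1965C", "ORF1a"), ("G3719A", "ORF1a"), ("A7863G", "ORF1a"),
    ("C12915T", "ORF1a"), ("G16360A", "ORF1b"), ("C21178T", "ORF1b"),
    ("C25702T", "ORF3a"),
]

codons = {
    mut: f"{gene}:{codon_number(parse_position(mut), gene)}"
    for mut, gene in ANNOTATED
}

for mut, label in codons.items():
    print(mut, label)
```

This only recovers the codon number, not the amino-acid change itself, which would need the reference sequence.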

@ryhisner
Author

ryhisner commented Feb 17, 2023

A new BF.5.3 sequence appeared today in Japan from a different prefecture than all previous ones: Nara. EPI_ISL_16949682

The collection date is January 23, nearly a month after the most recent previous sequence, which means this lineage has been circulating for at least three months (October 24 to January 23).

New, Updated Usher Tree (as of 2023-2-17)
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/MOV_BF.5.3_as_of_2023_2_17_subtreeAuspice1_genome_38d9d_f6d370.json
