Push component should automatically recover from the server losing its uaid #3695
Comments
The docs and structure of the push component are overdue for a bit of an overhaul; we sadly haven't really kept up very good maintenance of it since it was initially written by the push team. However, that's an undertaking far beyond the scope of this bug! (But if you happen to find opportunities to clean up the docs or clarify things as you go here, that's also valuable.) I've added some high-level context-setting in #4303 which I hope will help a bit. Specifically for this bug, we want to deal with the unhandled error being reported from Fenix users here and here:
This error occurs when the push server (for whatever reason) throws away our existing registration, forgetting everything about the Unfortunately the management of We need the app to be in control of the process here, because we're going to be re-creating a bunch of subscription records, and the app needs to be able to route the updated data to the relevant consumers elsewhere in the app (such as the FxA client, or service workers). This is what the So, at the top level of a fix here, the app needs to be able to catch this error and call Next level down, we need to implement some recovery logic in the underlying Rust
Where things get fuzzy for me is in step (3) here. It's not clear to me what the API contract actually is in practice. The Rust code is just returning the list of existing subscriptions, so I guess it's expecting the calling code to re-create each one. But the example Kotlin code here suggests that the returned list contains newly-created subscription info and all the app has to do is route those new details to appropriate places. The wrapper code in a-c certainly seems worded in a way that suggests the subscriptions have already been re-created. But then when it gets turned into an So...I think we maybe it's fine for |
Will have to come back to this issue after PTO, but I wanted to add one note:
You might also find this meta bug I filed valuable: #2763, mostly nits from the initial integration of the push component into AC/Fenix/Fire TV. |
Thanks @jonalmeida! Sorry for the lack of context on my previous comment - I'm going to be on PTO for the week after the wellness week, but we're hoping that @tarikeshaq will be able to spend some time on this bug during that week, so I was leaving as much async context as I could. This can all definitely wait until after PTO. |
Yes, this sounds fine to me. If it makes it easier, our current logic/contract is to call
In the KDoc, "updated" may not have been the best choice of words in hindsight. My understanding that, Then, as you correctly followed in the linked code, we want to use that list to inform our consumers (FxA and service workers) that the subscription they relied on no longer exists, so it's up to them to decide if they want to renew that subscription or not. In FxA's case, we handle that bit of logic in I just remembered that we have a nice little paragraph in the README specifically about the current behaviour and understanding of verify_connection so we don't have to keep that mental model all the time. 😄 |
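For anyone following along, here is a minimal sketch of the contract as described above: the call returns the subscriptions that no longer exist, and the app groups them by owner so each consumer can decide whether to renew. This is only an illustration in Rust, not the component's or a-c's actual code; `ExpiredSubscription`, its fields, and `group_by_scope` are hypothetical names invented for this sketch.

```rust
use std::collections::BTreeMap;

/// Hypothetical record for a subscription the server no longer knows about.
/// The real component's types and field names may differ.
#[derive(Debug, Clone, PartialEq)]
struct ExpiredSubscription {
    channel_id: String,
    scope: String,
}

/// Groups expired channel ids by scope, so the app can tell each consumer
/// (for example FxA or a service worker) which of its subscriptions are gone.
/// Deciding whether to re-subscribe is left to the consumer.
fn group_by_scope(expired: &[ExpiredSubscription]) -> BTreeMap<String, Vec<String>> {
    let mut grouped: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for sub in expired {
        grouped
            .entry(sub.scope.clone())
            .or_default()
            .push(sub.channel_id.clone());
    }
    grouped
}
```

Under this reading the returned list describes what was lost, not what was re-created, which matches the "inform consumers and let them renew" behaviour rather than the "already updated" wording in the KDoc.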
Thanks a ton @jonalmeida!! I'll probably add a bit more error handling to make sure we can catch the error, but after an initial look I THINK I might have a good theory on why we are ending up with this error in the first place, @jrconlin could possibly verify my theory here 😄
A super easy fix I'll try is to simply set the But all that is beside the point that we should be recovering from problems like this automatically - I'll jot down some more thoughts specifically on that and possibly open a draft PR to look at |
As noted in #3314 (comment) and subsequent comments, if the autopush server receives a `404` or `410` error when trying to deliver a push via one of the mobile bridges, it will completely drop the registration record for that uaid. AFAICT any attempts by the client to operate on that uaid will produce an error.

I don't think the push component currently handles this case well. In fact I don't see any codepaths that would cause us to detect that `self.uaid` has become invalid and is being rejected by the server.

This seems like something that could be handled in `verify_connection` by:

We should also add some handling of this case in the `update` method, which clients may call when in this state in an attempt to repair their push subscriptions.

┆Issue is synchronized with this Jira Bug
┆Epic: FxA Ecosystem (backlog)
┆Fix Versions: Release 92
┆Sprint End Date: 2021-07-09
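
A rough sketch of the detection idea from the issue body, under stated assumptions: if the server rejects our uaid, forget the uaid and hand back every locally known channel so the caller can re-subscribe. `ServerError::UaidRejected`, `PushState`, and `recover_from` are hypothetical names for illustration only; the real component's error types and storage layer are not shown here and may look quite different.

```rust
/// Hypothetical error type; the variant names are assumptions for this sketch.
#[derive(Debug, PartialEq)]
enum ServerError {
    /// The server no longer recognises our uaid (it dropped the registration).
    UaidRejected,
    /// Any other failure, which this sketch does not try to recover from.
    Other(String),
}

/// Hypothetical local state: the uaid we believe we hold, plus stored channel ids.
struct PushState {
    uaid: Option<String>,
    channels: Vec<String>,
}

impl PushState {
    /// Given the result of some server operation on our uaid, returns the
    /// channel ids that are no longer valid and must be re-created.
    fn recover_from(&mut self, result: Result<(), ServerError>) -> Result<Vec<String>, ServerError> {
        match result {
            // Server still knows us: nothing has expired.
            Ok(()) => Ok(Vec::new()),
            // Server forgot us: drop the stale uaid and report every channel as expired.
            Err(ServerError::UaidRejected) => {
                self.uaid = None;
                Ok(std::mem::take(&mut self.channels))
            }
            // Unrelated errors are passed through untouched.
            Err(other) => Err(other),
        }
    }
}
```

The same check could sit behind both `verify_connection` and `update`, so that whichever one the client calls first notices the rejected uaid.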