Timeline corruption on develop - "Live timeline 0 is no longer live" #8593
Comments
@turt2live can you rageshake? |
Past related issues (not as common as whatever happened in the last 24h): |
We're declaring this not a release blocker since we have seen very limited incidence of it. One to keep an eye on, though. |
Latest symptom from https://github.com/matrix-org/riot-web-rageshakes/issues/1238 is that #riot-dev was only showing ~50% of the messages received over /sync, which led me to think I was replying to someone who already had an answer :( |
@richvdh observes:
|
Rageshake and reload button? |
Just sent a rageshake on this. I assume I was getting the same issue (the console error is the same, but this issue doesn't describe the user-facing symptoms very well). From the rageshake notes: Visible symptoms were that there were lots of old messages below the latest message in the room. Clicking the scroll-to-bottom button actually scrolled you up, because the latest message in the room was not at the bottom of the page but somewhere in the middle. |
That's consistent with the other known (but for some reason not documented) symptom. I assume it felt a bit like a rotating drum rather than a timeline? Edit: rageshake definitely is the same issue. |
Yes, something like that. Now, after I sent the rageshake, the entire app has become unresponsive. I can't click on anything, the green read receipt bar doesn't update, and animations for other users' read receipts are frozen. Do you want another rageshake? |
I suspect that symptom is a different issue for sure (one of the "the app sets fire to my computer trying to start up" issues), and a rageshake might not reveal it. Would suggest opening a new issue and rageshaking on that just in case though. |
To add more detail to my very brief comment above: "Rageshake and reload button?" means "Can we mitigate this difficult-to-reproduce/investigate bug by catching the error and prompting the user to 'Rageshake and reload'?" Obviously this is horrible, but it is better than leaving users to discover the app is broken by it just being super broken, and it does at least present a way out. |
Fixes element-hq/element-web#9260 Workaround for element-hq/element-web#8593 Requires matrix-org/matrix-js-sdk#869 We check if any dialogs are open before moving forward because we don't want to risk showing so many dialogs that the user is unable to click a button. We're also not overly concerned if the dialog being shown is irrelevant because whatever the user is doing will likely be unaffected, and we can scream in pain when they're finished.
ftr, tracking the workaround dialog as #9260 to avoid closing this by accident. |
Daily rageshake review for explosions:
Overall the four potential causes (in terms of what the logs say) are:
|
A few more came in: https://github.com/matrix-org/riot-web-rageshakes/issues/1371, https://github.com/matrix-org/riot-web-rageshakes/issues/1372, and https://github.com/matrix-org/riot-web-rageshakes/issues/1373 all look like a case of number 3. Timelines are being spliced when they are only a few seconds old. |
Should fix the error seen in element-hq/element-web-rageshakes#1389 (element-hq/element-web#8593)
https://github.com/matrix-org/riot-web-rageshakes/issues/1389 is an explosion on develop which appears to be complaining about pagination tokens. In theory, matrix-org/matrix-js-sdk#885 fixes this. For context, the error is thrown here: |
Credit to Matthew for basically solving this. Theoretically fixes spontaneous timeline corruption: element-hq/element-web#8593 When the live timeline ends up in a position where it can no longer be live (such as becoming the second timeline in the set, rather than the first) we end up getting neighbouring timeline errors. By refusing to splice the live timeline into such a position, we hopefully keep the live timeline in a position of still being live for when it is next used. The running theory that leads to this fix is multiple limited syncs coming in, causing holes in the timeline. When trying to patch up the holes, the timeline set would end up splicing all over the place, leading to potentially splicing the live timeline into a broken position.
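The guard described above can be sketched roughly as follows. This is a minimal illustration and not the matrix-js-sdk code: the `SketchTimeline` type, `makeTimeline`, and `trySplice` are hypothetical names, and the model assumes that "no longer live" means the live timeline has gained a newer neighbour.

```typescript
// Hypothetical, simplified model of linked timelines (not the matrix-js-sdk API).
interface SketchTimeline {
  id: string;
  isLive: boolean;
  prev: SketchTimeline | null; // neighbour holding older events
  next: SketchTimeline | null; // neighbour holding newer events
}

function makeTimeline(id: string, isLive = false): SketchTimeline {
  return { id, isLive, prev: null, next: null };
}

// Link `older` directly before `newer`.
// Assumption: the live timeline must never gain a newer neighbour, so any
// splice that would place something after it is refused outright.
function trySplice(older: SketchTimeline, newer: SketchTimeline): boolean {
  if (older.isLive) {
    return false; // would push the live timeline out of the live position
  }
  if (older.next !== null || newer.prev !== null) {
    return false; // one side is already linked elsewhere
  }
  older.next = newer;
  newer.prev = older;
  return true;
}

const live = makeTimeline("live", true);
const older = makeTimeline("older");
const stray = makeTimeline("stray");

// Filling a hole behind the live timeline is allowed.
const splicedOlderBeforeLive = trySplice(older, live);
// Putting another timeline after the live one is refused.
const splicedLiveBeforeStray = trySplice(live, stray);
```

In this sketch a refused splice leaves both timelines untouched, so the live timeline is still usable the next time a sync appends to it.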
See element-hq/element-web#8593 (comment) Previously (#873) we allowed half-linking timelines to each other if they satisfy the conditions, however this appears to not be helping. Instead, it seems like the timelines are getting stuck in a position where one direction is spliced but the other is broken. To avoid this case, we'll just avoid splicing in both directions when one of the directions is invalid.
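The difference between half-linking and the all-or-nothing approach can be sketched as below. This is an illustrative model only: `LinkedTimeline`, `newTimeline`, and `linkAllOrNothing` are hypothetical names, and the validity condition (a slot is valid only if it is still empty) is an assumption rather than the check the SDK performs.

```typescript
// Hypothetical, simplified model (not the matrix-js-sdk API).
interface LinkedTimeline {
  id: string;
  prev: LinkedTimeline | null; // backward neighbour (older events)
  next: LinkedTimeline | null; // forward neighbour (newer events)
}

function newTimeline(id: string): LinkedTimeline {
  return { id, prev: null, next: null };
}

// Link `a` before `b`, but only if BOTH halves of the link are valid.
// Assumption: a half is valid when the slot it would write to is still empty.
function linkAllOrNothing(a: LinkedTimeline, b: LinkedTimeline): boolean {
  const forwardValid = a.next === null;  // a -> b
  const backwardValid = b.prev === null; // b -> a
  if (!forwardValid || !backwardValid) {
    return false; // one direction is invalid: write neither half
  }
  a.next = b;
  b.prev = a;
  return true;
}

const first = newTimeline("first");
const second = newTimeline("second");
const third = newTimeline("third");

const linkedFirstSecond = linkAllOrNothing(first, second);
// `second` already has a backward neighbour, so this link is refused and
// `third` is not left pointing at a timeline that does not point back.
const linkedThirdSecond = linkAllOrNothing(third, second);
```

A half-linking variant would have set `third.next` in the second call while leaving `second.prev` pointing at `first`, which is the one-direction-spliced, one-direction-broken state the change is trying to avoid.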
Only one rageshake with the new patches post-release so far: https://github.com/matrix-org/riot-web-rageshakes/issues/1403 Looks like the first error happened in 0004, and the user finally got frustrated enough to send a bug report. Looks to be a case of splicing timelines which are very close to each other, which may be an indicator of a race somewhere:
That's a 500ms difference between timelines. |
There hasn't been much activity in terms of rageshakes for this (even after we brought the rageshake server back online) - we are planning to presume this fixed without complaints saying otherwise. |
My Riot just blew up trying to (forward?)fill #irc:matrix.org, though I don't know if it is this bug. Will rageshake, and if one of the Riot devs thinks it's not this bug, please shout. |
@Half-Shot your client is missing all of the patches which are supposed to fix this |
Ahhhhhh kk |
Concludes element-hq/element-web#8593 We are no longer seeing this error being triggered, and are considering it fixed. As a result, the dialog can be removed to reduce the amount of dead code in the project.
There haven't been any complaints and no further rageshakes for a long while, which means I'm confident enough to say this is fixed. If the issue persists for people, please open a new issue. Leaving this open to track the removal of the prompt: matrix-org/matrix-react-sdk#2939 |
I've seen this ~3 times today, and Travis is getting bitten by it repeatedly.
Symptoms are stacktrace of:
And a message you try to send gets stuck in localecho state at the bottom of the timeline (in practice it successfully sends).