-
-
Notifications
You must be signed in to change notification settings - Fork 654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sending outbox messages is fraught with issues #3881
Comments
Huh fascinating! How does this one happen?
Definitely possible in principle: I think the scenario would be that a send request succeeds on the server, but the response doesn't make it to us; and the new-message event (which we make into Is that the kind of scenario you have in mind, or is there another path? All of these would be good to address. I imagine a single solution may address several of them at once. |
OK, and in particular:
This one is #2374. There's some nice discussion on that thread, too.
I think this is the same issue as #3731, describing another aspect of the symptoms. (Which would be a nice point to add to that thread.) |
On reread, I think that's actually not possible at the moment... but only thanks to other bugs. At least, not unless there's an exception-generating bug in
There is, sadly. The message-sending tasks are completely loose – their conclusion is unordered with the rest of the program, so there's no sequencing between them and This can all be fixed on our end, at least in theory. The scenario you described is another matter, and may require API changes to prevent. (Though
I would be very wary of any solution that did not. |
They're certainly closely related, but only because they'll need essentially the same data to be stored in order to fix. (See below.) It's otherwise technically possible to fix either without the other – and in particular I wouldn't want the fix for the network behavior to be blocked on having an appropriate display form. It's probably appropriate to copy my comment there over here, though:
|
Aha. There's actually a good reason this can't happen. That flag specifically appears at Quoting from /**
* Properties on the global store which we explicitly choose not to persist.
*
* All properties on the global store should appear either here or in the
* lists of properties we do persist, below.
*/
// prettier-ignore
export const discardKeys: Array<$Keys<GlobalState>> = [
'alertWords', 'caughtUp', 'fetching',
'nav', 'presence', 'session', 'topics', 'typing', 'userStatus',
]; That makes |
I see. I think you're referring to the fact that this export const trySendMessages = (dispatch: Dispatch, getState: GetState): boolean => {
// ...
try {
outboxToSend.forEach(async item => {
// ...
await api.sendMessage(auth, { is in a export const sendOutbox = () => async (dispatch: Dispatch, getState: GetState) => {
// ...
dispatch(toggleOutboxSending(true));
while (!trySendMessages(dispatch, getState)) {
await progressiveTimeout(); // eslint-disable-line no-await-in-loop
}
dispatch(toggleOutboxSending(false));
}; we set One nuance:
I don't think this is quite right. They'll all be fired twice before any responses to the send-attempts are received. But remember that So it should be the case that when you call Still, firing twice before seeing any responses is clearly not the desired behavior here either. |
Looking again at this code, I think the basic problem is that that In particular the boolean return from But also, if that loop is changed to be a real JS loop so that the
(For the third, it fixes the cause of it you had in mind, though not the other cause I then mentioned above.) So, seems like we should do that! I also went and took a look at the history of this file, following my speculations above about what the authors were thinking. 😉 The logic became the way it is in #3272 -- which in particular was supposed to fix #3259, sending messages out of order. Before that, #1079 introduced the |
Except that if we do – or at least, if we just do that – then that causes another bug to surface: the presence of an unsendable message in the Outbox will prevent any other messages from being sent for a week, after which all sendable messages (up to the next unsendable message, anyway) will suddenly be flushed to the server at once. (This would have been worse before the recent 'discard unsendable messages after a week' change. Back then it would just have silently prevented any further messages from ever being sent at all.) |
This is a pretty subtle aspect of how the app works. I've just pushed 664ee09, which adds a bit more discussion of this in a place which hopefully helps make it somewhat easier to discover. |
Yeah, we should behave differently on
In the latter case, we should retry with backoff, and keep other outbox messages in the queue behind this one, along the lines of what this code was intended to do. In the former, we should treat the message as permanently unsendable. One more-immediate thing we could do, to unblock fixing the rest of the logic without regressing anything, would be: on a 4xx status, move on to the next outbox message in the queue, but leave this one there, basically like we (accidentally) do in the current code. |
Write a complete replacement for `trySendMessages`. This is not yet hooked in. Will fix: zulip#3881, (others...)
This loop doesn't do what it looks like it does. It looks like a loop that tries to send one message, awaits that, then tries the next, and so on. In fact, because Array#forEach does nothing with the return values of its callback -- in particular it doesn't notice if they're promises or anything else, and doesn't await them -- the actual behavior is to fire off a bunch of requests in parallel, wait for nothing, and ignore errors. That behavior isn't good, and we should fix it; that's zulip#3881. For a start, just make the code more explicit about what it actually does: desugar the async/await keywords to the equivalent use of Promise#then, and add a comment for good measure. This also smooths the way to rewriting Array#forEach here as part of an automated sweep across our code.
This loop doesn't do what it looks like it does. It looks like a loop that tries to send one message, awaits that, then tries the next, and so on. In fact, because Array#forEach does nothing with the return values of its callback -- in particular it doesn't notice if they're promises or anything else, and doesn't await them -- the actual behavior is to fire off a bunch of requests in parallel, wait for nothing, and ignore errors. That behavior isn't good, and we should fix it; that's zulip#3881. For a start, just make the code more explicit about what it actually does: desugar the async/await keywords to the equivalent use of Promise#then, and add a comment for good measure. This also smooths the way to rewriting Array#forEach here as part of an automated sweep across our code.
This loop doesn't do what it looks like it does. It looks like a loop that tries to send one message, awaits that, then tries the next, and so on. In fact, because Array#forEach does nothing with the return values of its callback -- in particular it doesn't notice if they're promises or anything else, and doesn't await them -- the actual behavior is to fire off a bunch of requests in parallel, wait for nothing, and ignore errors. That behavior isn't good, and we should fix it; that's zulip#3881. For a start, just make the code more explicit about what it actually does: desugar the async/await keywords to the equivalent use of Promise#then, and add a comment for good measure. This also smooths the way to rewriting Array#forEach here as part of an automated sweep across our code.
It'll be easier to write a test for it, when we eventually attempt that, probably pending resolution of zulip#3881. In particular, without this change, Flow would complain if we tried to pass `store.dispatch` -- where `store` is mocked using redux-mock-store -- as the first argument, for which our custom `Dispatch` type is expected. I suspect this could be fixed by tightening up the redux-mock-store libdef's `Dispatch` type. But, for consistency's sake, `trySendMessages` should probably be a thunk action creator in any case. In e5268bb (and confirmed again with logging just now), we observed that the return value of our function (in this case a boolean) will be passed through and returned by the `dispatch` call.
It'll be easier to write a test for it, when we eventually attempt that, probably pending resolution of zulip#3881. In particular, without this change, Flow would complain if we tried to pass `store.dispatch` -- where `store` is mocked using redux-mock-store -- as the first argument, for which our custom `Dispatch` type is expected. I suspect this could be fixed by tightening up the redux-mock-store libdef's `Dispatch` type. But, for consistency's sake, `trySendMessages` should probably be a thunk action creator in any case. In e5268bb (and confirmed again with logging just now), we observed that the return value of our function (in this case a boolean) will be passed through and returned by the `dispatch` call.
I think the next step, when we pick this thread of work up again, will be to take that chat thread from 2020-12 and make a fresh draft of the proposed design. (@chrisbobbe sent a design up at the top of the thread, and then there was further discussion but nobody has compiled a revised draft all in one place.) That will include a state diagram, akin to this one, and a type definition, like this one crossed with the full draft from the start of the thread. |
I've picked this back up again, here. I've written a new state diagram but I'd like to go through some concerns with it before doing the other parts. |
Here's our latest state diagram, from discussion:
|
I just had a situation where a user sending 4 messages turned into 20 due to a connectivity issue with the server... I assume that #5525 is also related to this. |
…Error This is our minimal support for zulip#5870 in this legacy codebase. Supporting it properly with a client-side check of the setting is more effort than we can spare right now. Probably more error handling is called for, but zulip#3881 ("Sending outbox messages is fraught with issues") is complicated and it's probably best to leave it be. Fixes: zulip#5870
…Error This is our minimal support for zulip#5870 in this legacy codebase. Supporting it properly with a client-side check of the setting is more effort than we can spare right now. Probably more error handling is called for in general (like for network or server issues), but zulip#3881 ("Sending outbox messages is fraught with issues") is complicated and it's probably best to leave it be. Fixes: zulip#5870
…Error This is our minimal support for zulip#5870 in this legacy codebase. Supporting it properly with a client-side check of the setting is more effort than we can spare here, because it requires implementing the group-based permissions system. Probably more error handling is called for in general (like for network or server issues), but zulip#3881 ("Sending outbox messages is fraught with issues") is complicated and it's probably best to leave it be. Fixes: zulip#5870
There is a small constellation of intertwined problems with message-sending.
The text was updated successfully, but these errors were encountered: