-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: keep alive when dispatch fails #524
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good to me
We want to be careful not to add unnecessary configuration options. In this case, my intuition is that we can build a solution that just works for all users. What would be a permanent solution that doesn't involve adding a configuration option? |
The commit message "disableOnFail" doesn't really describe what the change does. Could you re-name the commit message to be more descriptive? It sounds like we are trying to solve the problem of recovering from network problems. |
For zero configuration, we can remove |
To add we should not disable RUM for 401 (Unauthorized error) , I had the usecase where due to invalid/expired token , fetch is not able to redirect to token provider website and refresh the token. |
Looks like our docs may have to be updated. Another definition could be: disable on everything that isn't 429 or 5xx. |
Revision: I have updated the PR so that RUM only disables when Dispatch fails with 403 or 404. Those are the only status codes in which we know RUM will never succeed. In all other cases, RUM should stay alive. |
So now, if my understanding is correct, the behavior is to:
I have two questions.
Those questions aren't blocking. I think this is a good change and can be merged. |
Almost, RUM only retries on 429 and 5xx. Then the payload is dropped.
RUM is not sending 401 status codes, so I didn't include it. But I can include that in case someone's proxy uses that code.
Dropping payloads is not the behavior we want long term, but I haven't implemented it because there are still open questions: A. How should dropped events be reinserted into the queue and what priority should they be given against the limit? |
Added 401! |
* fix: Allow title override in page attributes (#508) * chore(deps-dev): bump ip from 1.1.5 to 1.1.9 (#513) Bumps [ip](https://github.com/indutny/node-ip) from 1.1.5 to 1.1.9. - [Commits](indutny/node-ip@v1.1.5...v1.1.9) --- updated-dependencies: - dependency-name: ip dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump follow-redirects from 1.15.4 to 1.15.6 (#525) Bumps [follow-redirects](https://github.com/follow-redirects/follow-redirects) from 1.15.4 to 1.15.6. - [Release notes](https://github.com/follow-redirects/follow-redirects/releases) - [Commits](follow-redirects/follow-redirects@v1.15.4...v1.15.6) --- updated-dependencies: - dependency-name: follow-redirects dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * chore(deps-dev): bump webpack-dev-middleware from 6.0.1 to 6.1.2 (#528) Bumps [webpack-dev-middleware](https://github.com/webpack/webpack-dev-middleware) from 6.0.1 to 6.1.2. - [Release notes](https://github.com/webpack/webpack-dev-middleware/releases) - [Changelog](https://github.com/webpack/webpack-dev-middleware/blob/v6.1.2/CHANGELOG.md) - [Commits](webpack/webpack-dev-middleware@v6.0.1...v6.1.2) --- updated-dependencies: - dependency-name: webpack-dev-middleware dependency-type: direct:development ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: retry with exponential backoff (#501) * feat: retry with exponential backoff * fix * test * feat: limit retries to 5xx and 429 (#500) * feat: limit PutRumEvents retry to 5xx * feat: add 429 * cleanup * cleanup * docs * doc * fix: fmt * chore(deps-dev): bump express from 4.18.1 to 4.19.2 (#531) Bumps [express](https://github.com/expressjs/express) from 4.18.1 to 4.19.2. - [Release notes](https://github.com/expressjs/express/releases) - [Changelog](https://github.com/expressjs/express/blob/master/History.md) - [Commits](expressjs/express@4.18.1...4.19.2) --- updated-dependencies: - dependency-name: express dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * feat: keep alive when dispatch fails (#524) * feat: disableOnFail * doc: add config doc * Revert "doc: add config doc" This reverts commit ec1f74f. * Revert "feat: disableOnFail" This reverts commit 67ed7b3. * feat: only disable RUM when dispatch fails with 403 or 404 * chore: add unit tests * feat: add 401 * fix: record resources with invalid names (#532) * fix: Ignore URL construction error from invalid performance resource event * fix: Throw error when URL construction fails for invalid performance resource event * fix: Ignore error thrown from URL construction * test: add unit test * fix: record resources with invalid names * fix: Update unit test for invalid url * fix: Update hostname typo in isPutRumEventsCall tests --------- Co-authored-by: Billy <[email protected]> * chore(release): 1.17.2 * Revert "Sync changes from main for 1.17.2 patch release" * fix: record resources with invalid names (#532) * fix: Ignore URL construction error from invalid performance resource event * fix: Throw error when URL construction fails for invalid performance resource event * fix: Ignore error thrown from URL construction * test: add unit test * fix: record resources with invalid names * fix: Update unit test for invalid url * fix: Update hostname typo in isPutRumEventsCall tests --------- Co-authored-by: Billy <[email protected]> * Update changelog --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Quinn Hanam <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Billy <[email protected]>
Looks like I missed this, although I already pointed out why we are still disabling the RUM for 401 code, 401 is a valid usecase where the proxy server might return with (Unauthorized Token) and browser might retry later with a valid token. Please help to remove 401. |
Addresses an immediate problem in #521, highlighted in #521 (comment)
To summarize, RUM currently disables itself when the attempt to PutRumEvents (PRT) fails and all retries have been exhausted. The intention was to avoid giving the web application needless errors when we know that RUM the network connection is disabled anyways.
The problem is that sometimes the network connection recovers, and RUM will experience data loss because it is needlessly disabled. This PR solves that problem by adding configuration for
disableOnFail
(default true). When site operator decides to set this to false, then RUM will not disable on PRT failure, and be able to resume sending events when the network connection recovers.This PR should be shipped out with #500 and #501 to limit the number of unnecessary retries and errors.
In the long run, RUM should listen for
online
andoffline
DOM events (MDN docs). When offline, we should write events to localStorage instead of the PRT endpoint. When online,Dispatch
should attempt to combine the current batch with events recorded offline.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.