mixin(Store): handle ResourceExhausted as a non-server error #6218
Signed-off-by: Douglas Camata <[email protected]>
Thanks for the fix. Looks like the docs job failure is related to the changes.
Signed-off-by: Douglas Camata <[email protected]>
@fpetkovski fixed in aa2ade6, thanks for the heads up.
Do we want to have a panel for client errors as well? My only concern is that we have no good way to monitor this status code now.
@fpetkovski the dashboard is a whole different problem that I plan to tackle soon, which is why this PR only changes alerts and rules. We have a dashboard widget that shows different error time series (see below), but all of them are labelled "error". The pattern repeats in the gRPC and HTTP error charts. It's not easy to fix without breaking something else because the functions are heavily reused ( thanos/examples/dashboards/receive.json Lines 148 to 152 in 2d6b0d4
I agree it makes no sense to include this as a server error, although I'm also wary of #6218 (review) (but sounds like @douglascamata has a plan 😎).
Co-authored-by: Matej Gera <[email protected]> Signed-off-by: Douglas Camata <[email protected]>
@matej-g @fpetkovski if I cannot find a good way of fixing all the widgets once and for all, I will definitely patch out the gRPC errors widget to not account for
Just some remaining nits 👍
Signed-off-by: Douglas Camata <[email protected]>
…usted-fix Signed-off-by: Douglas Camata <[email protected]>
@matej-g finally got the build to be all green, without any intermittent failures. PTAL. 🙇
Co-authored-by: Philip Gough <[email protected]> Signed-off-by: Douglas Camata <[email protected]>
Signed-off-by: Douglas Camata <[email protected]>
Small nit for changelog entry, otherwise good to go 🎸
Signed-off-by: Douglas Camata <[email protected]>
Changes
Handle ResourceExhausted as a non-server error instead of as a server error. This error is returned whenever a request is limited, so it shouldn't contribute to triggering the ThanosStoreGrpcErrorRate alert.
Verification
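As a rough illustration of the idea in the Changes description, here is a minimal Go sketch of an error-ratio calculation that leaves ResourceExhausted out of the server error set. It is not the mixin code from this PR: the names serverErrorCodes, isServerError, and serverErrorRatio are hypothetical, and the exact list of codes counted as server errors is an assumption made for the example.

```go
package main

import "fmt"

// serverErrorCodes is a hypothetical set of gRPC status codes counted as
// server-side failures. The exact membership is an assumption for this
// sketch; the point is that ResourceExhausted is deliberately absent,
// because it signals a limited request rather than a broken server.
var serverErrorCodes = map[string]bool{
	"Unknown":          true,
	"Internal":         true,
	"Unavailable":      true,
	"DataLoss":         true,
	"DeadlineExceeded": true,
}

// isServerError reports whether a gRPC code name counts as a server error.
func isServerError(code string) bool {
	return serverErrorCodes[code]
}

// serverErrorRatio returns the fraction of handled requests that ended with
// a server error code. handled maps a gRPC code name to a request count.
func serverErrorRatio(handled map[string]float64) float64 {
	var total, failed float64
	for code, n := range handled {
		total += n
		if isServerError(code) {
			failed += n
		}
	}
	if total == 0 {
		return 0
	}
	return failed / total
}

func main() {
	// Made-up counts: 8 limited requests do not inflate the error ratio.
	handled := map[string]float64{"OK": 90, "ResourceExhausted": 8, "Internal": 2}
	fmt.Printf("%.2f\n", serverErrorRatio(handled))
}
```

With these made-up counts only the 2 Internal responses count as failures out of 100 handled requests, so a burst of limited requests alone would not push the ratio toward an alert threshold.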