services/horizon: Add new metrics counters for db connection close events #5225
Instead of having three separate Prometheus counters for these specific errors, I think it would be better to have one Prometheus counter for all possible Postgres errors returned by the driver. This could be implemented with a label on the counter that represents the Postgres error code. Then you can wrap the session functions with code that checks whether the returned error is a Postgres server error (https://stackoverflow.com/questions/37560534/does-the-error-returned-by-db-exec-have-a-code) and, in that case, increments the metric with the appropriate error code label.
I reworked it per the suggestion to use a single metric with labels. The new metrics-gathering routine attempts to get the Postgres server error code if libpq provides it.
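For readers following the thread, here is a minimal sketch of the single-counter-with-label idea discussed above. It is not the code from this PR: `pgError` is a hypothetical stand-in for the driver's server error type, `errorCounter` is a hypothetical stand-in for a Prometheus counter with one error-code label, and `observe` is an illustrative name for the wrapper around a session call. The sketch uses only the standard library so it runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pgError is a hypothetical stand-in for the driver's server error type.
// Code holds the five-character SQLSTATE code, e.g. "57014" (query_canceled).
type pgError struct {
	Code    string
	Message string
}

func (e *pgError) Error() string { return "pq: " + e.Message }

// errorCounter is a hypothetical stand-in for a labeled Prometheus counter:
// one count per error-code label value.
type errorCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newErrorCounter() *errorCounter {
	return &errorCounter{counts: make(map[string]int)}
}

// inc increments the count for the given error-code label.
func (c *errorCounter) inc(code string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[code]++
}

// get returns the current count for the given error-code label.
func (c *errorCounter) get(code string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[code]
}

// observe wraps a session call. If the returned error is, or wraps, a
// Postgres server error, the counter for its code is incremented. A nil
// error is not counted. The original error is returned unchanged.
func (c *errorCounter) observe(fn func() error) error {
	err := fn()
	var pgErr *pgError
	if errors.As(err, &pgErr) {
		c.inc(pgErr.Code)
	}
	return err
}

func main() {
	counter := newErrorCounter()

	// Simulate a session call that fails with a wrapped server error.
	err := counter.observe(func() error {
		return fmt.Errorf("exec failed: %w", &pgError{
			Code:    "57014",
			Message: "canceling statement due to statement timeout",
		})
	})

	fmt.Println("error:", err)
	fmt.Println("count for 57014:", counter.get("57014"))
}
```

Using `errors.As` rather than a direct type assertion means the code label is still found when the session layer wraps the driver error before returning it.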
…g on metrics mapping
…mapping to metric condition
Looks great! I just left one comment about capturing the case where the context error is nil: #5225 (comment)
What
Added new db layer metrics:
client_closed_session_total
server_timeout_closed_session_total
statement_timeout_closed_session_total
Why
Obtain insight into which types of events underlie db session timeouts.
Closes #5217
Known limitations