Client: add metric for failed RPC calls to a consul server #4220
@guidoiaquinti Thanks for the PR. I just have the one question but otherwise everything looks good.
agent/consul/client.go
Outdated
@@ -276,6 +276,7 @@ TRY:

 	// Move off to another server, and see if we can retry.
 	c.logger.Printf("[ERR] consul: %q RPC failed to server %s: %v", method, server.Addr, rpcErr)
+	metrics.IncrCounter([]string{"client", "rpc", "failed"}, 1)
Would it be useful to add a label here for the server that the RPC failed on? It seems like a good idea, because the problem is likely on the server that failed to handle the RPC. What do you think, @guidoiaquinti?
Maybe something like this:
metrics.IncrCounterWithLabels([]string{"client", "rpc", "failed"}, 1, []metrics.Label{{Name: "server", Value: server.ID}})
Not sure whether server.ID, server.Name or server.Addr.String() is most appropriate.
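To illustrate what the label buys us, here is a minimal, stdlib-only sketch of a labeled counter. Everything in it (`LabeledCounter`, `Label`, `seriesKey`, and the server names) is a hypothetical stand-in for illustration and is not how the metrics library stores its data; it only shows that failures end up counted per server instead of in one aggregate number.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Label is a hypothetical stand-in for a metric label (a name/value pair).
type Label struct {
	Name  string
	Value string
}

// LabeledCounter is a hypothetical in-memory counter that keeps one
// series per unique combination of metric key and labels.
type LabeledCounter struct {
	mu     sync.Mutex
	counts map[string]float32
}

// NewLabeledCounter returns an empty counter.
func NewLabeledCounter() *LabeledCounter {
	return &LabeledCounter{counts: make(map[string]float32)}
}

// seriesKey flattens the metric key and its labels into one map key.
// Labels are sorted so their order does not create separate series.
func seriesKey(key []string, labels []Label) string {
	parts := make([]string, 0, len(labels))
	for _, l := range labels {
		parts = append(parts, l.Name+"="+l.Value)
	}
	sort.Strings(parts)
	return strings.Join(key, ".") + "{" + strings.Join(parts, ",") + "}"
}

// IncrCounterWithLabels adds val to the series identified by key and labels.
func (c *LabeledCounter) IncrCounterWithLabels(key []string, val float32, labels []Label) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[seriesKey(key, labels)] += val
}

// Get returns the current value of the series identified by key and labels.
func (c *LabeledCounter) Get(key []string, labels []Label) float32 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[seriesKey(key, labels)]
}

func main() {
	c := NewLabeledCounter()
	key := []string{"client", "rpc", "failed"}

	// Made-up server names: two failures on one server, one on another.
	s1 := []Label{{Name: "server", Value: "consul-server-1"}}
	s2 := []Label{{Name: "server", Value: "consul-server-2"}}
	c.IncrCounterWithLabels(key, 1, s1)
	c.IncrCounterWithLabels(key, 1, s1)
	c.IncrCounterWithLabels(key, 1, s2)

	fmt.Println(seriesKey(key, s1), c.Get(key, s1))
	fmt.Println(seriesKey(key, s2), c.Get(key, s2))
}
```

With an unlabeled counter, the three failures above would collapse into a single value of 3; with the label, a dashboard can show which server the failures cluster on.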
server.Name gets my vote: it must be unique at any point in time and is human readable. Metric labels containing UUIDs are pretty unpleasant to use in practice for at-a-glance analysis.
Sorry, I read your comment after pushing my commit. I've changed the label to server.Name as you suggested, @banks.
Note: we could also add both of them 🤷♂️
Looks good to me.
Add metric for failed RPC calls to a consul server