-
Notifications
You must be signed in to change notification settings - Fork 93
Add basic monitoring server to Mixer. #708
Add basic monitoring server to Mixer. #708
Conversation
This PR establishes a server within Mixer for providing some basic monitoring and version information for debugging. This solution is meant ONLY to be a short term patch while we iterate on a more thorough and complete design for health and status reporting for Mixer as well as for self-monitoring of Istio components. It provides two basic endpoints: - `/metrics` which exposes data from the default collector for prometheus. This should include stats on number of goroutines and cpu usage, etc. - `version` which provides a quick way to get a snapshot of the version info for Mixer.
Jenkins job mixer/presubmit passed |
1 similar comment
Jenkins job mixer/presubmit passed |
Jenkins job mixer/manager-regression passed |
1 similar comment
Jenkins job mixer/manager-regression passed |
cmd/server/cmd/server.go
Outdated
@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command { | |||
} | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API") | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API") | |||
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information") | |||
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring") | |||
serverCmd.PersistentFlags().StringVar(&sa.versionPath, "versionPath", "/version", "Request path for version info for mixer self-monitoring") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think all these paths should be configurable.
We may want the context root to be configurable so "/health" or /status"
and then we can have
/status/metrics and /status/version underneath it.
we should just use the statusz/healthz
endpoint paths that we will eventually support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we know what paths we will eventually support. These don't have to have a common root. I can hardcode them for now, if that is preferable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on hardcoding (also /metrics).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hardcoded is good.
cmd/server/cmd/server.go
Outdated
// is coming. that design will include proper coverage of statusz/healthz type | ||
// functionality, in addition to how mixer reports its own metrics. | ||
srvMux := http.NewServeMux() | ||
srvMux.Handle(sa.metricsPath, promhttp.Handler()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the default promhttp handler give us all the goodies we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It uses the default gatherer, which includes two basic collectors for free:
NewProcessCollector returns a collector which exports the current state of process metrics including cpu, memory and file descriptor usage as well as the process start time for the given process id under the given namespace.
NewGoCollector returns a collector which exports metrics about the current go process.
I think this is good enough to start (especially for a bridge solution to a more full-fledged solution).
cmd/server/cmd/server.go
Outdated
monitoring := &http.Server{Addr: fmt.Sprintf(":%d", sa.monitoringPort), Handler: srvMux} | ||
printf("Starting self-monitoring on port %d", sa.monitoringPort) | ||
go func() { | ||
if monErr := monitoring.Serve(monitoringListener.(*net.TCPListener)); monErr != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason why this cannot be glog.Fatal (monitoring.Server ...)
or similar ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no. but, do we want to make a failure to start the monitoring service fatal?
cmd/server/cmd/server.go
Outdated
@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command { | |||
} | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API") | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API") | |||
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information") | |||
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use consecutive ports by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can use whatever ports we think are reasonable. is your suggestion:
- grpc: 9091
- config: 9092
- monitoring: 9093
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid adding a new port instead ? AFAIK grpc allows adding http handlers.
( I assume the config API will move/migrate to manager or run as separate service at some point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grpc-go reusing the same port is a mess. perhaps most troubling: grpc/grpc-go#586.
Jenkins job mixer/e2e-smoketest passed |
1 similar comment
Jenkins job mixer/e2e-smoketest passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - few cosmetic comments.
cmd/server/cmd/server.go
Outdated
@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command { | |||
} | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API") | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API") | |||
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information") | |||
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid adding a new port instead ? AFAIK grpc allows adding http handlers.
( I assume the config API will move/migrate to manager or run as separate service at some point)
"os" | ||
"strings" | ||
"time" | ||
|
||
bt "github.com/opentracing/basictracer-go" | ||
"github.com/prometheus/client_golang/prometheus/promhttp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also add _ "expvar" ? It doesn't hurt, and mostly free. Just needs the default http handler to be exposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added it for now, though I am skeptical it provides more detail than the newly-exposed /metrics endpoint.
cmd/server/cmd/server.go
Outdated
@@ -89,6 +94,9 @@ func serverCmd(printf, fatalf shared.FormatFn) *cobra.Command { | |||
} | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.port, "port", "p", 9091, "TCP port to use for Mixer's gRPC API") | |||
serverCmd.PersistentFlags().Uint16VarP(&sa.configAPIPort, "configAPIPort", "", 9094, "HTTP port to use for Mixer's Configuration API") | |||
serverCmd.PersistentFlags().Uint16Var(&sa.monitoringPort, "monitoringPort", 1337, "HTTP port to use for the exposing mixer self-monitoring information") | |||
serverCmd.PersistentFlags().StringVar(&sa.metricsPath, "metricsPath", "/metrics", "Request path for metrics data for mixer self-monitoring") | |||
serverCmd.PersistentFlags().StringVar(&sa.versionPath, "versionPath", "/version", "Request path for version info for mixer self-monitoring") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on hardcoding (also /metrics).
cmd/server/cmd/server.go
Outdated
// is coming. that design will include proper coverage of statusz/healthz type | ||
// functionality, in addition to how mixer reports its own metrics. | ||
srvMux := http.NewServeMux() | ||
srvMux.Handle(sa.metricsPath, promhttp.Handler()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious - any reason we're not using the default mux ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not particularly. updating to use default.
PTAL. I believe I have addressed most of the review comments. Changes:
|
Jenkins job mixer/presubmit passed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Jenkins job mixer/manager-regression passed |
Jenkins job mixer/e2e-smoketest passed |
This PR establishes a server within Mixer for providing some basic
monitoring and version information for debugging. This solution is
meant ONLY to be a short term patch while we iterate on a more
thorough and complete design for health and status reporting for Mixer
as well as for self-monitoring of Istio components.
It provides two basic endpoints:
/metrics
which exposes data from the default collector forprometheus. This should include stats on number of goroutines and cpu
usage, etc.
version
which provides a quick way to get a snapshot of the versioninfo for Mixer.
Example snippet from
/metrics
:Example output from
/version
:This change is