Initial gmp exporter #342
Conversation
This is ready for review.
Now that the integration tests have a dependency on exporterhelper, some additional self-observability metrics show up in the integration tests.
I confirmed that metrics from the GMP integration test are queryable via the "Managed Service for Prometheus" UI in the GCP console.
RetrySettings: exporterhelper.NewDefaultRetrySettings(),
QueueSettings: exporterhelper.NewDefaultQueueSettings(),
GMPConfig: GMPConfig{
    UserAgent: "opentelemetry-collector-contrib {{version}}",
Note GMP convention for setting user agents: https://github.com/GoogleCloudPlatform/prometheus-engine/blob/4adcfbad6e704e02ea0f4c76e14f197fa8ec78c7/pkg/export/export.go#L217
May want to be consistent here.
+1 to a consistent layout, however also +1 to making sure we know "opentelemetry" is involved somewhere in the UA (for diagnosing bugs).
Changed to opentelemetry-collector-contrib/{version} to match that formatting.
Do you also want to structure it so it can be extended to include the source in the future? e.g., it may be useful to know whether the metrics are coming from Ops Agent vs otherwise.
It is settable via config options. The ops-agent could set it to user_agent: ops-agent-gmp/{{version}} if they wanted, for example.
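For readers following along, here is a minimal sketch of how the config and default user agent might fit together, based only on the fields quoted in the diff above. The package name, struct layout, mapstructure tags, and factory function name are assumptions, not the PR's exact code.

package googlemanagedprometheusexporter // assumed package name

import "go.opentelemetry.io/collector/exporter/exporterhelper"

// GMPConfig holds the GMP-specific settings quoted above; the exact field set
// in the PR may differ.
type GMPConfig struct {
    UserAgent string `mapstructure:"user_agent"`
}

// Config combines the GMP settings with the standard exporterhelper settings.
type Config struct {
    exporterhelper.RetrySettings `mapstructure:"retry_on_failure"`
    exporterhelper.QueueSettings `mapstructure:"sending_queue"`
    GMPConfig                    `mapstructure:",squash"`
}

// createDefaultConfig mirrors the defaults shown in the diff above, with the
// user agent in the "<name>/<version>" form discussed in this thread.
func createDefaultConfig() *Config {
    return &Config{
        RetrySettings: exporterhelper.NewDefaultRetrySettings(),
        QueueSettings: exporterhelper.NewDefaultQueueSettings(),
        GMPConfig: GMPConfig{
            UserAgent: "opentelemetry-collector-contrib/{version}",
        },
    }
}

A downstream distribution such as the ops-agent would then only need to override user_agent in its collector configuration, as mentioned above.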
@@ -0,0 +1,115 @@
// Copyright 2022 Google LLC
Unit testing is useful, but it may be valuable to run the GMP collector and the GMP exporter side-by-side in the same testing environment and compare the exported metrics, to see whether the configured resource labels for the GMP exporter match expectations (and resemble the GMP collector where it makes sense).
Sounds good. We can definitely do that. If there are particular setups you would like me to test, let me know.
Confirmed that summary metrics appear correctly in the Managed Prometheus PromQL window.
return &monitoredrespb.MonitoredResource{
    Type: "prometheus_target",
    Labels: map[string]string{
        "location": getStringOrEmpty(attrs, semconv.AttributeCloudAvailabilityZone),
Thanks. I'd like to call attention to this comment block: https://github.com/GoogleCloudPlatform/prometheus-engine/blob/main/pkg/export/setup/setup.go#L77. Specifically, on GKE we may not always be able to use the underlying node's zone as the location, because we may have both a zonal cluster and a regional cluster in the same project with the same name; if we use the node's zone as the location, we'll end up creating conflicts.
Labels: map[string]string{
    "location": getStringOrEmpty(attrs, semconv.AttributeCloudAvailabilityZone),
    "cluster": getStringOrEmpty(attrs, semconv.AttributeK8SClusterName),
    "namespace": getStringOrEmpty(attrs, semconv.AttributeK8SNamespaceName),
Thanks. If users do configure a relabel rule to set the namespace label, which takes precedence?
"cluster": getStringOrEmpty(attrs, semconv.AttributeK8SClusterName), | ||
"namespace": getStringOrEmpty(attrs, semconv.AttributeK8SNamespaceName), | ||
"job": job, | ||
"instance": getStringOrEmpty(attrs, semconv.AttributeServiceInstanceID), |
I generally see the OT setup with the Prometheus receiver and GMP exporter as resembling the self-deployed GMP experience rather than the managed experience. If so, using whatever instance label is set by the user's Prometheus configuration makes sense, as is done here, but we should note this creates a different experience from managed GMP. cc/ @jsuereth
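Side note for readers: getStringOrEmpty, used throughout the snippets above, presumably just reads a resource attribute with an empty-string fallback. A sketch of that helper, assuming the collector's pdata/pcommon attribute API; the PR's actual signature may differ:

// assumes: import "go.opentelemetry.io/collector/pdata/pcommon"

// getStringOrEmpty returns the string value of the given resource attribute,
// or "" if the attribute is not set. Sketch only.
func getStringOrEmpty(attrs pcommon.Map, key string) string {
    if val, ok := attrs.Get(key); ok {
        return val.StringVal()
    }
    return ""
}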
I think this should be all ready to go. @xichen2020 can you take a final pass?
job = serviceNamespace + "/" + job
}
location := getStringOrEmpty(attrs, semconv.AttributeCloudAvailabilityZone)
if location == "" {
Can you add a TODO and link to the issue mentioning the resource detection logic fix? Note that's also a breaking change for the exporter once merged here.
Nothing needs to be done in the exporter, which already has the correct behavior. Once the fix is in the resource detector, it will work correctly without any additional changes here.
Does that mean the GKE resource detector will use the cluster-location instance attribute, determine whether the location is a zone or a region (how feasible is this?), and set the zone/region attribute?
Yes, that is exactly what it will do. I have a WIP detector that does this for SDK detection if you are curious. It should be feasible to do.
I see. Would it be more robust to add a new location resource attribute and use that? That said, I don't know how difficult it is to do so.
Resource attributes are defined by the otel semantic conventions: https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/resource/cloud.yaml#L45. Right now, it defines zone + region. I suspect it would be difficult to convince them to add a third convention, "location", which is the zone or region.
I see. It seems possible to add this as a GKE-specific attribute, but maybe that's overkill.
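To make the fallback in the truncated snippet above concrete, here is a sketch of a zone-then-region lookup. The helper name and the region fallback are assumptions based on this thread, not necessarily the PR's exact behavior; it reuses getStringOrEmpty and the semconv constants quoted above.

// resourceLocation prefers the availability zone and falls back to the region
// (e.g., for regional GKE clusters). Hypothetical helper; sketch only.
func resourceLocation(attrs pcommon.Map) string {
    if zone := getStringOrEmpty(attrs, semconv.AttributeCloudAvailabilityZone); zone != "" {
        return zone
    }
    return getStringOrEmpty(attrs, semconv.AttributeCloudRegion)
}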
Labels: map[string]string{
    "location": getStringOrEmpty(attrs, semconv.AttributeCloudAvailabilityZone),
    "cluster": getStringOrEmpty(attrs, semconv.AttributeK8SClusterName),
    "namespace": getStringOrEmpty(attrs, semconv.AttributeK8SNamespaceName),
Using the target namespace is usually okay, but I want to mention the kube-state-metrics use case, where the namespace of the metric (not the namespace of the target) is more appropriate as the resource label. I'll leave it to you whether that's a use case the exporter should support.
"cluster": getStringOrEmpty(attrs, semconv.AttributeK8SClusterName), | ||
"namespace": getStringOrEmpty(attrs, semconv.AttributeK8SNamespaceName), | ||
"job": job, | ||
"instance": getStringOrEmpty(attrs, semconv.AttributeServiceInstanceID), |
Understood. I was more saying managed GMP does it behind the scenes, whereas self-deployed GMP users would need to configure the relabeling rules themselves.
The new exporter adds type suffixes and maps to the prometheus_target monitored resource. It wraps the configuration exposed by the standard googlecloud exporter and only exposes a subset of that configuration. The remaining fields are hard-coded to ensure a GMP-compatible export.
Customizing the reset logic will be done in a follow-up change after #360 merges.
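As an illustration of the "adds type suffixes" point (not the PR's actual code): GMP carries the Prometheus metric type as a suffix on the metric name. The sketch below assumes the pdata/pmetric data-type enum of this era; a real mapping would also need to account for sum monotonicity and temporality.

// assumes: import "go.opentelemetry.io/collector/pdata/pmetric"

// withTypeSuffix sketches how a metric name could get a GMP-style type suffix,
// e.g. a counter "http_requests_total" becomes "http_requests_total/counter".
// Illustration only; the PR's mapping logic is more involved.
func withTypeSuffix(name string, dataType pmetric.MetricDataType) string {
    switch dataType {
    case pmetric.MetricDataTypeSum:
        return name + "/counter"
    case pmetric.MetricDataTypeGauge:
        return name + "/gauge"
    case pmetric.MetricDataTypeHistogram:
        return name + "/histogram"
    case pmetric.MetricDataTypeSummary:
        return name + "/summary"
    default:
        return name + "/unknown"
    }
}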