Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment Variable Resource Detection in Kubernetes #195

Closed
wants to merge 13 commits into from

Conversation

dashpole
Copy link
Contributor

This OTEP proposes using a consistent set of environment variables to detect resource attributes available through the Kubernetes downward API. The Kubernetes downward API is designed to serve this function and is the most consistent way to detect Kubernetes resource attributes. This will improve user experience, and enable re-use of the same Kubernetes configuration across languages and when using the collector.

@dashpole
Copy link
Contributor Author

cc @Aneurysm9 @pmm-sumo @dmitryax
who might be interested

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting it together!

text/0195-kubernetes-resource-detection.md Outdated Show resolved Hide resolved
Copy link

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for reference we currently use OTEL_RESOURCE_ATTRIBUTES in the operator

https://github.com/open-telemetry/opentelemetry-operator/blob/a94a13888165ec688a5f83d4e444e9fbc934fca2/pkg/instrumentation/sdk.go#L144

If there were different recognized variables we would probably switch to it though, as far as I can tell it's not practical to set that purely with YAML so having dedicated env vars seems worth it.

@pmm-sumo
Copy link

I think that the downward API can currently also expose labels and annotations. While it's not included in the semantic conventions, k8sattributesprocessor can currently extract those into resource attributes (and this information is frequently very useful).

I am wondering if there's a space to include such capability via env resource detection. Currently, the labels/annotations are filtered by k8sattributesprocessor but perhaps we could include those in a separate namespace and let filtering happen (if needed) using e.g. resourceprocessor

@dashpole
Copy link
Contributor Author

dashpole commented Jan 28, 2022

If there were different recognized variables we would probably switch to it though, as far as I can tell it's not practical to set that purely with YAML so having dedicated env vars seems worth it.

Oh, wow. I just realized you can define dependent env vars, which might be a better solution than i'm proposing.

I could add:

- name: OTEL_RESOURCE_ATTRIBUTES
  value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME)

to achieve the goals of this proposal, without requiring any detectors to be installed. I've added this as an alternative for now.

@mat-rumian
Copy link

@dashpole I would like to share my opinion. What do you think about configuring additional env var holding resource attributes related to k8s in separated var like:

- name: OTEL_RESOURCE_ATTRIBUTES_K8S
  value: k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID),k8s.namespace.name=$(K8S_NAMESPACE_NAME),k8s.node.name=$(K8S_NODE_NAME)

and use it this way:

- name: OTEL_RESOURCE_ATTRIBUTES
  value: $(OTEL_RESOURCE_ATTRIBUTES_K8S),other=attribs...

I think it will be simpler, transparent and easier to handle and modify.

@dashpole
Copy link
Contributor Author

dashpole commented Jan 28, 2022

I think that the downward API can currently also expose labels and annotations. ... I am wondering if there's a space to include such capability via env resource detection.

If we went with the current proposal (explicitly defined env vars that are detected with a kubernetes detector), we could also support K8S_LABELS and K8S_ATTRIBUTES. The kubernetes detector could accept a configuration option to allow elevating parsed labels or attributes to resource attributes.

If we used dependent environment variables, as we are discussing above (no detector needed), users could just fetch labels and attributes themselves:

- name: MY_CUSTOM_ATTRIBUTE
   valueFrom:
     fieldRef:
       fieldPath: metadata.labels['my-custom-attribute']
- name: OTEL_RESOURCE_ATTRIBUTES
  value: other.semantic.convention=$(MY_CUSTOM_ATTRIBUTE)

There are a lot of labels/attributes that would come in if the resourcedetectionprocessor added them be default. Filtering later seems tedious (e.g. what if my cluster admin adds a new label)...

WDYT?

@dashpole
Copy link
Contributor Author

dashpole commented Jan 28, 2022

What do you think about configuring additional env var holding resource attributes related to k8s in separated var like ...

That seems like a perfectly reasonable thing to do, practically speaking. In terms of this proposal, I think the core question is whether or not we want to specify a new resource detector. I'd prefer to keep the yaml simpler for readability, if thats acceptable.


Many kubernetes detectors currently use `HOSTNAME` environment variable, which defaults to the Pod name. However, the `HOSTNAME` can be [modified in a few ways](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-hostname-and-subdomain-fields) in the pod spec. Kubernetes resource detectors may fall back to detecting the pod name using `HOSTNAME` if `K8S_POD_NAME` is not available, but this may cause user confusion in some cases.

### Alternative: Using Dependent Environment Variables
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tigrannajaryan
@jsuereth mentioned to me that this approach may not interact well with schemas, since OTEL_RESOURCE_ATTRIBUTES is merged into resource attributes without any notion of version according to the specification. Using this approach would seem to remove the benefits of #161. Is there a way to make OTEL_RESOURCE_ATTRIBUTES produce resource attributes associated with a particular version of the semantic conventions?

text/0195-kubernetes-resource-detection.md Show resolved Hide resolved
text/0195-kubernetes-resource-detection.md Show resolved Hide resolved

SDK's should add a Kubernetes environment variable-based resource detector. The Kubernetes resource detector and the collector's `resourcedetectionprocessor` should support the following environment variables, and map them to the following semantic conventions:

| Environment Variable | Semantic Convention |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these environment variable names will run into conflicts. Should we prefix these with OTEL_?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion. Is there precedent from other detectors?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there is a precedent. Since SDK resource detectors would need to interpret them, would you consider these SDK environment variables?

Copy link
Member

@dmitryax dmitryax Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely an instrumented application would need these vars from downward API as well, but it's very unlikely that they would expect any different value. So conflict here is rather useful, otherwise they would need to repeat all these definitions with OTEL_ prefix and without.

The question here is whether we want to support an application that uses K8S_POD_NAME to represent a different value like deployment name or something else.

If we want to read these env vars from instrumentation libraries as is, we can probably support both prefixed (OTEL_K8S_POD_NAME) and not prefixed, defined as preferred (K8S_POD_NAME) just to not block use cases like the one described above. If we are still going to use only OTEL_RESOURCE_ATTRIBUTES in instrumentation libraries and compose it with k8s.pod.name=$(K8S_POD_NAME),k8s.pod.uid=$(K8S_POD_UID), I think we shouldn't use any prefix, users always can rename them if really needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this more, I think we should switch to OTEL_ prefixed env vars. The reason isn't that someone might want opentelemetry to attach attributes that don't match your own identity. This is mostly relevant for container name (e.g. sidecar), but I could also imagine it for pod or even namespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dashpole do you mean that OTEL_K8S_CONTAINER_NAME env var added to one container would mean something else other than a name of the container which it's added to, like name of an otel sidecar when added to an application container?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thats what I mean. container name isn't part of this proposal, but would be a likely future extension of this.

Copy link
Member

@dmitryax dmitryax Feb 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO in that case meaning of OTEL_K8S_CONTAINER_NAME becomes a bit confusing and inconsistent with other OTEL_K8S_* env vars. I, as a user, would expect OTEL_K8S_CONTAINER_NAME to have a value of the container where env var is defined, giving that other OTEL_K8S_* have values of other k8s objects where the env var is defined.

I think we should use another name for the sidecar use case, not OTEL_K8S_CONTAINER_NAME

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think multiple env vars for the same semantic convention would work, as resource detectors will ultimately have to choose one of the two to set in that resource attribute.

If we named it without a prefix, I'm not sure that would stop users from treating the env vars as a way to set attribute X in OTEL, even if we would prefer they use another mechanism in that scenario. We just risk introducing naming conflicts if we don't prefix.

text/0195-kubernetes-resource-detection.md Show resolved Hide resolved
Copy link
Member

@pavolloffay pavolloffay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This proposal is related to open-telemetry/opentelemetry-specification#2135 which proposes support for OTEL_RESOURCE_ATTRIBUTES_* env var. These env vars are added to OTEL_RESOURCE_ATTRIBUTES`

e.g.

OTEL_RESOURCE_ATTRIBUTES_K8S_POD_NAME=myapp
OTEL_RESOURCE_ATTRIBUTES=foo=bar

the final value of OTEL_RESOURCE_ATTRIBUTES would be foo=bar,k8s_pod_name=myapp

text/0195-kubernetes-resource-detection.md Show resolved Hide resolved
@bogdandrutu
Copy link
Member

@dashpole very nice OTEP, please consider to also refer to that proposal of supporting OTEL_RESOURCE_ATTRIBUTES_*

@dashpole
Copy link
Contributor Author

dashpole commented Feb 4, 2022

please consider to also refer to that proposal of supporting OTEL_RESOURCE_ATTRIBUTES_*

Done.

@seh
Copy link

seh commented May 6, 2022

On the subject of the difficulty of knitting together several environment variables into a final "OTEL_RESOURCE_ATTRIBUTES" variable, please see open-telemetry/opentelemetry-specification#1982.

Here's how I handled this in a kustomization I wrote recently, with the following in the "base" manifest for a Deployment:

- name: _OTEL_RESOURCES_ATTRIBUTES_UNDERLAY
  value: ""
- name: _OTEL_RESOURCES_ATTRIBUTES_OVERLAY
  value: ""
- name: OTEL_RESOURCE_ATTRIBUTES
  value: >-
    $(_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY)
    k8s.container.name=my-thing,
    k8s.deployment.name=something,
    k8s.namespace.name=$(POD_NAMESPACE),
    k8s.node.name=$(NODE_NAME),
    k8s.pod.name=$(POD_NAME),
    k8s.pod.primary_ip_address=$(POD_IP_ADDRESS),
    k8s.pod.service_account.name=$(POD_SERVICE_ACCOUNT_NAME),
    k8s.pod.uid=$(POD_UID),
    net.host.ip=$(NODE_IP_ADDRESS)
    $(_OTEL_RESOURCES_ATTRIBUTES_OVERLAY)

Note the two empty "_OTEL_RESOURCES_ATTRIBUTES_UNDERLAY" and "_OTEL_RESOURCES_ATTRIBUTES_OVERLAY" variables. In overlay kustomizations, I can populate either or both of those, but I have to be careful to terminate the former with a trailing comma and prefix the latter with a leading comma.

@dashpole
Copy link
Contributor Author

dashpole commented May 6, 2022

Thanks @seh, i've added that to the drawbacks of using dependent environment variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.