Experimental Semantic Convention Attribute Registration #224

dyladan · 2023-01-13T20:58:16Z

TODOs

Add lifecycle for attributes
Specify what the dictionary itself actually contains

yurishkuro

+1 overall

text/0224-experimental-semantic-conventions.md

yurishkuro · 2023-01-14T00:08:34Z

text/0224-experimental-semantic-conventions.md

+This proposal SHOULD also apply to any such components at the discretion of the TC.
+
+A procedure will be established by the technical committee (TC), or an entity designated by the TC, for registration of provisional semantic convention attributes.
+Any non-private attributes which can be registered SHOULD be registered in the provisional registry or an official OpenTelemetry semantic convention.


I am somewhat uncomfortable with the vagueness here. What, realistically, is going to be an implementation of a registry? Are we going to maintain an online database with an API? Are we going to bifurcate the "registration" process of the official and provisional attributes?

Why not just make a call right now and say that provisional attributes (which I think is a better name than "registered") are going to use the exact same mechanism of registration as the official attributes, i.e. the semantic conventions yaml file in OTEL repo. And we need to decide if it's going to be a separate repo or a separate directory. Or should this be even the same directory and just an extra label in the yam definition?

A related topic is that provisional attributes should also be subject to the OpenTelemetry Schemas versioning mechanism (which also suggests we should not bifurcate the registration process).

We already have the concept of "experimental" and "stable" semconv. It was my intention that if the TC wanted to, this OTEP could be interpreted to mean simply that the bar is drastically lowered for "experimental" semantic convention. In that interpretation there would be no difference between experimental and provisional.

I didn't want to specify that here, because I didn't know if the TC would prefer to keep the provisional registry separate as it may grow very large very quickly.

Practically, there would need to be a YAML file of sorts, the same we have for semantic conventions.

I wanted to chime on this one, although I haven't had a chance to solidify my thoughts. Effectively, such a registry of "provisional" conventions is an inevitability. The question is whether we limit this to an exclusive group of "expert committees" like we're doing for HTTP, or if we want to allow more organic growth (like what @dyladan and I propose here).

I think currently, we're using expert committees for OTEL because of the following reasons:

There's a lot of high-value, high-impact conventions for (HTTP, Databases, Message-Passing, e.g.) observability.

We CAN draw an expert committee

The cost of failure is high.

Once we whittle down the big hitters, we have the issue of the "long-tail" of observability and our ability to attract enough experts and bandwidth. A process like this allows us to rapidly expand to help o11y users.

I guess I'm suggesting I think the need for this registry is inevitable, but it could also be "not now". I.e. we may want to first ensure we have a solid and rich core offering of conventions before we solve long-tail issues.

This registry does solve the issue of "How do OpenTelemetry instrumentation authors create instrumentation that's not part of the semantic conventions". However, this registry is also, in my view, the only possible path for semconv stability and also one that relies on having semconv be widely adopted, successful standard already, with meaningful starting point (like HTTP).

I.e. my current thinking is:

We should enable this registry, after we've solidified semantic conventions for a few key areas.

We probably need a different solution for "How do OpenTelemetry instrumentation authors create instrumentation that's not part of the semantic conventions"

If the answer is "not now" we could easily simplify this to a policy change where we allow/encourage contrib and community authors to experiment with instrumentation even if it isn't covered by semantic conventions. The end result should hopefully be the same with the best solutions bubbling to the top and becoming the most popular. It would mean different instrumentations for the same module might not export data the same way, but maybe that sort of competition is healthy.

I believe the newly limited scope of the proposal to an attribute dictionary should address this concern. Please resolve this conversation if you agree.

yurishkuro · 2023-01-14T00:16:26Z

text/0224-experimental-semantic-conventions.md

@@ -0,0 +1,115 @@
+# Allowing experimental extensions to Semantic Conventions
+
+A process that allows experimental / non-stable semantic conventions to exist within stable versions and how they evolve into stable semantic conventions.


Nit: in the below comments I used the term "provisional arguments", but I think it would be better to pick a single term and use it consistently. Is there a semantic difference between provisional and experimental?

I believe the newly limited scope of the proposal to an attribute dictionary should address this concern. Please resolve this conversation if you agree.

lmolkova

Attribute registration should help with the discoverability and reuse of attributes between conventions, but it is a separate concern from semantic convention as a whole.

I suggest to reflect and clarify it in the document

text/0224-experimental-semantic-conventions.md

dyladan · 2023-07-24T16:45:47Z

I had been waiting for the spec->semconv transition to get back on this. I'll update this OTEP this week with feedback received.

dyladan · 2023-07-26T03:39:36Z

I limited the scope of the proposal to just an attribute dictionary to address concerns raised by @lmolkova and others. I believe 99% of the value of the proposal is retained simply by allowing instrumentation authors to use attributes that are not already included in a semantic convention. If it goes well, maybe the idea will be extended to other areas of semantic conventions in the future.

lmolkova · 2023-07-26T05:58:36Z

text/0224-experimental-semantic-conventions.md

+They SHOULD choose attribute keys that are descriptive, relevant, and that they have reason to believe are not currently in use or already proposed.
+Before creating a new attribute, the attribute author SHOULD review the attribute dictionary to see if an attribute already exists which fits the intended use case.
+If such a registered attribute exists, then it SHOULD be preferred over the creation of new attributes unless there is sufficient reason to create a new attribute.
+Any new attributes SHOULD be registered in the attribute dictionary before an instrumentation using the attribute is released.


is there any reason or way to control non-otel-owned instrumentations? For example, an internal customer library that instruments some technology might want to invent db.document_id - there is no harm in this and no way for us to control/enforce things.

So the proposal here is to use normative language for OpenTelemetry-authored instrumentation only. And we already have soft recommendations in the spec for everyone else - https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/common/attribute-naming.md#recommendations-for-application-developers

The idea is to prevent collisions. If some instrumentation starts emitting db.my_attribute but doesn't register it, another instrumentation may emit the same attribute. I think the blanket normative language is fine. Carving exemptions into the language for things we don't control seems unnecessary to me.

The concern here is that deprecation is not clearly specified in the document. The only thing there is

If a registered attribute is deemed by the TC to be harmful in some way, it SHOULD be marked deprecated, sufficient reasoning for the deprecation should be included in the dictionary

Documenting a list of proposed attribute lifecycle stages would resolve my concern.

lmolkova · 2023-07-26T06:00:31Z

text/0224-experimental-semantic-conventions.md

+If such a registered attribute exists, then it SHOULD be preferred over the creation of new attributes unless there is sufficient reason to create a new attribute.
+Any new attributes SHOULD be registered in the attribute dictionary before an instrumentation using the attribute is released.
+If an instrumentation uses an attribute not included in the semantic conventions it is recommended that the attribute not be collected by default unless it is fundamental to the instrumentation.
+For example, an instrumentation which instruments a component not yet addressed by semantic conventions.


Suggested change

For example, an instrumentation which instruments a component not yet addressed by semantic conventions.

It probably makes sense to turn the whole instrumentation on/off for such a component, not individual attributes, but then it goes out of the scope of this OTEP

This is meant to be an example of when you would not turn off the attribute. I'll try to rework the wording to make that more clear.

lmolkova · 2023-07-26T06:07:28Z

text/0224-experimental-semantic-conventions.md

+Some examples of reasonable grounds for refusal: frivolous use of the dictionary such as registering nonsense words or inside jokes, a credible security or privacy risk, conflict with another attribute or proposal for a new attribute, or that the proposal is sufficiently lacking in purpose, or misleading about its purpose, that it can be held to be a waste of time and effort.
+This list is non-exhaustive and final decision for refusal is at the discretion of the TC.
+
+Once an attribute is registered it MUST NOT receive any breaking changes.


We need some time to experiment with attributes before they stabilize and we already have experimental and stable maturity levels defined in the build tools.
Can we define them explicitly here and allow experimental attributes to get breaking changes?

I really don't want that. I would much prefer a new attribute to be registered with a new name if a breaking change really needs to be made. Once an attribute is "in the wild" it should be treated as stable. If we register experimental attributes we're going to end up in the same situation we are now with the semconv where they sit for years in experimental then one day get broken and users are broken.

What if instead I update the language such that an attribute in early experimentation phase should not be registered until the author is confident the format will not have a breaking change?

see my other comment below - we renamed attributes a lot in the past (and made other breaking changes to them), there is no reason to believe we would not continue this.

What if instead I update the language such that an attribute in early experimentation phase should not be registered until the author is confident the format will not have a breaking change?

then we need to allow experimental semconv to use their own, non-registered attributes.

then we need to allow experimental semconv to use their own, non-registered attributes.

My concern with that is that we'll end up with a sort of chaotic state where parts of attributes are registered and others remain with their definition in the semconv for a long time. So the dictionary is then not really a true dictionary because it will miss many attributes.

My concern with that is that we'll end up with a sort of chaotic state where parts of attributes are registered and others remain with their definition in the semconv for a long time.

That is exactly what the registry is to prevent and why the barrier to entry is so low.

then we need to allow experimental semconv to use their own, non-registered attributes.

I'd like to prevent that as much as we can.

see my other comment below - we renamed attributes a lot in the past (and made other breaking changes to them), there is no reason to believe we would not continue this.

If an attribute needs to be renamed, a new attribute can simply be registered and the old one deprecated. The new attribute is then used in the semantic convention. I'm not sure what the concern is.

The concern here is that deprecation is not clearly specified in the document. The only thing there is

If a registered attribute is deemed by the TC to be harmful in some way, it SHOULD be marked deprecated, sufficient reasoning for the deprecation should be included in the dictionary

Documenting a list of proposed attribute lifecycle stages would resolve my concern.

I'll add a lifecycle section

lmolkova · 2023-07-26T06:08:46Z

text/0224-experimental-semantic-conventions.md

+This document explicitly does NOT define a process for removal of a registered attribute from the attribute dictionary.
+Because absence of evidence does not constitute evidence of absence, there is no way to know if any registered attribute is in active use.
+If a registered attribute is deemed by the TC to be harmful in some way, it SHOULD be marked deprecated, sufficient reasoning for the deprecation should be included in the dictionary, and, if appropriate, a new attribute SHOULD be registered with a different name.
+A registered attribute MAY ONLY be removed if, at the discretion of the TC, it is determined to be harmful or offensive to the project, its contributors, users, or some other group.


I suggest applying this requirement to stable attributes only

I don't want to distinguish between stable and experimental attributes. Any attribute in use should be stable or we will be breaking users, and any registered attribute should be assumed to be in use as we have no way to prove otherwise.

based on our experience in HTTP semconv, we renamed certain attributes a couple of times before we were happy with them.

having http.method, and http.request.method, and http.request_method_canonical meaning the same thing with slight variations in definition would be confusing and harmful.

Also, based on this proposal semantic conventions cannot use attributes outside of the registry, which means that, however premature new semconv is, attributes get into the registry first, before prototyping/development of semconv is complete. It's unrealistic to expect the first version of an attribute to be the final one.

So we either need

a separate experimental dictionary

an experimental level for attributes

allow experimental semconv define their own attributes within that semconv and only move attributes into the registry once semconv is in the feature-freeze/stable

[update] or a deprecation story for renamed attributes - no new semconv/apps should use them and all semconv using them should update to a new attribute

I don't want to distinguish between stable and experimental attributes. Any attribute in use should be stable or we will be breaking users, and any registered attribute should be assumed to be in use as we have no way to prove otherwise.

I agree with @lmolkova here. Regarding breaking users by breaking attributes: how would it be different if we deprecate attributes in favor of introducing new ones? Effectively, this is a breaking change as well, because instrumentations will stop populating the deprecated attributes at some point in time (otherwise they would need to populate redundant attributes forever).

Also, as I wrote below, I think we need a way to indicate for user how much they can rely on the stability of attributes. Some attributes can be rather stable (where breaking changes are very unlikely) other are intentionally not stable as they have been created for "experimentation purposes". But, how would (non-expert) semconv users know what applies to which attribute?

That's why I'm in favor of a stability level for attributes, too.

In ECS we have a stability level per namespace and per field level. See this example.

Also, as I wrote below, I think we need a way to indicate for user how much they can rely on the stability of attributes.

Great point! It means that stable -> deprecated is not a good lifecycle for the attribute, we still need the experimentation stage to communicate to users if they can really depend on the attribute

I agree with @lmolkova here. Regarding breaking users by breaking attributes: how would it be different if we deprecate attributes in favor of introducing new ones? Effectively, this is a breaking change as well, because instrumentations will stop populating the deprecated attributes at some point in time (otherwise they would need to populate redundant attributes forever).

It is different because when you are interpreting your data you can still look up an old attribute in the dictionary and find out its meaning. If the definition is changed or it is renamed without keeping the old name around somewhere, this information is simply lost. The instrumentation changing to use the new attribute is a break yes, but the attribute itself should remain stable even if it is going out of use.

I think an attribute-specific experimental phase is reasonable, but I think it would be better if previous versions were kept around in some way so that older data can be interpreted. What would you think if we had an attribute-specific version? Anyone can register an experimental 0.x attribute, but only semconv maintainers can promote attributes to 1.x?

to clarify: could the experimental attributes be removed or deprecated when they are renamed?

E.g. we used to have http.method and renamed http.request.method. http.method should stay so that code generation does not break code that relies on it, but we want everyone to now to use http.request.method. Marking http.method as deprecated would be a good way to communicate to users and semconv authors that they should not use it.

yes that is the idea. if that isn't clear from the text I can update it?

AlexanderWert

LGTM.

@dyladan WDYT about including a proposal on the process HOW we want to achieve that (similar to what I started outlining in this comment)?

AlexanderWert · 2023-07-26T13:09:39Z

text/0224-experimental-attrribute-registration.md

+
+A semantic convention attribute dictionary will be established which is maintained in a separate file or directory from the semantic conventions, seeded with all existing semantic convention attributes.
+All semantic conventions MUST refer to the attribute dictionary for all attributes used by that semantic convention.
+Semantic conventions MUST NOT establish an attribute not in the attribute dictionary without explicit approval of the TC with the reasoning documented directly in that convention.


How about we even forbid it technically (e.g. by adopting the yaml syntax for defining vs. using attributes in semantic conventions)?

AlexanderWert · 2023-07-27T06:21:11Z

text/0224-experimental-attrribute-registration.md

+Some examples of reasonable grounds for refusal: frivolous use of the dictionary such as registering nonsense words or inside jokes, a credible security or privacy risk, conflict with another attribute or proposal for a new attribute, or that the proposal is sufficiently lacking in purpose, or misleading about its purpose, that it can be held to be a waste of time and effort.
+This list is non-exhaustive and final decision for refusal is at the discretion of the TC.
+
+Once an attribute is registered it MUST NOT receive any breaking changes.


Isn't the opposite the goal here? The motivation states that with this proposal we want to make it easier to experiment. But, with forbidding breaking changes on attributes registered in the dictionary we wouldn't it have the opposite effect?

At the same time I see the need for users of the attributes (e.g. semconv authors, instrumentation authors, etc.) to know how much they can rely on the stability of attributes they would use.

How about having stability levels on attributes (similar to what we have with semantic conventions today)?
This is also how we handle it in ECS.

I want instrumentation authors to easily answer the question "how do I capture this piece of data that isn't covered by semconv?" As JS maintainer it is one of the most common questions I get and we are constantly being asked to add attributes which are not in semconv. Those users need telemetry stability just as much as any other. I consider their entries to be experiments, but if they are failed experiments it doesn't mean the attribute should go away or be broken, but a new attribute created taking into account the learnings of experience.

If you feel your issue is resolved can you please mark it resolved (or if you don't have permission just say "resolved" so I can).

AlexanderWert · 2023-07-27T06:24:13Z

text/0224-experimental-attrribute-registration.md

+Because absence of evidence does not constitute evidence of absence, there is no way to know if any registered attribute is in active use.
+If a registered attribute is deemed by the TC to be harmful in some way, it SHOULD be marked deprecated, sufficient reasoning for the deprecation should be included in the dictionary, and, if appropriate, a new attribute SHOULD be registered with a different name.
+A registered attribute MAY ONLY be removed if, at the discretion of the TC, it is determined to be harmful or offensive to the project, its contributors, users, or some other group.
+For example, if an attribute `http.headers` is registered and it is later determined by the TC to be harmful, it should be deprecated and a new attribute such as `http.message_headers` or similar created to replace it.


OK, I see, this is sort of the answer to my question above.

pyohannes · 2023-07-27T09:28:49Z

text/0224-experimental-attrribute-registration.md

+
+## Definitions
+
+**Attribute Dictionary**: The attribute dictionary contains all semantic convention attributes, including their names and value definitions.


Currently, besides the name and value definitions, semantic conventions define a description, notes, examples, and requirement levels for attributes.

Is it intended that any of those are part of the dictionary? Or is this some enrichment that takes place in the semantic conventions?

I think the attribute definition in the dictionary should cover the following:

name (required)

type (required)

description (required)

examples (required)

notes (optional)

stability (depending on what we decide on the above discussion)

requirement_level, sampling relevance and tag must not be part of the definition but will be specified on the usage of the attributes. And usages should be abled to overwrite all of the above (except for name and type)

Makes sense.

I think the OTEP should me more specific about this and give a definition of what the dictionary contains. It's important for understanding and working out how the attribute dictionary and semantic conventions interact, which I think is crucial.

i'll update the otep to add more specificity

I'm going to add this but I'm going to make it a "suggested" set of fields. If there is some concern during implementation that comes up I don't want the OTEP to disallow some flexibility to solve problems.

text/0224-experimental-attrribute-registration.md

AlexanderWert · 2023-08-10T06:27:55Z

@dyladan Here's a draft PR, proposing initial thoughts on how the yaml syntax for attribute definitions could look like.

open-telemetry/build-tools#191

I'd appreciate your thoughts on that! Maybe we could even add this / similar yaml syntax spec as a potential example how it could look like into this proposal?

dyladan · 2023-11-02T12:07:05Z

After discussion with @jsuereth we have decided to close this OTEP. A lot of what this OTEP was asking for is already implemented. Specifically, the Attributes Registry replaces the Attribute Dictionary mechanism described here.

We still believe that the core issue of lowering the barrier to create a new entry in the Attribute Dictionary is important, but believe that this OTEP is no longer the way to move that idea forward.

dyladan requested a review from a team January 13, 2023 20:58

Propose procedure for experimental semantic conventions

2eec904

dyladan force-pushed the experimental-semconv branch from 4f3bb45 to 2eec904 Compare January 13, 2023 20:58

yurishkuro reviewed Jan 14, 2023

View reviewed changes

arminru requested review from a team January 16, 2023 18:44

lmolkova requested changes Jan 26, 2023

View reviewed changes

text/0224-experimental-semantic-conventions.md Outdated Show resolved Hide resolved

text/0224-experimental-semantic-conventions.md Outdated Show resolved Hide resolved

text/0224-experimental-semantic-conventions.md Outdated Show resolved Hide resolved

tedsuo added priority:p0 priority:p1 area:sem-conv triaged and removed priority:p0 labels Jan 30, 2023

dyladan mentioned this pull request Jul 25, 2023

Proposal: Decoupling Attribute definition from Attribute usage in models open-telemetry/semantic-conventions#197

Open

dyladan added 2 commits July 25, 2023 23:34

Limit scope to attributes

71fce3b

Merge remote-tracking branch 'origin/main' into experimental-semconv

c988af0

lmolkova reviewed Jul 26, 2023

View reviewed changes

dyladan changed the title ~~Propose procedure for experimental semantic conventions~~ Experimental Semantic Convention Attribute Registration Jul 26, 2023

Rename to match new title

aee2069

AlexanderWert approved these changes Jul 27, 2023

View reviewed changes

pyohannes reviewed Jul 27, 2023

View reviewed changes

dyladan closed this Nov 2, 2023

JamieDanielson mentioned this pull request Jul 9, 2024

Should library-specific attributes be added to semantic conventions? open-telemetry/semantic-conventions#1218

Open

AlexanderWert mentioned this pull request Sep 3, 2024

Group yaml files by root namespace instead of signal open-telemetry/semantic-conventions#1345

Merged

4 tasks

		@@ -0,0 +1,115 @@
		# Allowing experimental extensions to Semantic Conventions

		A process that allows experimental / non-stable semantic conventions to exist within stable versions and how they evolve into stable semantic conventions.


		## Definitions

		Attribute Dictionary: The attribute dictionary contains all semantic convention attributes, including their names and value definitions.

Experimental Semantic Convention Attribute Registration #224

Experimental Semantic Convention Attribute Registration #224

Conversation

dyladan commented Jan 13, 2023 • edited Loading

yurishkuro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova left a comment

Choose a reason for hiding this comment

dyladan commented Jul 24, 2023

dyladan commented Jul 26, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Aug 10, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Jul 26, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Aug 10, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

lmolkova Jul 26, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dyladan Aug 8, 2023 • edited Loading

Choose a reason for hiding this comment

lmolkova Aug 9, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexanderWert left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexanderWert Jul 27, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

AlexanderWert commented Aug 10, 2023 • edited Loading

dyladan commented Nov 2, 2023

dyladan commented Jan 13, 2023 •

edited

Loading

lmolkova Aug 10, 2023 •

edited

Loading

lmolkova Jul 26, 2023 •

edited

Loading

lmolkova Aug 10, 2023 •

edited

Loading

lmolkova Jul 26, 2023 •

edited

Loading

dyladan Aug 8, 2023 •

edited

Loading

lmolkova Aug 9, 2023 •

edited

Loading

AlexanderWert left a comment •

edited

Loading

AlexanderWert Jul 27, 2023 •

edited

Loading

AlexanderWert commented Aug 10, 2023 •

edited

Loading