Use weaver to generate latest semconv 1.27 #4690

Merged
merged 58 commits into open-telemetry:main
Aug 7, 2024

Member

@dyladan dyladan commented May 9, 2024

This is a big PR but most of it is autogenerated. Below is a list of changes:

  • Update semconv to 1.27
  • Update to weaver 0.8.0
  • Output experimental and stable separately so we can export separately
    • experimental attributes and metrics now have @experimental jsdoc tag
    • main export contains only stable attribute and metric names
    • /incubating export contains both stable and experimental names
  • Change SEMRESATTRS_ and SEMATTRS_ to just ATTR_ for attributes
  • Generate enum values as <attribute name>_VALUE_<enum value name>
    • example: NETWORK_TRANSPORT_VALUE_TCP is the value tcp for enum ATTR_NETWORK_TRANSPORT
  • Generate constants for metric names with METRIC_ prefix
  • Deprecate all old names. These files will never change again and will be removed in 2.0 if we ever release one
  • All names are constants now. Removes requirement for all the weird type stuff (sorry @MSNev I know you spent a lot of time on that)
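
A minimal sketch of what the naming scheme in the list above could look like in generated output. ATTR_NETWORK_TRANSPORT and NETWORK_TRANSPORT_VALUE_TCP come from the example in the list; the udp value, the metric constant, and the spanAttributes object are assumptions added for illustration and are not copied from the generated files. The export keyword is left off so the snippet stands alone.

```typescript
// Attribute name constant: the ATTR_ prefix replaces SEMATTRS_ / SEMRESATTRS_.
const ATTR_NETWORK_TRANSPORT = 'network.transport';

// Enum values follow <attribute name>_VALUE_<enum value name>.
const NETWORK_TRANSPORT_VALUE_TCP = 'tcp';
const NETWORK_TRANSPORT_VALUE_UDP = 'udp'; // assumed second value, for illustration

// Metric name constant with the METRIC_ prefix (illustrative name).
const METRIC_HTTP_SERVER_REQUEST_DURATION = 'http.server.request.duration';

// Because every name is a plain string constant, it can be used directly
// as an attribute key or value without any extra type machinery.
const spanAttributes: Record<string, string> = {
  [ATTR_NETWORK_TRANSPORT]: NETWORK_TRANSPORT_VALUE_TCP,
};
```

Plain constants like these can be tree-shaken individually by bundlers, which is one practical upside of dropping the older enum-like objects.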

Notes:

  • Template attributes are exported as functions. For example some.attribute.<key> is exported as a function that takes key: string and returns string, ATTR_SOME_ATTRIBUTE('my-key') => 'some.attribute.my-key'.
    • No validation is done on input
    • No transformation or normalization is done on input
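
A short sketch of the template attribute behaviour described in the notes above, using the some.attribute.<key> example from the description. The function body is an assumption about how such a constant could be written, not the generated code itself.

```typescript
// Template attribute some.attribute.<key>, modelled as a function that
// appends the caller-supplied key to the fixed prefix.
const ATTR_SOME_ATTRIBUTE = (key: string): string => `some.attribute.${key}`;

const simple = ATTR_SOME_ATTRIBUTE('my-key'); // 'some.attribute.my-key'

// No validation or normalization is applied: the key is passed through verbatim,
// so casing and whitespace end up in the attribute name unchanged.
const passedThrough = ATTR_SOME_ATTRIBUTE('My Key'); // 'some.attribute.My Key'
```

Since nothing is checked, callers are responsible for making sure the key is already in the form the semantic convention expects.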

Member Author

@dyladan dyladan left a comment

Added some quick comments to explain some of my thought process

scripts/semconv/templates/SemanticAttributes.ts.j2 (outdated, resolved)
scripts/semconv/generate.sh (outdated, resolved)
scripts/semconv/generate.sh (outdated, resolved)
scripts/semconv/.gitignore (resolved)
Member Author

dyladan commented May 9, 2024

/cc @trentm @JamieDanielson since you seemed interested in this

/cc @MSNev since you have done the most work on this recently

@dyladan dyladan marked this pull request as ready for review May 9, 2024 16:31
@dyladan dyladan requested a review from a team May 9, 2024 16:31
Contributor

trentm commented May 9, 2024

I'll take a while to review this. I'm still trying to grok the generation, the semantic-conventions/model vs schemas/... subdirs, etc. Some early Qs/thoughts:

  • I gather merging the SEMRESATTRS_ and SEMATTRS_ groups is related to the "Problem" described at "Reconsidering what semantic conventions code generation should produce" (semantic-conventions#551). A hearty +1 to not using those prefixes. Did you consider also dropping the "ATTR_" prefix? IIUC the Go semconv does not have any prefix on the exports from its semconv package. Java has namespacing of a different sort via the HttpAttributes part of import io.opentelemetry.semconv.HttpAttributes.
  • Similar to above, did you consider not having the METRIC_ prefix on metrics-related constants? (I don't see metrics-related values in open-telemetry/semantic-conventions-java.git and I'm not sure why. Does OTel Java not publish a package with metrics semconv constants?)

Correctness Qs:

  • Are you sure that the "deprecated" dirs in "semantic-conventions/model/..." handle all the deprecated values? For example http.resend_count was renamed to http.request.resend_count, but with your PR there is no deprecated HTTP_RESEND_COUNT entry.
  • http.client_ip is deprecated. There is a SEMATTRS_HTTP_CLIENT_IP but no ATTR_HTTP_CLIENT_IP, even though:
/**
 * @deprecated use ATTR_HTTP_CLIENT_IP
 */
export const SEMATTRS_HTTP_CLIENT_IP = TMP_HTTP_CLIENT_IP;

Same for SEMATTRS_DB_CASSANDRA_KEYSPACE, and I assume for others.

Contributor

trentm commented May 9, 2024

The _VALUES_ fields are using the description of the field for which they are values as their comment, e.g.:

/**
 * The language of the telemetry SDK.
 */
export const TELEMETRY_SDK_LANGUAGE_VALUES_CPP = 'cpp';

/**
 * The language of the telemetry SDK.
 */
export const TELEMETRY_SDK_LANGUAGE_VALUES_DOTNET = 'dotnet';

/**
 * The language of the telemetry SDK.
 */
export const TELEMETRY_SDK_LANGUAGE_VALUES_ERLANG = 'erlang';

/**
 * The language of the telemetry SDK.
 */
export const TELEMETRY_SDK_LANGUAGE_VALUES_GO = 'go';

Can the description (is that the "brief" yaml field?) of the value be used, instead?

Member Author

dyladan commented May 10, 2024

Did you consider also dropping the "ATTR_" prefix? ... Similar to above, did you consider not having the METRIC_ prefix

Yes I did consider that and I still would consider it if we want to go that route. It is my understanding that the semconv has decided to use a registry of unique attributes that can be applied to any signal or resource so there is no reason to differentiate them. I only kept the ATTR and METRIC prefix just to make it easier to find the value you want when autocompleting and not get confused.

Are you sure that the "deprecated" dirs in "semantic-conventions/model/..." handle all the deprecated values? For example http.resend_count was renamed to http.request.resend_count, but with your PR there is no deprecated HTTP_RESEND_COUNT entry.

I'm actually sure they're NOT all there. The deprecated.yaml didn't exist when many of these were removed and they weren't all added back. I left all the old versions in the file they were already in, so it isn't a breaking change, but I am going to add the missing attributes to the registry anyway (see open-telemetry/semantic-conventions#1025)

Can the description (is that the "brief" yaml field?) of the value be used, instead?

Good catch. I'll update the PR

Member Author

dyladan commented May 10, 2024

Can the description (is that the "brief" yaml field?) of the value be used, instead?

Good catch. I'll update the PR

Unfortunately it looks like there aren't actually descriptions on the values themselves. I think the intellisense autocomplete looks ok anyway though:

[screenshot: IntelliSense autocomplete]

Member Author

dyladan commented May 10, 2024

@trentm what about this?

[screenshot]

Member Author

dyladan commented May 10, 2024

I ended up with something like this:

/**
 * Enum value 'created' for attribute {@link ATTR_ANDROID_STATE}.
 *
 * @experimental this attribute is experimental and is subject to change in minor releases of `@opentelemetry/semantic-conventions`.
 */
export const ANDROID_STATE_VALUES_CREATED = 'created';

Which looks like this and actually links back to its parent attribute

[screenshot: IntelliSense tooltip linking to the parent attribute]

codecov bot commented May 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.04%. Comparing base (ecc88a3) to head (9bd9802).
Report is 25 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4690   +/-   ##
=======================================
  Coverage   91.04%   91.04%           
=======================================
  Files          89       89           
  Lines        1954     1954           
  Branches      416      416           
=======================================
  Hits         1779     1779           
  Misses        175      175           

Member

@JamieDanielson JamieDanielson left a comment

So far this is looking really great, thanks @dyladan and thanks @trentm for the review so far.

I like ATTR better than SEMATTRS and SEMRESATTRS and wish I had realized this sooner and commented on the original PR that introduced them. I'm not clear what the full benefit is of having those other prefixes, although it may have been more relevant before they were in the global registry.

I only kept the ATTR and METRIC prefix just to make it easier to find the value you want when autocompleting and not get confused.

I'm not sure I understand the value of having ATTR prefix for attributes but no prefix for values. In that case I'd think they could be prefixed as well, or prefix neither.

@JamieDanielson
Member

For the main export, import {} from '@opentelemetry/semantic-conventions', should ALL semconv be exported (experimental and stable), or should only the stable ones be exported, with experimental ones imported from @opentelemetry/semantic-conventions/experimental?

🤔 The benefit of keeping experimental attributes in /experimental subdirectory is that we are making it very explicit that it is an experimental attribute. I guess the downside is when the experimental attribute becomes stable, the consumer of that would have to update their code when they upgrade packages?

Member Author

dyladan commented May 13, 2024

So far this is looking really great, thanks @dyladan and thanks @trentm for the review so far.

I like ATTR better than SEMATTRS and SEMRESATTRS and wish I had realized this sooner and commented on the original PR that introduced them. I'm not clear what the full benefit is of having those other prefixes, although it may have been more relevant before they were in the global registry.

Exactly. Previously there was some chance (although probably it wouldn't have happened) that the same attribute could have been defined for different signals. I think the most reasonable way this could have happened would be for an attribute to have bounded specific values for metrics to control cardinality, but be unbounded for other signals or resources.

I only kept the ATTR and METRIC prefix just to make it easier to find the value you want when autocompleting and not get confused.

I'm not sure I understand the value of having ATTR prefix for attributes but no prefix for values. In that case I'd think they could be prefixed as well, or prefix neither.

The way we have it in this PR values have a postfix (actually an infix between the enum name and the value name). It provides separation between the enum name and the value name so it is distinguishable easily. For example, HOST_TYPE_LINUX is less obvious to me than HOST_TYPE_VALUE_LINUX where it is clear that LINUX is the value for the HOST_TYPE enum (these are fake attributes I just made up to prove a point).
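
To make the infix argument concrete, here is a rough sketch of how constant names could be derived from a dotted attribute name and an enum value name. The helper functions are hypothetical and are not the weaver templates used in this PR; they only illustrate the naming rule, using the made-up HOST_TYPE example from the comment above.

```typescript
// Hypothetical helper: turn a dotted semconv name into CONSTANT_CASE.
function toConstantCase(name: string): string {
  return name.replace(/[^A-Za-z0-9]+/g, '_').toUpperCase();
}

// Attribute constants get the ATTR_ prefix.
function attributeConstName(attribute: string): string {
  return `ATTR_${toConstantCase(attribute)}`;
}

// Enum value constants put _VALUE_ between the enum name and the value name,
// so it is visible where the attribute name ends and the value begins.
function enumValueConstName(attribute: string, valueName: string): string {
  return `${toConstantCase(attribute)}_VALUE_${toConstantCase(valueName)}`;
}

const attrName = attributeConstName('host.type'); // 'ATTR_HOST_TYPE'
const valueName = enumValueConstName('host.type', 'linux'); // 'HOST_TYPE_VALUE_LINUX'
```

Without the infix, the same inputs would collapse to HOST_TYPE_LINUX, which could just as well be read as an attribute named host.type.linux.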

@JamieDanielson
Member

For the main export, import {} from '@opentelemetry/semantic-conventions', should ALL semconv be exported (experimental and stable), or should only the stable ones be exported, with experimental ones imported from @opentelemetry/semantic-conventions/experimental?

🤔 The benefit of keeping experimental attributes in /experimental subdirectory is that we are making it very explicit that it is an experimental attribute. I guess the downside is when the experimental attribute becomes stable, the consumer of that would have to update their code when they upgrade packages?

I guess Java has a separate package for experimental attributes - there's a semconv in instrumentation-api, and a semconv in instrumentation-api-incubator. Python also has semconv in incubating separate from semconv stable. Go seems to have it all in one.

Member Author

dyladan commented May 13, 2024

🤔 The benefit of keeping experimental attributes in /experimental subdirectory is that we are making it very explicit that it is an experimental attribute. I guess the downside is when the experimental attribute becomes stable, the consumer of that would have to update their code when they upgrade packages?

This is the definition of experimental... It also would force users to at least consider if they need to make a change. If the semconv attributes you're using change it might be good to force our users to acknowledge that by changing to the stable export. If they can get all from a single export they may never notice if something is renamed/deprecated.

I guess Java has a separate package for experimental attributes - there's a semconv in instrumentation-api, and a semconv in instrumentation-api-incubator. Python also has semconv in incubating separate from semconv stable. Go seems to have it all in one.

I think a single package with multiple entry points is roughly equivalent to having separate packages and less overhead. Go has all in one but they export each version separately so you have to do something to get the new semconv version.
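
The trade-off being discussed can be sketched with two plain objects standing in for the two entry points, assuming the layout from the PR description where the main export holds only stable names and the second entry point holds both. The object names and the choice of http.request.method as the stable example are assumptions for illustration; ATTR_ANDROID_STATE is the experimental attribute quoted earlier in the thread.

```typescript
// Stand-in for the main entry point: stable names only.
const stable = {
  ATTR_HTTP_REQUEST_METHOD: 'http.request.method',
} as const;

// Stand-in for the second entry point: everything in stable plus experimental names.
const incubating = {
  ...stable,
  /** @experimental subject to change in minor releases */
  ATTR_ANDROID_STATE: 'android.state',
} as const;

// Names that are reachable only through the second entry point.
// Importing one of these is an explicit opt-in to possible breaking changes.
const stableNames = Object.keys(stable);
const experimentalOnly = Object.keys(incubating).filter(
  (name) => stableNames.indexOf(name) === -1
);
```

When an experimental name becomes stable it would also appear in the first object, and a consumer can then move the import to the main entry point, which is the acknowledgement step described above.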

Contributor

trentm commented May 13, 2024

I guess Java has a separate package for experimental attributes

My understanding of the OTel Java team's recommendations/requirements is that they do not allow a stable instrumentation package to have a dependency on the instrumentation-api-incubating package. They instead suggest the instrumentation have a copy of the experimental attributes in its own package code. This means that a user of the (non-experimental) semconv package is never broken by a semver-minor update of the package.

I guess we could get the equivalent by either (a) never using the "../experimental" entry point in stable instrumentation packages, or (b) pinning the @opentelemetry/semantic-conventions dep to a particular minor in packages that do.

I think a single package with multiple entry points is roughly equivalent to having separate packages and less overhead.

Agreed.

Go has all in one but they export each version separately so you have to do something to get the new semconv version.

This PR beat me to an attempt to update the semconv package. FWIW, I had been considering having separate entry points for each semconv version. See #4572 (comment)
I'm not advocating that option over this PR, however.

Contributor

trentm commented May 13, 2024

I ended up with something like this: [screenshot of intellisense for a _VALUES_ field]

Nice. That looks good.

I only kept the ATTR and METRIC prefix just to make it easier to find the value you want when autocompleting and not get confused.

My soft vote is for no prefixes. The way I was expecting developers to use semconv values was to (a) have a semantic-conventions document open (e.g. https://opentelemetry.io/docs/specs/semconv/http/http-metrics/) and see a string (e.g. http.server.request.duration) and (b) then want to be able to import HTTP_SERVER_REQUEST_DURATION.

IIUC, autocomplete will show ATTR_HTTP_* and METRIC_HTTP_* values when typing HTTP so I think it is fine for autocomplete either way. Having the METRIC_ does help the developer that knows they are scoped to metrics stuff. ATTR_ feels out of place for non-metrics, non-logs stuff.

Another small reason is that I like the shorter names in code.

This is a soft vote though. I don't have a very strong reaction to ATTR_.

Contributor

trentm commented May 13, 2024

The way we have it in this PR values have a postfix (actually an infix between the enum name and the value name)

I like the _VALUES_ infix.
I'm not sure if it reads better as _VALUE_ (singular).

Member

@pichlermarc pichlermarc left a comment

Looks great. Thank you for working through all this. 🙏
I installed this package into a working copy of the contrib repo and everything still works 🙂

scripts/semconv/templates/registry/stable/docstring.ts.j2 (outdated, resolved)
@pichlermarc pichlermarc added the pkg:semantic-conventions and target:next-minor-release labels Aug 7, 2024
@dyladan dyladan added this pull request to the merge queue Aug 7, 2024
Merged via the queue into open-telemetry:main with commit 01cea7c Aug 7, 2024
19 checks passed
@dyladan dyladan deleted the semconv-1.25 branch August 7, 2024 12:46
Zirak pushed a commit to Zirak/opentelemetry-js that referenced this pull request Sep 14, 2024