Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add semantic conventions for instrumenting AWS Lambda. #1442

Merged
merged 25 commits into from
Mar 24, 2021

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Feb 16, 2021

Changes

Provides direction for instrumenting AWS Lambda request handlers to generate faas spans. There aren't any new attributes defined, just explanation of how to fill them as well as concerns about parenting, etc. This is mostly based on the Java implementation but is in no ways set in stone.

/cc @kubawach @wangzlei

@anuraaga anuraaga requested review from a team February 16, 2021 07:05
examples:
- pubsub
- ref: messaging.system
brief: "MUST be `AmazonSQS`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern seems commonly useful enough that special support in the semantic convention generator would be nice as in fixed_value: "AmazonSQS".

The parent of the span MUST be determined by considering both the environment and any headers or attributes
available from the event.

If the `_X_AMZN_TRACE_ID` environment variable is set, it SHOULD be parsed into an OpenTelemetry `Context` using
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the application owner should be able to decide this. Ideally, the X-Ray propagator (or a specialized LambdaXrayPropagator) would be written in such a way that it checks the environment variable itself. The instrumentation should not decide this.

Copy link
Contributor Author

@anuraaga anuraaga Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user can currently decide by enabling or disabling XRay so I think this reflects that choice well. If we had another option in the instrumentation it's two settings related to XRay which I think is just an extra setting.

It reminds me that in a follow-up to the SDK PR I need to describe using the AWS propagator directly on the HTTP calls as we do in Java. It's the only recognized format for the next years so there is no point for an option at least until that changes it's just an extra option. I think it's similar for Lambda.

semantic_conventions/trace/instrumentation/aws-lambda.yaml Outdated Show resolved Hide resolved
semantic_conventions/trace/instrumentation/aws-lambda.yaml Outdated Show resolved Hide resolved
semantic_conventions/trace/instrumentation/aws-lambda.yaml Outdated Show resolved Hide resolved
@@ -0,0 +1,104 @@
# Instrumenting AWS Lambda
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document doesn't mention its relationship to the FaaS spec, https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/trace/faas.yaml. Is it extending it or overriding it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the first paragraph clarify this, including an actual link to faas? Let me know if something's unclear.

brief: "The value of the AWS Request ID from the Lambda `Context`."
examples:
- 943ad105-7543-11e6-a9ac-65e093327849
- ref: faas.id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the attributes here are not Lambda specific, can we extend the https://github.com/open-telemetry/opentelemetry-specification/blob/main/semantic_conventions/trace/faas.yaml instead? If not, maybe we should namespace them differently. For example, "aws.lambda.id", "aws.lambda.execution", etc.

Copy link
Member

@Oberon00 Oberon00 Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was the intention of @anuraaga. FaaS conventions IMHO sufficiently describe the common lambda attributes (e.g. it tells you that the ARN is the faas.id in the faas' note already). But this document describes in more detail how to apply existing semantic conventions to Lambda. So it is IMHO right to refer to (note that ref is used here instead of introducing a new attribute with id) the existing, fitting attributes.

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated

@@ -0,0 +1,104 @@
# Instrumenting AWS Lambda
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the first paragraph clarify this, including an actual link to faas? Let me know if something's unclear.

@anuraaga
Copy link
Contributor Author

anuraaga commented Mar 1, 2021

Can anyone give this another look? Thanks!

@Oberon00 Oberon00 added area:semantic-conventions Related to semantic conventions spec:trace Related to the specification/trace directory spec:resource Related to the specification/resource directory labels Mar 1, 2021
@Oberon00
Copy link
Member

Oberon00 commented Mar 1, 2021

I wonder if using the semantic convention generator here is more confusing than useful. Just writing a plain markdown file and explaining (without listing every single attribute) which conventions to use in which combination for Lambda could be enough IMHO.

@anuraaga
Copy link
Contributor Author

anuraaga commented Mar 3, 2021

@Oberon00 Tried inlining into the markdown, how does it look?

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-Blocking comments:

I love the idea instrumentation conventions!

  1. Can you provide a link to the java implementation in the PR header so it shows up in the commit record?
  2. I see the implementation has "agent" and "library" usage. Do you envision "library" being ubiquitously provided across OTel languages and "agent' when applicable?

Asking for two reasons:

  • Likely Cloud Run / CloudFunctions should look very similar to what you've done here (baring the nuance/details of our environment).
  • I think you may be paving the way for how the "instrumentation" work @tedsuo calls out in the OTel timeline can make progress.

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the implementation has "agent" and "library" usage. Do you envision "library" being ubiquitously provided across OTel languages and "agent' when applicable?

Yup I think this is what we'd expect. For Java, wherever possible we are using the "agent" module only to initialize the "library" module but not itself define any instrumentation. Most languages won't have the concept of an agent though, these conventions definitely refer to library instrumentation.

I think you may be paving the way for how the "instrumentation" work @tedsuo calls out in the OTel timeline can make progress.

Cool - yeah I'm thinking we need these instrumentation "manuals" or else languages will end up diverging, losing a lot of the value of these specc'd out conventions. Some libraries are pretty easy, thinking HTTP clients like Okhttp, and may not need manuals, but many have corner cases to deal with. And very much hoping to avoid this...

@anuraaga
Copy link
Contributor Author

anuraaga commented Mar 9, 2021

@open-telemetry/technical-committee Any suggestions on how to proceed to get this merged?

@arminru
Copy link
Member

arminru commented Mar 11, 2021

@anuraaga It would be great if you could get a ✔️ from one of our other AWS experts like @rakyll and @alolita, so that the @open-telemetry/specs-approvers know everything is sound from your side. Also please make sure that all conversation are resolved, there are quite a few still open. Thanks!

@anuraaga
Copy link
Contributor Author

@arminru Also added some examples

Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, that makes sense. It is annoying that we are not able to safely transport arbitrary key-value headers (e.g. standard W3C traceparent), but since that is an issue of the AWS platform I won't blame these semantic conventions.

@anuraaga
Copy link
Contributor Author

Resolved merge conflicts :)

Copy link
Member

@arminru arminru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @anuraaga!

Copy link
Contributor Author

@anuraaga anuraaga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @arminru updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory spec:trace Related to the specification/trace directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants