Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS SDK Instrumentation should use AWS SDK values for rpc attributes #160

Closed
NathanielRN opened this issue Jul 27, 2021 · 14 comments
Closed

Comments

@NathanielRN
Copy link
Contributor

Description

In the aws sdk instrumentation package, rpc attributes are set from the normalizedRequest.

export const extractAttributesFromNormalizedRequest = (normalizedRequest: NormalizedRequest): SpanAttributes => {
return {
[SemanticAttributes.RPC_SYSTEM]: 'aws-api',
[SemanticAttributes.RPC_METHOD]: normalizedRequest.commandName,
[SemanticAttributes.RPC_SERVICE]: normalizedRequest.serviceName,
[AttributeNames.AWS_REGION]: normalizedRequest.region,
};
};

normalizedRequest (v2 and v3) modify the request metadata values returned from the AWS SDK like this:

export const normalizeV2Request = (awsV2Request: Request<any, any>): NormalizedRequest => {
const service = (awsV2Request as any)?.service;
return {
serviceName: service?.serviceIdentifier?.toLowerCase(),
commandName: toCamelCase((awsV2Request as any).operation),
commandInput: (awsV2Request as any).params,
region: service?.config?.region,
};
};

Specifically:

  • It converts the service name .toLowerCase()
  • It converts the operation name .toCamelCase()

However, the specs for AWS SDK instrumentation give examples that contradict these patterns.

- ref: rpc.service
brief: "The name of the service to which a request is made, as returned by the AWS SDK."
examples:
    - DynamoDB
    - S3
- ref: rpc.method
brief: "The name of the operation corresponding to the request, as returned by the AWS SDK"
examples:
    - GetItem
    - PutItem

Action Items

Based on the examples, we should return S3 instead of s3 and ListBuckets instead of listBuckets (for example).

This means using toUpperCase() and a modified version of toCamelCase() with the first character Upper Case.

@blumamir
Copy link
Contributor

You are right
Appreciate the research and the time you invested to document this issue.

This means using toUpperCase()

toUpperCase will probably work only S3 but not for DynamoDB.
The SDK has few possible names for the same service. For example in dynamo in aws-sdk v2:

  "metadata": {
    "apiVersion": "2011-12-05",
    "endpointPrefix": "dynamodb",
    "jsonVersion": "1.0",
    "protocol": "json",
    "serviceAbbreviation": "DynamoDB",
    "serviceFullName": "Amazon DynamoDB",
    "serviceId": "DynamoDB",
    "signatureVersion": "v4",
    "targetPrefix": "DynamoDB_20111205",
    "uid": "dynamodb-2011-12-05"
  },

Looks like in v2 we can use the serviceId, but not sure if this rule will work for all the services. For example, for Route53, this value equals: "serviceId": "Route 53", (with a space). For v3 we sometimes need to extract it from the client name here due to some technical difficulties in how the code is built.

Do you know if serviceId is something that is well defined in AWS? If so, maybe the spec can be changed to state that this value should be used for rpc.service. Current wording is not very clear IMO: "The name of the service to which a request is made, as returned by the AWS SDK.".

What do you think?

@blumamir
Copy link
Contributor

There is a conversation about this issue exactly in the PR which added it to the spec, it looks like we should use the serviceIdentifier for rcp.service.

Copying the conversation here for reference:

Oberon00 on Feb 24 Member

as returned by the AWS SDK

That seems a bit strange to me. Is the name really returned by the SDK? Also, do we need some normalization here across different languages (e.g. Python might call it create_key, Java createKey and the AWS HTTP API reference CreateKey)?

anuraaga on Feb 24 •
edited
Author Member

Hmm - so the intent of my wording here was precisely to avoid the normalization. I think all AWS SDKs (at least should) return in a structured way the AWS service name and operation name, which correspond to the actual HTTP API docs and are of fixed case, no need to normalize.

But as with everything, I found one language out of the four I just checked where maybe not :(

Java
https://github.com/aws/aws-xray-sdk-java/blob/master/aws-xray-recorder-sdk-aws-sdk-v2/src/main/java/com/amazonaws/xray/interceptors/TracingInterceptor.java#L123
SERVICE_NAME, OPERATION_NAME

Go
https://github.com/aws/aws-xray-sdk-go/blob/master/xray/aws.go#L351
ServiceName, Operation.Name

JS
https://github.com/aws/aws-xray-sdk-node/blob/master/packages/core/lib/patchers/aws_p.js#L35
https://github.com/aws/aws-xray-sdk-node/blob/master/packages/core/lib/patchers/aws_p.js#L63
serviceIdentifier, operation

Python
https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/botocore/patch.py#L31
https://github.com/aws/aws-xray-sdk-python/blob/master/aws_xray_sdk/ext/boto_utils.py#L42
endpointPrefix, operation_name (or just positional argument for some cases)

Given there does seem to be reasonable precedence for calling service / operation I think what I have here may be ok, but open to any suggestion.

👍 1

Oberon00 on Feb 24 Member

OK, so there is runtime information about the called service name/operation, I didn't know that. Then this seems fine to me.

@NathanielRN NathanielRN changed the title AWS SDK Instrumentation should use AWS SDK valus for rpc attributes AWS SDK Instrumentation should use AWS SDK values for rpc attributes Jul 28, 2021
@NathanielRN
Copy link
Contributor Author

Thank you for your thoughtful reply! 😄 Thanks for posting that thread, yes I think using serviceId should be the right way to go.

And then for operation we can just switch the first letter to be toUpperCase()? I didn't see it in your metadata json blob and I don't see it in the awsV2Request when I print it out:

const service = (awsV2Request as any)?.service;

But hopefully that small change wouldn't break anyone using this instrumentation 🙂

@NathanielRN
Copy link
Contributor Author

@blumamir I made a pull request to update these rpc values, please let me know once you have a chance to look at it! 😄

@Oberon00
Copy link

Oberon00 commented Aug 4, 2021

And then for operation we can just switch the first letter to be toUpperCase()?

According to the cited thread you should be able to retrieve the operation name from the metadata. I would create a spec issue, asking for clarification. My gut feeling is that in doubt you should rather report what you have as-is than doing any string manipulation.

@blumamir
Copy link
Contributor

blumamir commented Aug 4, 2021

@NathanielRN thanks, I commented on the PR. just fix the tests and I think it's good to go

@blumamir
Copy link
Contributor

blumamir commented Aug 4, 2021

According to the cited thread you should be able to retrieve the operation name from the metadata. I would create a spec issue, asking for clarification. My gut feeling is that in doubt you should rather report what you have as-is than doing any string manipulation.

Thank you @Oberon00 for taking the time to read and comment here. I agree with your comment in general. Unfortunately, for aws-sdk js the situation is very complex:

in v2, the operation name is set with PascalCase in the apis (see example) but then later on changed to camelCase somewhere in the code which is what is accessible to the patching code.
in v3, both the service name and the method are not accessible at all! there is simply no way to retrieve them other than extracting them from the class name or digging into some very deep private attributes which are not structured or documented.

I think that the sentence "as returned by the AWS SDK" is simply not valid for JS sdk specifically.

Since the spec is (deliberately?) blurry about what value should be set for these attributes, I'm ok with applying common sense and produce values (i.e PascalCase) that are consistent between v2, v3, the generic API reference, and the examples in otel spec. However, I will be happy if someone wants to take this issue, research all the complexity, and lead a discussion in the spec repo or anywhere else. I can help with such effort but it's too time-consuming for me to lead it myself.

@Oberon00
Copy link

Oberon00 commented Aug 4, 2021

@anuraaga please take a look, it seems like the spec could use some improvement in that area.

@NathanielRN
Copy link
Contributor Author

Maybe instead of saying "as returned by the AWS SDK", the spec can explicitly point to a table of the expected service names? That way the spec does not have to tell the SDKs how to set that value, only what value it needs to set.

However I don't think I would suggest that for operation name, because that table would be much, much, larger.

@blumamir
Copy link
Contributor

blumamir commented Aug 4, 2021

However I don't think I would suggest that for operation name, because that table would be much, much, larger.

The spec can state the values should always be PascalCase to match the way they are documented in the api reference

@anuraaga
Copy link

anuraaga commented Aug 5, 2021

Recently there has been convergence of AWS API definitions on Smithy. I think the field in the Smithy model that matches what we want here is the cloudFormationName

https://awslabs.github.io/smithy/1.0/spec/aws/aws-core.html#cloudformationname

It's generally the pascal space without spaces I think.

I think we at AWS need to start an internal discussion with the AWS SDK team to ensure this field is available in any smithy-based SDK (JS v3, Go v2, Rust currently) at the least, for use in observability.

So far I haven't found an authoritative source for the models, each SDK seems to vendor them in

https://github.com/aws/aws-sdk-js-v3/tree/main/codegen/sdk-codegen/aws-models
https://github.com/aws/aws-sdk-go-v2/tree/main/codegen/sdk-codegen/aws-models

So referencing those for the value within the spec is one approach. It looks too tedious though, so will check with the SDKs team if we can simplify this.

@Oberon00
Copy link

Oberon00 commented Aug 5, 2021

so will check with the SDKs team if we can simplify this.

@anuraaga And in the meantime, should we update the spec to require normalization to PascalCase, or clarify that the raw value should be used even if different languages may have different results in the meantime? Or do we leave it unclear until you got feedback from the SDK teams?

@anuraaga
Copy link

anuraaga commented Aug 5, 2021

@Oberon00 Sure I filed open-telemetry/opentelemetry-specification#1851 to provide a fallback for now.

@blumamir
Copy link
Contributor

blumamir commented Aug 8, 2021

@anuraaga @Oberon00 - thanks for bringing it up and creating the relevant fix in the spec.

After the spec update, I believe that we are spec compliant in regards to rpc attributes.

Let's wait for the spec PR to be merged and then we can close this issue, right? unless someone spots something else that needs our attention

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants