Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require service.instance.id and define how to generate it. #3222

Closed

Conversation

jsuereth
Copy link
Contributor

@jsuereth jsuereth commented Feb 16, 2023

Fixes ##136

Changes

  • Mark service.instance.id as required by SDKs
  • Define algorithm for SDKs to provide service.instance.id
  • Leave enough room for better service.instance.id algorithms on known platforms.

For full rationale, please read the full proposal detailed here

@jsuereth jsuereth requested review from a team February 16, 2023 20:29
Comment on lines +56 to +59
- When the SDK is running in an environment where a non-ambiguous IP
address exists, the ID should be set to this IP address.
- When the environment the SDK targets provides a stable identifier
matching the goals of `service.instance.id`, then this may be used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these are pretty vague requirements. As a maintainer, I have no idea how I would go about implementing these.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My problem is not that they are vague, but that they are quite arbitrary. E.g., why does IP address (2) have a priority over other possible stable native IDs (3)? Not to mention that with stacking of containers, the IP clearly does not meet the uniqueness requirement.

I think the definition at L38 already lays out the requirements for instance ID. The exact algorithm should not be required, but can be a recommendation (sans the IP part).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these are pretty vague requirements. As a maintainer, I have no idea how I would go about implementing these.

The IP portion is vague. Figured in java we would use InetAddress.getLocalHost().getHostAddress(), with some check to determine if the IP address is ambiguous. The full proposal includes this additional text:

- 127.0.0.1/localhost is considered ambiguous.
- Multiple available IP addresses are considered ambiguous.

I do agree with @yurishkuro that "with stacking of containers, the IP clearly does not meet the uniqueness requirement".

why does IP address (2) have a priority over other possible stable native IDs (3)

What about adjusting the algorithm to allow other non-custom resource detectors to provide a service.instance.id? Currently it says "If the user has provided a service.instance.id, via environment variable, configuration or custom resource detection, this will always be used and honored over generated ids.". We could add a step to the algo to allow built-in resource detectors to provide candidates for service.instance.id. For example, container.id seems like a good fallback if the user has not defined their own.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsuereth any chance you have time to make a couple small tweaks to this PR to reflect this feedback? IMO, this is really close.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I wanted to get the other PRs through (for which this relies on, IMO). Give me a few days to retweak thewording and add more justification on WHY we need this. There's a lot of good concerns to address, and I want to make sure the updated PR addresses all of them.

Unfortunately, can't mark this as draft due to already having one approval.

@arminru arminru added area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory labels Feb 21, 2023
Comment on lines +50 to +51
SDKs are required to follow the following algorithm when generating
`service.instance.id`:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we would merge any requirement like this, we will need to allow for plenty of time for our SDKs to adopt the change and gather feedback on how this works in practice before being able to call it stable.
I think we'll therefore need to carve this out of #3177 and #3202 or we'd otherwise be setting a last-minute change in stone before it's broadly implemented and verified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to update this PR so that the algorithmic part is carved out as its own thing that can be stabilized independently of the meaning of service.instance.id

- If the user has provided a `service.instance.id`, via environment
variable, configuration or custom resource detection, this will
always be used and honored over generated ids.
- When the SDK is running in an environment where a non-ambiguous IP
Copy link
Member

@Oberon00 Oberon00 Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is impossible to detect such environments in general. Or do you have a concrete example?

More importantly, I think the IP cannot be used, because it identifies the host, not the service. Multiple different services could be running with the same IP (on the same host, or in different private networks but still reporting to the same backend). I think the id is supposed to be unique without combining it with service.name.

- When the environment the SDK targets provides a stable identifier
matching the goals of `service.instance.id`, then this may be used.
- When no other source is available the SDK MUST generate an ID. This
ID SHOULD follow version 1, 4 or 5 of RFC 4122.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we define this UUID stuff as part of the algorithm to generate service.instance.id, we should delete it from line 78. I.e. delete the line that reads:

If the service has no inherent unique ID that can be used as the value of this attribute it is recommended to generate a random Version 1 or Version 4 RFC 4122 UUID (services aiming for reproducible UUIDs may also use Version 5, see RFC 4122 for more recommendations).

SDKs are required to follow the following algorithm when generating `service.instance.id`:
- If the user has provided a `service.instance.id`, via environment
variable, configuration or custom resource detection, this will
always be used and honored over generated ids.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
always be used and honored over generated ids.
always be used and honored over generated IDs.

- When the environment the SDK targets provides a stable identifier
matching the goals of `service.instance.id`, then this may be used.
- When no other source is available the SDK MUST generate an ID. This
ID SHOULD follow version 1, 4 or 5 of RFC 4122.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ID SHOULD follow version 1, 4 or 5 of RFC 4122.
ID SHOULD follow version 1, 4 or 5 of [RFC 4122](https://www.ietf.org/rfc/rfc4122.txt) UUIDs.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 16, 2023
@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Mar 24, 2023
@jsuereth jsuereth reopened this Apr 4, 2023
@@ -31,6 +31,7 @@ groups:
examples: ["Shop"]
- id: instance.id
type: string
requirement_level: required
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be required. I understand that generating the ID can be done in an easy way and built into the SDK, but that is not enough of an argument to make this a required attribute IMHO.

@github-actions
Copy link

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@jpkrohling
Copy link
Member

This is being followed up here: open-telemetry/semantic-conventions#311

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:resource Related to the specification/resource directory Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants