Add OpenTelemetry sampling conventions #793

jmacd · 2024-03-05T23:46:08Z

Changes

Introduces 3 conventional attributes to describe sampling in OpenTelemetry collection pipelines. This specification refers to OTEP 235.

Related to the specification change (from OTEP 235) in open-telemetry/opentelemetry-specification#3910.

Related to the specification about representing trace context in logs in open-telemetry/opentelemetry-specification#3909.

Prototype for a Collector sampler based on these attributes in open-telemetry/opentelemetry-collector-contrib#29720 and [WIP] open-telemetry/opentelemetry-collector-contrib#24811.

Part of open-telemetry/opentelemetry-specification#1413

Merge requirement checklist

CONTRIBUTING.md guidelines followed.
Change log entry added, according to the guidelines in When to add a changelog entry.
- If your PR does not need a change log, start the PR title with [chore]
schema-next.yaml updated with changes to existing conventions.

model/otel/sampling.yaml

docs/otel/sampling.md

jmacd · 2024-03-06T21:46:32Z

@lmolkova @pyohannes Thank you! I realized from your responses that this repository needs to contain more detail, whereas I had been planning on rolling that detail into open-telemetry/opentelemetry-specification#3910 and now I've put some of that text into this PR.

I created a file in the attributes registry and moved the attribute definitions there.

I am hoping to justify the use of "sampling" as a prefix, for historical reasons. I removed the tag and decided to wait for more feedback before introducing a requirements-level statement, because these attributes are somewhat unusual.

Since spans record tracestate and log records do not, we are proposing two new attributes, useful only for log records, that correspond to fields defined for use in tracestate. I had been using the tag to de-select those two attributes in the span attributes and leave them in the logs attributes. Now, there is just one section called "Sampling attributes" and the docs explain this.

lmolkova

Thanks a lot for the additional context, I still have some questions and suggestions

docs/sampling/README.md

model/registry/sampling.yaml

.chloggen/793.yaml

kentquirk

I'm very happy about this document.

Co-authored-by: Kent Quirk <[email protected]>

jmacd · 2024-03-25T23:49:28Z

Reviewers: I see no reason to continue promoting sampling.priority as a convention, despite its definition in OpenTracing. I found a very different definition for a near-identical concept in the OpenTelemetry contrib-collector probabilistic sampler processor for logs. If we used the logs processor definition, "sampling.priority" would equal a request to sample at a particular percentage, where there is an implicit conjunction. In the terminology of OTEP 250, the priority mechanism is the FIRST in the conjunction (making it possible to drop before evaluating the second).

Therefore, I have revised this PR only to specify sampling.threshold and sampling.randomness.


trask · 2024-03-26T14:16:22Z

> I have revised this PR only to specify sampling.threshold and sampling.randomness.

it looks like you need to regenerate the markdown to fully remove sampling.priority

model/trace/sampling.yaml

model/logs/sampling.yaml

docs/attributes-registry/sampling.md

docs/sampling/README.md

jmacd · 2024-03-27T18:42:52Z

Updated: sampling.priority is really removed, now.
This leaves the question in #793 (review) (and #793 (comment)) about whether we agree to use log record attributes to convey metadata about the collection path.

PeterF778 · 2024-03-29T21:05:30Z

docs/sampling/README.md

+The OpenTelemetry sampling decision is defined in terms of a Threshold
+value and a Randomness value, each containing 56 bits of information.
+
+A constant known as _maximum adjusted count_ (`MaxAdjustedCount`),


It could be just me, but I think that max-something suggests inclusiveness, so this can be confusing. How about AdjustedCountLimit?


docs/sampling/README.md

oertl · 2024-04-02T18:25:44Z

docs/sampling/README.md

+The OpenTelemetry sampling decision is defined in terms of a Threshold
+value and a Randomness value, each containing 56 bits of information.
+
+A constant known as _maximum adjusted count_ (`MaxAdjustedCount`),


I am fine with MaxAdjustedCount. It is inclusive with respect to the adjusted count. However, I understand that it can be a little confusing as it is also used as an exclusive upper limit for the threshold and the random value.
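The inclusive/exclusive distinction discussed here can be illustrated with a minimal sketch. It assumes the OTEP 235 rule that an item is kept when its Randomness is at least its (rejection) Threshold, and that the adjusted count is `2**56 / (2**56 - Threshold)`. The function names are hypothetical and do not come from the PR text.

```python
# Sketch only: names are illustrative, not taken from the PR text.
MAX_ADJUSTED_COUNT = 1 << 56  # 2**56; Threshold and Randomness are 56-bit values


def _check_56_bit(name: str, value: int) -> None:
    # MAX_ADJUSTED_COUNT is an exclusive upper limit for both values.
    if not 0 <= value < MAX_ADJUSTED_COUNT:
        raise ValueError(f"{name} must be in [0, 2**56), got {value}")


def should_sample(threshold: int, randomness: int) -> bool:
    """Keep the item when Randomness is not below the rejection Threshold."""
    _check_56_bit("threshold", threshold)
    _check_56_bit("randomness", randomness)
    return randomness >= threshold


def adjusted_count(threshold: int) -> float:
    """Number of original items each kept item represents."""
    _check_56_bit("threshold", threshold)
    return MAX_ADJUSTED_COUNT / (MAX_ADJUSTED_COUNT - threshold)
```

With a threshold of 0 every item is kept and the adjusted count is 1. The largest valid threshold, `2**56 - 1`, yields an adjusted count of exactly `2**56`, which is the sense in which the constant is inclusive for the adjusted count while remaining exclusive for the threshold and the random value.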

kentquirk

A minor wording clarification, but otherwise this looks good.

docs/sampling/README.md

kentquirk · 2024-04-05T15:11:38Z

docs/sampling/README.md

+### Sampling randomness
+
+When determining the Randomness value from an item of telemetry,
+sampler implementations SHOULD:


Suggested change

-sampler implementations SHOULD:
+sampler implementations SHOULD evaluate the following in order:
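The list that follows "in order" is not quoted in this conversation, so the sketch below only illustrates the shape of an ordered fallback. The order shown (explicit `rv` from tracestate, then a `sampling.randomness` attribute, then the least-significant 56 bits of the TraceID) and the 14-hex-digit encoding of the attribute are assumptions based on OTEP 235, and the function names are hypothetical.

```python
from typing import Optional

RANDOMNESS_MASK = (1 << 56) - 1  # keep the least-significant 56 bits
_HEX_DIGITS = set("0123456789abcdef")


def parse_rv(text: str) -> int:
    """Parse an explicit randomness value: exactly 14 lowercase hex digits."""
    if len(text) != 14 or not set(text) <= _HEX_DIGITS:
        raise ValueError(f"invalid randomness value: {text!r}")
    return int(text, 16)


def randomness_of(
    tracestate_rv: Optional[str],
    randomness_attr: Optional[str],
    trace_id_hex: Optional[str],
) -> Optional[int]:
    """Return the first available Randomness source, or None if there is none."""
    # Assumed order: explicit values win over the TraceID-derived value.
    for explicit in (tracestate_rv, randomness_attr):
        if explicit is not None:
            return parse_rv(explicit)
    if trace_id_hex is not None:
        return int(trace_id_hex, 16) & RANDOMNESS_MASK
    return None
```

The point of evaluating in order is that each source is consulted only when the earlier ones are absent, so two samplers looking at the same item arrive at the same Randomness value.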

github-actions · 2024-04-21T03:18:52Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

lmolkova · 2024-04-22T22:38:40Z

@jmacd do we have a consensus on this PR within the sampling WG?

github-actions · 2024-05-08T03:20:01Z

This PR was marked stale due to lack of activity. It will be closed in 7 days.

github-actions · 2024-05-15T03:20:37Z

Closed as inactive. Feel free to reopen if this PR is still being worked on.

jmacd · 2024-05-31T16:22:23Z

Miscellaneous updates:

Apologies for stalling. I am re-opening this with the intention to start it moving again, next week.

Related: @kalyanaj is working on minor revisions to OTEP 235 based on our analysis of the W3C tracestate level 2 "random" flag.

Related: The semantic conventions here go slightly beyond OTEP 235, which focuses on tracestate and does not explicitly document our approach to logs sampling. The new semantic conventions in this PR could be added back into OTEP 235 as there is no disagreement in the Sampling SIG, or we could just approve them here -- I am referring to the new sampling.randomness and sampling.threshold attributes.

The OTel Collector prototype for this is in the final stage of review. If we don't get this work approved and merged soon, the work done there will be at risk of near-future breaking changes. Please see open-telemetry/opentelemetry-collector-contrib#31894.

Co-authored-by: Kent Quirk <[email protected]>

jmacd · 2024-06-12T23:14:51Z

I will close this PR for now and re-open it soon.
I think we might want to submit another OTEP with some missing details in OTEP 235 and/or see open-telemetry/oteps#261.

In particular, the review for open-telemetry/opentelemetry-collector-contrib#31894 raises the question of whether the behavior adopted in probabilisticsamplerprocessor should become a semantic convention, namely:

If you have a hashing algorithm that constructs some number of bits for a consistent threshold-based approach, and you want to record your sampling decision in a collection-path sampler such as the referenced component, you SHOULD synthesize an rv corresponding to your threshold-based decision (interpreted as a rejection threshold). Together with th, this lets you interpret sampling probability for a wide family of samplers (including the legacy probabilisticsamplerprocessor technique). See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/31894/files#r1637230966.

…rt OTEP 235) (#31894)

**Description:** Creates new sampler modes named "equalizing" and "proportional". Preserves existing functionality under the mode named "hash_seed".

Fixes #31918

This is the final step in a sequence, the whole of this work was factored into 3+ PRs, including the new `pkg/sampling` and the previous step, #31946. The two new Sampler modes enable mixing OTel sampling SDKs with Collectors in a consistent way.

The existing hash_seed mode is also a consistent sampling mode, which makes it possible to have a 1:1 mapping between its decisions and the OTEP 235 randomness and threshold values. Specifically, the 14-bit hash value and sampling probability are mapped into 56-bit R-value and T-value encodings, so that all sampling decisions in all modes include threshold information.

This implements the semantic conventions of open-telemetry/semantic-conventions#793, namely the `sampling.randomness` and `sampling.threshold` attributes used for logs where there is no tracestate.

The default sampling mode remains HashSeed. We consider a future change of default to Proportional to be desirable, because:

1. Sampling probability is the same, only the hashing algorithm changes
2. Proportional respects and preserves information about earlier sampling decisions, which HashSeed can't do, so it has greater interoperability with OTel SDKs which may also adopt OTEP 235 samplers.

**Link to tracking Issue:** Draft for open-telemetry/opentelemetry-specification#3602. Previously #24811, see also open-telemetry/oteps#235

Part of #29738

**Testing:** New testing has been added.

**Documentation:** ✅

---------

Co-authored-by: Juraci Paixão Kröhling <[email protected]>
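One way to picture the 14-bit to 56-bit mapping described above is the following sketch. It is not the collector's code: it assumes the legacy sampler accepts an item when its 14-bit hash is below a scaled acceptance rate, and it zero-fills the low 42 bits of both values for simplicity. All names are hypothetical.

```python
NUM_HASH_BUCKETS = 1 << 14  # the legacy sampler is assumed to use a 14-bit hash
SHIFT = 56 - 14             # move 14 significant bits to the top of 56 bits


def legacy_decision(hashed14: int, scaled_rate: int) -> bool:
    """Assumed legacy rule: accept when the hash falls below the scaled rate."""
    return hashed14 < scaled_rate


def to_randomness_and_threshold(hashed14: int, scaled_rate: int) -> tuple[int, int]:
    """Synthesize 56-bit (randomness, threshold) for a hash-based decision.

    The acceptance test R < T is flipped into the rejection-threshold form
    T' <= R' by using R' = N - 1 - R and T' = N - T, where N is 2**14.
    """
    if not 0 <= hashed14 < NUM_HASH_BUCKETS:
        raise ValueError("hashed14 must be in [0, 2**14)")
    if not 1 <= scaled_rate <= NUM_HASH_BUCKETS:
        raise ValueError("scaled_rate must be in [1, 2**14]")
    randomness = (NUM_HASH_BUCKETS - 1 - hashed14) << SHIFT
    threshold = (NUM_HASH_BUCKETS - scaled_rate) << SHIFT
    return randomness, threshold
```

Because `R < T` holds exactly when `N - 1 - R >= N - T`, comparing the synthesized randomness against the synthesized threshold reproduces the legacy decision, while the threshold alone carries the sampling probability (`scaled_rate / 2**14`) for downstream consumers.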

jmacd added 2 commits March 5, 2024 15:13

Add OpenTelemetry sampling conventions

e767013

chlog

0126c1d

jmacd marked this pull request as ready for review March 5, 2024 23:52

jmacd requested review from a team March 5, 2024 23:52

github-actions bot assigned arminru Mar 5, 2024

lint

8646a41

This was referenced Mar 6, 2024

Stabilize trace context in non-OTLP formats and define casing open-telemetry/opentelemetry-specification#3909

Merged

Update Trace specification to use W3C Trace Context Level 2; add Random flag open-telemetry/opentelemetry-specification#3924

Closed

lmolkova reviewed Mar 6, 2024

pyohannes reviewed Mar 6, 2024

docs/otel/sampling.md

jmacd added 2 commits March 6, 2024 11:40

wip

157e07b

move into registry

f3c5da1

jmacd added 4 commits March 6, 2024 13:54

address intended user for each attribute

1f1ca45

address term implementations

b5a65c4

give user perspective

a4c2068

clarify attributes can be used for spans

b1574bd

lmolkova reviewed Mar 6, 2024

jmacd added 3 commits March 6, 2024 14:18

finish sentence

42e47f9

remove some bits

9badfa4

apply suggestion

7e56498

kentquirk reviewed Mar 7, 2024

.chloggen/793.yaml

kentquirk approved these changes Mar 7, 2024

jmacd and others added 5 commits March 7, 2024 09:05

yamllint

e466de8

Update .chloggen/793.yaml

c50081e

Co-authored-by: Kent Quirk <[email protected]>

merge

73b0571

add a tail-sampler example

4e8870d

toc lint

4a1b1df

jmacd requested a review from tigrannajaryan as a code owner March 7, 2024 17:56

Merge branch 'jmacd/sampling_convs' of github.com:jmacd/semantic-conventions into jmacd/sampling_convs

1cb4153

joaopgrassi reviewed Mar 26, 2024

model/trace/sampling.yaml

model/logs/sampling.yaml

docs/attributes-registry/sampling.md

docs/sampling/README.md

all the way removed

8db652a

PeterF778 reviewed Mar 29, 2024

docs/sampling/README.md

oertl approved these changes Apr 2, 2024

kentquirk approved these changes Apr 5, 2024

github-actions bot added the Stale label Apr 21, 2024

github-actions bot removed the Stale label Apr 23, 2024

jmacd mentioned this pull request Apr 24, 2024

Project Tracking: Sampling open-telemetry/opentelemetry-specification#4012

Open

github-actions bot added the Stale label May 8, 2024

github-actions bot closed this May 15, 2024

jmacd reopened this May 31, 2024

jmacd and others added 2 commits May 31, 2024 09:24

Update docs/sampling/README.md

9ccd8d7

Co-authored-by: Kent Quirk <[email protected]>

Update docs/sampling/README.md

157fdca

Co-authored-by: Kent Quirk <[email protected]>

github-actions bot removed the Stale label Jun 1, 2024

jmacd closed this Jun 12, 2024

jmacd mentioned this pull request Jun 18, 2024

Span Metrics connector support for OTEP 235 probability sampling open-telemetry/opentelemetry-collector-contrib#33632

Open
