Enable sampling based on upstream sampling decision #1200
Conversation
@@ -31,27 +31,31 @@ Microsoft.ApplicationInsights.DataContracts.ExceptionTelemetry.ItemTypeFlag.get
Microsoft.ApplicationInsights.DataContracts.PageViewPerformanceTelemetry.ItemTypeFlag.get -> Microsoft.ApplicationInsights.DataContracts.SamplingTelemetryItemTypes
Microsoft.ApplicationInsights.DataContracts.PageViewTelemetry.ItemTypeFlag.get -> Microsoft.ApplicationInsights.DataContracts.SamplingTelemetryItemTypes
Microsoft.ApplicationInsights.DataContracts.RequestTelemetry.ItemTypeFlag.get -> Microsoft.ApplicationInsights.DataContracts.SamplingTelemetryItemTypes
Microsoft.ApplicationInsights.DataContracts.ISupportAdvancedSampling.IsSampledOutAtHead.get -> bool
Changed bool IsSampledOutAtHead (introduced in beta-1) to the more elaborate and extensible SamplingDecision ProactiveSamplingDecision.
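For context, a minimal sketch of what the reworked shape could look like, inferred from the names in this diff and the snippet further down; only the members discussed here are shown, and the SampledOut value is an assumption:

```csharp
namespace Microsoft.ApplicationInsights.DataContracts
{
    // Extendable replacement for the beta-1 bool IsSampledOutAtHead.
    public enum SamplingDecision
    {
        None = 0,    // no proactive decision has been made yet
        SampledIn,   // the item must be kept
        SampledOut,  // the item may be dropped (assumed third member)
    }

    public interface ISupportAdvancedSampling
    {
        // Other members (e.g. ItemTypeFlag) are omitted from this sketch.
        SamplingDecision ProactiveSamplingDecision { get; set; }
    }
}
```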
@@ -1,4 +1,4 @@
#pragma warning disable CA1716, 612, 618 // Namespace naming, obsolete TelemetryConfiguration.Active
These tests broke on my machine in VS 2019; I fixed them by replacing TelemetryConfiguration.Active usage with a config instance created per test and carefully disposed.
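A rough sketch of the per-test pattern being described (the stub channel and test body are illustrative, not copied from this PR):

```csharp
using System;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class PerTestConfigurationExample
{
    [TestMethod]
    public void TracksEventWithPerTestConfiguration()
    {
        // Each test owns its configuration instead of mutating the shared TelemetryConfiguration.Active.
        using (var configuration = new TelemetryConfiguration())
        {
            configuration.InstrumentationKey = Guid.NewGuid().ToString();
            configuration.TelemetryChannel = new StubTelemetryChannel(); // hypothetical in-memory test channel

            var client = new TelemetryClient(configuration);
            client.TrackEvent("test");
        } // disposing keeps state from leaking between tests
    }
}
```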
@@ -0,0 +1,94 @@
namespace Microsoft.ApplicationInsights
Sampling relies on timestamps, and apparently it used UtcNow; fixed to stabilize the tests.
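The usual way to remove this kind of flakiness is to route timestamps through an injectable source instead of calling UtcNow directly; the class and member names below are hypothetical, not the code in this PR:

```csharp
using System;

// Hypothetical illustration: production code uses the default provider,
// tests pass a fixed time and get deterministic results.
internal class TimeSource
{
    private readonly Func<DateTimeOffset> provider;

    public TimeSource(Func<DateTimeOffset> provider = null)
    {
        this.provider = provider ?? (() => DateTimeOffset.UtcNow);
    }

    public DateTimeOffset Now => this.provider();
}
```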
Please update changelog.md as well.
telemetryItem is ISupportAdvancedSampling supportSamplingTelemetry &&
    supportSamplingTelemetry.ProactiveSamplingDecision == SamplingDecision.None)
{
    supportSamplingTelemetry.ProactiveSamplingDecision = SamplingDecision.SampledIn;
If the item was proactively sampled out by our own algorithm (because the experimental "proactiveSampling" flag is turned on), then this initializer would not run at all (all initializers are skipped for a sampled-out item), and an item we would like to sample in because of Activity.Recorded won't receive the SampledIn status.
We discussed this offline, and it can be addressed either by modifying the sampledOut value in TelemetryClient.Initialize based on Activity.Recorded, or by modifying the original value of sampledOut in HostingDiagnosticsListener.OnBeginRequest in the ASP.NET Core repo.
Non-blocking, as it does not affect the first set of scenarios this PR is trying to address. It may become blocking later in a world of all and only W3C correlation :)
I've played a bit with overriding the sampling decision in TelemetryClient.Initialize and decided that it's not the right place for it:
- There is a back-compatibility requirement that Microsoft.ApplicationInsights.dll should work even if someone (erroneously) uses it without DiagnosticSource.dll, i.e. all calls to Activity.Current should be guarded with a check that Activity is available.
- This override is only needed for a decision made by the auto-collector. But the auto-collector could respect Activity.Recorded itself and save the base SDK from overriding the sampling decision.

So HostingDiagnosticListener should check for Activity.Recorded.
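A hedged sketch of that direction, with illustrative names only (the real change would live in HostingDiagnosticListener in the ASP.NET Core repo):

```csharp
using System.Diagnostics;

internal static class HeadSamplingHelper // illustrative name
{
    // If the upstream W3C trace-context marked the trace as recorded, never sample the
    // request out at head; otherwise keep the proactive sampling decision.
    public static bool ShouldSampleOutAtHead(Activity activity, bool proactiveSamplingSaysSampleOut)
    {
        if (activity != null && activity.Recorded)
        {
            return false;
        }

        return proactiveSamplingSaysSampleOut;
    }
}
```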
This change allows respecting the upstream sampling decision from the W3C trace-context while sampling.
This is applicable in the following scenarios:
This change will not affect cases when the upstream sampling decision is not propagated (the upstream is not W3C-enabled).
The first consumer of this change is Azure Functions: they will enable EventHubs instrumentation and start generating 'batching' operations with links and a sampled-in decision.
EventHub, ServiceBus, and other messaging services and their SDKs support batching, i.e. messages can be batched together to optimize connections and network bandwidth when messages are sent.
It's also a common pattern to process messages in batches: check out EventData[] in the default EventHub function template (a sketch follows below). Users can still do per-message processing and may add custom instrumentation to help with tracing.
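The batch shape in question looks roughly like the default Event Hubs trigger template below; hub, connection, and function names are placeholders:

```csharp
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class EventHubBatchFunction
{
    [FunctionName("ProcessBatch")]
    public static void Run(
        [EventHubTrigger("my-hub", Connection = "EventHubConnection")] EventData[] events,
        ILogger log)
    {
        // One invocation handles a whole batch; tracing would represent it as a single
        // 'batching' operation with links to each message's trace context.
        foreach (EventData message in events)
        {
            log.LogInformation("Processing message at offset {Offset}", message.SystemProperties.Offset);
        }
    }
}
```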
However, we want to provide customers with reasonable tracing features without requiring them to change their processing pattern or customize AI telemetry.
So, for the batching pattern:
A similar problem: when OpenTelemetry/OpenCensus initiates a transaction, it propagates a W3C trace-context that contains a sampling flag. Today the AI SDK ignores it, which means that even though the context is propagated and respected, AI will make a brand-new sampling decision; if both services sample in at a 1% rate, transactions sampled in by both will occur at only a 0.01% rate (1% × 1%).
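For reference, a minimal illustration of where the upstream decision lives: the last field of a W3C traceparent header is the trace-flags byte, and bit 0x01 ("sampled") is the flag the SDK would respect. The parsing below is illustrative, not the SDK's implementation:

```csharp
using System;

internal static class TraceContextFlags // illustrative name
{
    public static bool UpstreamSampledIn(string traceparent)
    {
        // Example: "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
        string[] parts = traceparent.Split('-');
        if (parts.Length < 4)
        {
            return false;
        }

        byte traceFlags = Convert.ToByte(parts[3], 16);
        return (traceFlags & 0x01) != 0;
    }
}
```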