Exclude URLs from Tracing #1060

Open
jakob-o opened this issue Aug 19, 2020 · 59 comments · Fixed by open-telemetry/opentelemetry-java-contrib#1440
Labels
enhancement New feature or request

Comments

@jakob-o

jakob-o commented Aug 19, 2020

Is your feature request related to a problem? Please describe.
As already mentioned here open-telemetry/opentelemetry-specification#173 I'd like to be able to exclude or sample a list of URLs / URL-Patterns from instrumentation. In my case particularly to avoid generating many events from health- and liveness-checks.

Describe the solution you'd like
I opened the issue open-telemetry/opentelemetry-java#1552 to discuss if / how tracing might be disabled from instrumentation.
To my knowledge there currently is no API / SDK method to disable tracing centrally on the context.
If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern.
Any hint to where this would be architecturally appropriately implemented is highly appreciated.
Maybe in the HttpServerTracer?

Describe alternatives you've considered
We already attempted to use the otel.trace.classes.exclude but only succeeded in completely disabling WebMvc instrumentation.

/CC @gabrielthunig @spaletta

@iNikem
Contributor

iNikem commented Aug 21, 2020

If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern.
Any hint to where this would be architecturally appropriately implemented is highly appreciated.
Maybe in the HttpServerTracer

That seems like a reasonable approach :)

@iNikem iNikem added enhancement New feature or request release:after-ga labels Aug 21, 2020
@anuraaga
Contributor

Been trying auto instrumentation in a container lately, and was slightly annoyed myself with tracing of health check. Then a customer trying it out independently had the same feedback - it's exactly the kind of input we're hoping for in trials :) So I may actually mark this required for GA, the UX is impacted a lot with having no control of tracing by URL pattern.

@pavolloffay
Member

I have started looking into this.

Is there a proposed design for this? In OpenTracing this was an instrumentation feature: the instrumentation checked whether the URL matched an exclude pattern, and if so the span wasn't created. However, if the request to the excluded URL went through another instrumentation (or made a downstream call), that would still create a span. The question is whether we want to exclude just specific URLs or the whole trace starting at that URL.

@iNikem
Contributor

iNikem commented Sep 24, 2020

Yes, we have a proposal right in the task's description:

If I understood correctly a (maybe temporary) solution might be to create a non-recording / invalid span in the HTTP instrumentation, which due to the ParentOrElse-Sampler, would lead to ignoring child spans as well, if the request matches a URL pattern.
Any hint to where this would be architecturally appropriately implemented is highly appreciated.
Maybe in the HttpServerTracer?

This should lead to ignoring the whole subtree starting from that SERVER span.
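As a rough illustration of that proposal, the sketch below models the decision in plain Java with no OpenTelemetry dependency. `Decision`, `ExclusionModel`, and the example patterns are hypothetical stand-ins, not agent APIs; the only point is that once the SERVER span for an excluded path is non-recording, a parent-based rule makes every descendant non-recording as well.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of the proposal; none of these names exist in the agent.
enum Decision { RECORD, DROP }

final class ExclusionModel {
    // Assumed example patterns for health and liveness endpoints.
    private static final List<Pattern> EXCLUDED = List.of(
            Pattern.compile("/health(/.*)?"),
            Pattern.compile("/actuator/.*"));

    // Decision for the root SERVER span: drop when the whole request path matches a pattern.
    static Decision forServerSpan(String path) {
        for (Pattern pattern : EXCLUDED) {
            if (pattern.matcher(path).matches()) {
                return Decision.DROP;
            }
        }
        return Decision.RECORD;
    }

    // Parent-based rule: a child span inherits its parent's decision unchanged.
    static Decision forChildSpan(Decision parent) {
        return parent;
    }
}
```

With this model, a database span created while serving `/actuator/health` inherits `DROP` from its SERVER span, so the subtree disappears as a whole rather than span by span.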

@pavolloffay
Member

+1 I think that is the right way to go open-telemetry/opentelemetry-specification#173 (comment). It would also make sense to have a consolidated config property for this.

@iNikem what is the API to create non-recording span? In OT there was sampling.priority=bool tag that could be applied on the span builder https://github.com/opentracing/specification/blob/master/semantic_conventions.yaml#L26

@iNikem
Contributor

iNikem commented Sep 24, 2020

I think the right way is to use one of the factory methods on io.opentelemetry.trace.DefaultSpan.

@pavolloffay
Member

Yeah, I wanted to avoid having two paths of span creation.

@anuraaga
Contributor

Sorry if double-spam, I thought I had already posted this. How about we have a special Sampler itself configured that delegates to the default, except for when the path matches the allow list, then it's ParentOnly? It means we need to refactor to make sure our tracers set attributes on Span.Builder instead of Span - a little annoying but we should have been doing that already so maybe good motivation for it.
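The delegating idea can be sketched without the SDK. In the block below, `ParentState`, `PathSampler`, and `PathAwareSampler` are hypothetical stand-ins (the real SDK `Sampler` interface has a different signature), and "parent-only" is read as: follow the parent's decision and never start a sampled trace at a listed path.

```java
import java.util.List;
import java.util.regex.Pattern;

// Stand-ins for illustration only; the real SDK Sampler interface has a different signature.
enum ParentState { NONE, SAMPLED, NOT_SAMPLED }

interface PathSampler {
    boolean shouldSample(String path, ParentState parent);
}

final class PathAwareSampler implements PathSampler {
    private final PathSampler delegate;
    private final List<Pattern> listedPaths;

    PathAwareSampler(PathSampler delegate, List<Pattern> listedPaths) {
        this.delegate = delegate;
        this.listedPaths = List.copyOf(listedPaths);
    }

    @Override
    public boolean shouldSample(String path, ParentState parent) {
        boolean listed = listedPaths.stream().anyMatch(p -> p.matcher(path).matches());
        if (listed) {
            // Parent-only behaviour: sample only when an already-sampled parent exists.
            return parent == ParentState.SAMPLED;
        }
        // Every other path keeps the default sampler's behaviour.
        return delegate.shouldSample(path, parent);
    }
}
```

This also shows why the path has to be available when the sampling decision is made: the sampler only sees what was set before the span started, which is the motivation for setting attributes on the builder rather than on the span.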

@anuraaga
Contributor

Note that DefaultSpan might get renamed to something that would be naming-wise not a good fit with what we want to do here open-telemetry/opentelemetry-specification#994 (review)

@iNikem
Contributor

iNikem commented Sep 24, 2020

refactor to make sure our tracers set attributes on Span.Builder instead of Span

Yes, we want to do that eventually.

@cemo

cemo commented Dec 20, 2020

Is there a way now to exclude health check traces? I checked the processors but could not find a solution there either.

@iNikem
Contributor

iNikem commented Dec 21, 2020

No, this functionality is not yet implemented.

@irl-segfault

bump. Is there any workaround here in the meantime? Sampling of health checks is not ideal.

@iNikem
Contributor

iNikem commented Aug 4, 2021

bump. Is there any workaround here in the meantime? Sampling of health checks is not ideal.

The only known workaround is to write a custom sampler.

But I have plans to address this issue during the next month or so.

@iNikem
Contributor

iNikem commented Oct 1, 2021

The corresponding Sampler has been implemented in the contrib repo. It can be added to your deployment using the extension mechanism. There are no immediate plans to add that sampler to this distribution, as this requires changes to the OTel Specification, and that requires some effort.

@trask
Member

trask commented Apr 3, 2023

for anyone who would like to work on adding this functionality to the base otel javaagent, here's a very high-level outline:

use a yaml configuration file to dynamically configure the RuleBasedRoutingSampler during startup of the javaagent

use a system property, e.g. -Dotel.sampling.rules.config=..., that users can use to specify the location of the yaml configuration file (later we will have a single yaml configuration file and we can consolidate, see recently merged configuration otep)

check out the jmx-metrics and metric view yaml configuration for some inspiration
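A minimal sketch of the shape this outline suggests, under loud assumptions: only the property name `otel.sampling.rules.config` comes from the outline above. The class `SamplingRules`, the one-rule-per-line file format (used here instead of YAML purely to stay dependency-free), and the first-match-wins evaluation are all hypothetical and are not how the contrib sampler is actually configured.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch; only the system property name is taken from the outline.
final class SamplingRules {
    enum Action { DROP, RECORD_AND_SAMPLE }

    record Rule(Action action, String attribute, Pattern pattern) {}

    static final String LOCATION_PROPERTY = "otel.sampling.rules.config";

    // Parses lines such as "DROP url.path /actuator/.*"; blank lines and '#' comments are skipped.
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            String[] parts = trimmed.split("\\s+", 3);
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            rules.add(new Rule(Action.valueOf(parts[0]), parts[1], Pattern.compile(parts[2])));
        }
        return rules;
    }

    // Loads rules from the file named by the system property; no property means no rules.
    static List<Rule> loadFromSystemProperty() throws IOException {
        String location = System.getProperty(LOCATION_PROPERTY);
        if (location == null) {
            return List.of();
        }
        return parse(Files.readAllLines(Path.of(location)));
    }

    // First matching rule wins; the fallback applies when no rule matches.
    static Action decide(List<Rule> rules, Map<String, String> attributes, Action fallback) {
        for (Rule rule : rules) {
            String value = attributes.get(rule.attribute());
            if (value != null && rule.pattern().matcher(value).matches()) {
                return rule.action();
            }
        }
        return fallback;
    }
}
```

Under these assumptions, starting the JVM with `-Dotel.sampling.rules.config=/path/to/rules` would make `loadFromSystemProperty()` read that file once during startup.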

@vmaleze

vmaleze commented Apr 28, 2023

Based on @jack-berg examples, and the possibility to embed extension directly into the agent, I've created this simple project that provides both the java agent and the docker auto instrumentation that will ignore /health and /metrics calls.

@turesheim

turesheim commented Aug 9, 2023

We have similar requirements and wrote an extension that allows you to dynamically configure what sampler to use and add filtering for span creation. Configuration can be done by simply changing the configuration file while the application is running, or also by using the optional REST service for configuring multiple agents. The service also exposes Prometheus compatible metrics. You might find it useful. See https://github.com/domstolene/da-otel-agent for details.

@trask trask removed the priority:p3 label Aug 23, 2023
@szilaszi

szilaszi commented Sep 8, 2023

I worked around this by looking at how the OpenTelemetryAutoConfiguration class builds the SdkTracerProvider, which is where the sampler is set on the Spring bean.

After seeing that the original bean is conditional, I duplicated the code in my own bean configuration and created a custom sampler, which I use to drop any health-check-related traces.

The caveat is that the actual health check traces reach the sampler with no name, and dropping all spans with no name may drop some unintended traces... For now we are only doing a PoC for a product, so take this as it is :)

Here is the sample for this:

    @Bean
    SdkTracerProvider otelSdkTracerProvider(Environment environment, ObjectProvider<SpanProcessor> spanProcessors,
                                            Sampler sampler, ObjectProvider<SdkTracerProviderBuilderCustomizer> customizers) {
        String applicationName = environment.getProperty("spring.application.name", "application");
        SdkTracerProviderBuilder builder = SdkTracerProvider.builder()
                .setSampler(new HealthCheckExclusionSampler())
                .setResource(Resource.create(Attributes.of(ResourceAttributes.SERVICE_NAME, applicationName)));
        spanProcessors.orderedStream().forEach(builder::addSpanProcessor);
        customizers.orderedStream().forEach((customizer) -> customizer.customize(builder));
        return builder.build();
    }

    /**
     * A sampler implementation that excludes certain health check spans from being sampled.
     * Note: I ignore all spans without a name, reason being that the actuator health checks apparently have no names.
     */
    static class HealthCheckExclusionSampler implements Sampler {

        @Override
        public SamplingResult shouldSample(Context parentContext, String traceId, String name, SpanKind spanKind, Attributes attributes, List<LinkData> parentLinks) {
            if (name.contains("/actuator/health") || name.contains("grpc.health.v1.Health/Check") || name.contains("<unspecified span name>")) {
                return SamplingResult.drop();
            }
            return SamplingResult.recordAndSample();
        }

        @Override
        public String getDescription() {
            return "HealthCheckExclusionSampler";
        }
    }


@scprek
Contributor

scprek commented Sep 28, 2023

We have some services using Micronaut and it has a simple way of excluding HTTP routes https://micronaut-projects.github.io/micronaut-tracing/latest/guide/#http but we have other tech stacks too to deal with.

Also noticed that those links into the contrib project were broken. I think this is the new location.

https://github.com/open-telemetry/opentelemetry-java-contrib/tree/main/samplers/src/main/java/io/opentelemetry/contrib/sampler

@azunna1

azunna1 commented Nov 9, 2023

If you're using a central collector to send traces to your backends, you can filter out those spans - https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/filterprocessor/README.md

@trask
Member

trask commented Nov 10, 2023

linking related: open-telemetry/oteps#240

@shameekagarwal

shameekagarwal commented Nov 17, 2023

my architecture -
app -> otel agent (auto instrumentation) -> otel collector -> observability backend
for now, i wanted to filter out swagger and health check calls (caused due to k8s readiness probe)

solution 1 - filter processor. issue - resulted in orphaned spans
solution 2 - tail sampling processor. till now, looks like it works
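The difference between the two solutions can be illustrated with a small model. `SpanRow` and `FilterComparison` are hypothetical and heavily simplified (real spans carry far more data, and the collector processors are not implemented this way): per-span filtering removes only the matching span, so its children point at a parent that no longer exists, while a per-trace decision removes the whole trace.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified span: parentId is null for a root span,
// and path is null when the span carries no URL attribute.
record SpanRow(String traceId, String spanId, String parentId, String path) {}

final class FilterComparison {
    // Per-span filtering: removes only the spans whose own path matches.
    static List<SpanRow> dropMatchingSpans(List<SpanRow> spans, String excludedPath) {
        return spans.stream()
                .filter(s -> !excludedPath.equals(s.path()))
                .collect(Collectors.toList());
    }

    // Per-trace filtering: removes every span of any trace that contains a matching span.
    static List<SpanRow> dropMatchingTraces(List<SpanRow> spans, String excludedPath) {
        Set<String> excludedTraces = spans.stream()
                .filter(s -> excludedPath.equals(s.path()))
                .map(SpanRow::traceId)
                .collect(Collectors.toSet());
        return spans.stream()
                .filter(s -> !excludedTraces.contains(s.traceId()))
                .collect(Collectors.toList());
    }

    // A span is orphaned when it names a parent that is no longer present.
    static List<SpanRow> orphans(List<SpanRow> spans) {
        Set<String> ids = spans.stream().map(SpanRow::spanId).collect(Collectors.toSet());
        return spans.stream()
                .filter(s -> s.parentId() != null && !ids.contains(s.parentId()))
                .collect(Collectors.toList());
    }
}
```

For a health check trace made of a SERVER span plus one child, `dropMatchingSpans` leaves the child behind as an orphan, whereas `dropMatchingTraces` leaves nothing from that trace.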

@nedcerneckis

Is there any update on this?

I understand the concerns raised by @iNikem about the specification being hard to change but this feature has been requested by numerous people for quite some time now.

I think this is a very important and much-needed feature for a lot of users of OpenTelemetry.

@trask Has anyone tried to contribute to the project with this feature? I would like to take a go at implementing it for the Otel Java agent.

@trask
Member

trask commented Jan 5, 2024

hi @nedcerneckis!

I understand the concerns raised by @iNikem about the specification being hard to change but this feature has been requested by numerous people for quite some time now.

I don't believe we're blocked by specification work, since we can add this initially as an opt-in experimental feature (bypassing the need for a specification)

Has anyone tried to contribute to the project with this feature? I would like to take a go at implementing it for the Otel Java agent.

not yet, that would be great, check out #1060 (comment) for very high-level sketch that could fit well with existing feature set

@kenfinnigan
Member

For Lumigo's distribution I implemented a SamplerCustomizer for use with AutoConfigure, see here.

My approach was to allow URLs to be filtered for both client and server spans, with separate env vars to filter URLs for only client or only server spans. The URLs are defined as an array of regexes, such as [".*/health.*", ".*/actuator.*"]
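A rough sketch of how such a setting could be parsed and applied, not taken from the Lumigo code: `UrlFilters` and its `Kind` enum are hypothetical, and the parser is deliberately simplistic (it extracts quoted entries and does not support escaped quotes inside a regex).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; names and parsing rules are assumptions for illustration.
final class UrlFilters {
    enum Kind { CLIENT, SERVER }

    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    private final List<Pattern> both;
    private final List<Pattern> clientOnly;
    private final List<Pattern> serverOnly;

    UrlFilters(List<Pattern> both, List<Pattern> clientOnly, List<Pattern> serverOnly) {
        this.both = List.copyOf(both);
        this.clientOnly = List.copyOf(clientOnly);
        this.serverOnly = List.copyOf(serverOnly);
    }

    // Extracts the quoted entries of an array literal such as [".*/health.*", ".*/actuator.*"].
    // A null value (setting not provided) yields an empty list.
    static List<Pattern> parseRegexArray(String value) {
        List<Pattern> patterns = new ArrayList<>();
        if (value == null) {
            return patterns;
        }
        Matcher matcher = QUOTED.matcher(value);
        while (matcher.find()) {
            patterns.add(Pattern.compile(matcher.group(1)));
        }
        return patterns;
    }

    // A span is skipped when its URL matches a shared pattern or one specific to its kind.
    boolean shouldSkip(Kind kind, String url) {
        List<Pattern> specific = (kind == Kind.CLIENT) ? clientOnly : serverOnly;
        return matchesAny(both, url) || matchesAny(specific, url);
    }

    private static boolean matchesAny(List<Pattern> patterns, String url) {
        return patterns.stream().anyMatch(p -> p.matcher(url).matches());
    }
}
```

Each of the three lists would be read from its own environment variable with `System.getenv(...)`; the variable names are left out here because the comment above does not state them.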

@nedcerneckis

Thank you very much @trask! Much appreciated for the info.

I'm assuming this needs to be a little more advanced than just excluding certain HTTP endpoints and include other types of rules inside this YAML config file?

@EvertonSA

EvertonSA commented Feb 16, 2024

Based on @jack-berg examples, and the possibility to embed extension directly into the agent, I've created this simple project that provides both the java agent and the docker auto instrumentation that will ignore /health and /metrics calls.

Lifesaver! For me this is the easiest solution, but I only needed the jar file:

stages:
  - get_jar
  - distribute

variables:
  OTEL_LIB_VERSION: 1.32.0

Get Jar:
  stage: get_jar
  image: ghcr.io/vmaleze/opentelemetry-java-ignore-spans:$OTEL_LIB_VERSION
  script:
    - cp /javaagent.jar ./
  artifacts:
    paths:
      - './javaagent.jar'
    expire_in: 1 week

Deploy JAR:
  image: maven:3.6-openjdk-17-slim
  stage: distribute
  rules:
  script:
    - mvn deploy -DskipTests

@apischan

apischan commented Mar 3, 2024

I'm another one waiting for this feature, but I solved the problem from another side (this approach is probably not for everyone).
The task was to hide the actuator health checks (/actuator/health) in the tracing UI. Here is what I did:
I added a filter on the opentelemetry-collector side, using the filter processor to make it skip spans from actuator.

Here is the documentation about processors: https://opentelemetry.io/docs/collector/configuration/#processors
Here is the list of available filters: https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/filterprocessor

Altogether, I now have the following configuration in my project (which I would certainly like to replace with a config property once that is implemented):

config:
  exporters:
      ...

  service:
    pipelines:
      traces:
        receivers: [otlp]
        processors: [batch, filter/drophttp] # do not forget to add processor otherwise it will not work
        exporters: [otlp]

  processors:
    filter/drophttp:
      error_mode: ignore
      traces:
        span:
          - attributes["http.url"] == "/actuator/health"
          - attributes["url.path"] == "/actuator/health"

I would have been grateful to find this comment hours earlier. 😄

@Kortex

Kortex commented Mar 3, 2024

I did the same but I ended up with orphaned spans. How are you dealing with those?

@shameekagarwal

my architecture - app -> otel agent (auto instrumentation) -> otel collector -> observability backend for now, i wanted to filter out swagger and health check calls (caused due to k8s readiness probe)

solution 1 - filter processor. issue - resulted in orphaned spans
solution 2 - tail sampling processor. till now, looks like it works

@Kortex, does tail sampling not work?

@Kortex

Kortex commented Mar 4, 2024

@shameekagarwal I did not try out tail sampling.

Would it be possible to share your configuration?

@shameekagarwal

shameekagarwal commented Mar 4, 2024

i linked my comment, where the code is in a collapsible....expand solution 2

@apischan

apischan commented Mar 4, 2024

So I decided to abandon the filter/drophttp processor, and after hours of debugging I finally ended up with the following configuration (thanks to @shameekagarwal):

  processors:
    tail_sampling:
      policies: [
        {
          name: filter_http_url,
          type: string_attribute,
          string_attribute: {
            key: http.url,
            values: [ /actuator/health, /swagger-ui.*, /v3/api-docs.*, /favicon.ico ],
            enabled_regex_matching: true,
            invert_match: true
          }
        },
        {
          name: filter_url_path,
          type: string_attribute,
          string_attribute: {
            key: url.path,
            values: [ /actuator/health, /swagger-ui.*, /v3/api-docs.*, /favicon.ico ],
            enabled_regex_matching: true,
            invert_match: true
          }
        }
      ]

NOTE: there are two policies because a project can have both spring-web and spring-webmvc tracing enabled, and they put the URL under two different keys, http.url and url.path respectively. (Personally, I decided to disable spring-webmvc tracing.)

otel:
  instrumentation:
    spring-web:
      enabled: true
    spring-webmvc:
      enabled: true

Not sure why @shameekagarwal has the key http.route, but that is not the key in my case (Spring Boot 3.2.2; OpenTelemetry 1.35.0).
Anyway, the good thing is that everything is working now. :)
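Because the path can arrive under either key, anything that matches on it has to look in both places. The sketch below treats span attributes as a plain `Map`; `PathLookup` is hypothetical, and it assumes `http.url` may hold either a bare path or an absolute URL.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; attributes are modelled as a plain string map.
final class PathLookup {
    // Prefers url.path; otherwise derives the path from http.url.
    static Optional<String> requestPath(Map<String, String> attributes) {
        String path = attributes.get("url.path");
        if (path != null) {
            return Optional.of(path);
        }
        String url = attributes.get("http.url");
        if (url == null) {
            return Optional.empty();
        }
        try {
            // URI.getPath() works for "/a/b" as well as "http://host/a/b?x=1".
            return Optional.ofNullable(URI.create(url).getPath());
        } catch (IllegalArgumentException e) {
            // The value was not a parseable URI.
            return Optional.empty();
        }
    }
}
```

Normalising to a single path first means one list of patterns can serve both instrumentations instead of being duplicated per key.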

@EvertonSA

One comment here: tail sampling works absolutely fantastically for almost all the issues here (related to actuator), but I have an SSE stream that can stay open for more than an hour. Tail sampling can't keep up with such a long trace.

@jack-berg
Member

Re-opening until lingering issues are resolved which prevent declarative config from being used with the otel java agent: open-telemetry/opentelemetry-java-contrib#1440 (comment)
