add example configuration #6

Closed · wants to merge 3 commits
examples/config.yaml (212 additions, 0 deletions)
# This demonstrates a file configuration that falls out naturally from the SDK configuration options defined in the specification.
# The configuration shown is a "kitchen sink" example, with all the available options shown for demonstration purposes.

exporters:
  otlp/exporterA:
    protocols:
      grpc:
        endpoint: http://localhost:4317
        headers:
          # TODO: Replace with environment variable when parsing to avoid storing secret in plain text
          api-key: 1234
Comment on lines +10 to +11

Suggested change
-          # TODO: Replace with environment variable when parsing to avoid storing secret in plain text
-          api-key: 1234
+          # TODO: Replace with environment variable when parsing to avoid storing secret in plain text
+          api-key: ${VENDOR_API_KEY}

Would this be a possibility, like the collector config allows today? The only concern I have is that there are already environment variables for configuration, so this might get confusing.

Collaborator

I wouldn't want that, because it would mean each implementation checking values for the ${...} pattern and then replacing them -- and would they then also have to implement defaults, ${KEY:-default}? However, if CUE is used for the schema, then users can, if they choose, use CUE to generate their YAML config from a CUE file and replace environment variables.
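
For illustration only, a hypothetical sketch of interpolation with a default, using the ${KEY:-default} pattern mentioned above (not something this PR defines):

headers:
  # hypothetical: fall back to "changeme" when VENDOR_API_KEY is unset
  api-key: ${VENDOR_API_KEY:-changeme}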

        compression: gzip
        timeout_millis: 30_000
  logging:

# The resource, which can be used across providers
resources:
  resource/resourceA:
    # List of enabled resource detectors. Each detector has a name (FQCN for java), and an optional list of excluded keys. Detector name supports wildcard syntax. Detectors are invoked and merged sequentially. Static attributes and schema_url comprise a resource which is merged last.
    detectors:

@sanketmehta28, @srikanthccv, please give this a look

      - name: com.domain.resources.CustomResourceProvider
      - name: io.opentelemetry.sdk.extension.resources.*
        excluded_attribute_keys:
          # Exclude process.command_line, which is often verbose and may contain secrets
          - process.command_line
    # List of static resource attribute key / value pairs.
    attributes:
      service.name: my-service
      service.instance.id: 1234
    # The resource schema URL.
    schema_url: http://schema.com

# Limits
limits:
Collaborator

Do you imagine resources / limits applying to configuration outside the sdk block? If not, shouldn't they be nested under SDK? Alternatively, maybe all the top level fields of the sdk block can be promoted to the top level.

For some context, I originally nested everything under the sdk block because I imagined the scope expanding to configuration of instrumentation later. We currently don't have any examples of instrumentation configuration that is cross-language, so it's hard to imagine what that might entail, but those options felt distinct from SDK configuration.

Collaborator Author

> Alternatively, maybe all the top level fields of the sdk block can be promoted to the top level

I would be ok with this idea. It would probably be a good idea to capture some example instrumentation/language specific options in here to get a sense for what that looks like.
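
As a rough sketch, promoting the sdk block's children to the top level might look like this (illustrative only; the field names are the ones already used in this example file):

disabled: false
tracer_provider:
  span_processors:
    - name: batch
      args:
        exporter: otlp/exporterA
meter_provider:
  metric_readers:
    - name: periodic
      args:
        exporter: logging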

Collaborator

An example that could be used: Erlang has a set of options for the "span sweeper" we have in the SDK:

{sweeper, #{interval => integer(),
           strategy => end_span | drop,
           span_ttl => integer(),
           storage_size => integer()}}

https://github.com/open-telemetry/opentelemetry-erlang/tree/main/apps/opentelemetry#span-sweeper

Been meaning to update the environment variables to OTEL_ERLANG_*, but then have to support both for backwards compatibility :(
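
For reference, a purely hypothetical mapping of those sweeper options into an erlang-specific block of this config file (key names and values are illustrative, not an agreed format):

erlang:
  sweeper:
    interval: 60          # seconds between sweeps (illustrative value)
    strategy: end_span    # end_span | drop
    span_ttl: 1800        # illustrative value
    storage_size: 100     # illustrative value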

Collaborator

Opened #8 to track. I'm going to go through the otel java agent and try to compile some examples.

  attributes:
    value_length: 0 # OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT
    count: 128 # OTEL_ATTRIBUTE_COUNT_LIMIT
  spans:
    attributes:
      value_length: 0 # OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
      count: 128 # OTEL_SPAN_ATTRIBUTE_COUNT_LIMIT
    event:
      count: 128 # OTEL_SPAN_EVENT_COUNT_LIMIT
      attributes:
        count: 128 # OTEL_EVENT_ATTRIBUTE_COUNT_LIMIT
    link:
      count: 128 # OTEL_SPAN_LINK_COUNT_LIMIT
      attributes:
        count: 128 # OTEL_LINK_ATTRIBUTE_COUNT_LIMIT

# SDK configuration.
sdk:
  # Whether the SDK is enabled or not.
  disabled: false
  # Configure SDK logging.
  logging:
    level: info
  # Tracer provider configuration.
  tracer_provider:
    resource: resource/resourceA # references identifier from resources section
    # List of span processors, to be added sequentially. Each span processor has a name and args used to configure it.
    span_processors:
      # Add simple span processor configured to export with the logging exporter.
      - name: simple
        args:
          # Simple span processor takes exporter as an arg
          exporter: logging # references identifier from exporters section
      # Add batch span processor configured to export with the OTLP exporter.
      - name: batch
        args:
          # Batch span processor takes exporter as an arg
          exporter: otlp/exporterA # references identifier from exporters section
          # Configure batch span processor batch size and interval settings.
          max_queue_size: 100
          scheduled_delay_millis: 1_000
          export_timeout_millis: 30_000
          max_export_batch_size: 200
    # The sampler. Each sampler has a name and args used to configure it.
    sampler:
      name: parentbased
      args:
        # The parentbased sampler takes root_sampler as an arg, which is itself another sampler.
        root_sampler:
          name: traceidratio
          args:
            ratio: 0.01
  # Meter provider configuration.
  meter_provider:
    resource: resource/resourceA # references identifier from resources section


Can the resource be different in the TracerProvider/MeterProvider/LoggerProvider?

If not (in Erlang I know it is the same between all three), then I think it's an argument for making resource top level and maybe not actually defining "providers" in the config.

Collaborator Author

But it's possible to configure multiple [Tracer|Meter|Logger]Provider for a single application. Will this not be supported by the configuration?

Collaborator Author
@codeboten Nov 4, 2022

I realize that this example does not account for this currently; maybe this is a documented limitation of configuration usage?

Collaborator

Ah yeah, sorry, this was discussed briefly in Slack and @jack-berg agreed we should not support that. But if we do support it, then yeah, we'd need to have provider sections.

Collaborator

I think configuring multiple [Tracer|Meter|Logger]Providers from a single configuration file is not right. If multiple can exist, then they need to be assigned some sort of identifier and be accessible to application code by that identifier. That concept doesn't exist in the SDK and I don't think we should invent it.

Talked about this in Slack, but if someone wants multiple providers, they can create multiple configuration files and have them parsed programmatically:

OpenTelemetry openTelemetry1 = OpenTelemetryConfigurator.configure("./sdk1.yaml");
OpenTelemetry openTelemetry2 = OpenTelemetryConfigurator.configure("./sdk2.yaml");

Then the caller is responsible for wiring the multiple OpenTelemetry instances into their app as they see fit.

One config file = one configured SDK is a good assumption.

As for having a different resource per provider, I'm inclined to not support that in the file config, and therefore make resource a top level option. OpenTelemetry is a unified approach to traces / metrics / logs - the various telemetry signals are supposed to complement each other (examples include trace context being propagated to exemplars and logs). Resource is supposed to be a bag of identifying and descriptive attributes defining the source of the telemetry. Supporting different resources makes it easier to break the notion of complementary signals, which I think we should discourage.
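
To make that concrete, a rough sketch with a single top-level resource and no per-provider resource key (illustrative only):

resource:
  attributes:
    service.name: my-service
sdk:
  tracer_provider:
    # no resource reference needed here
    span_processors: []
  meter_provider:
    # or here
    metric_readers: []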

    # List of metric readers. Each metric reader has a name and args used to configure it.
    metric_readers:
      # Add periodic metric reader configured to export with the logging exporter using the default interval settings.
      - name: periodic
        args:
          # Periodic metric reader takes exporter as an arg
          exporter: logging # references identifier from exporters section
      # Add periodic metric reader configured to export with the otlp exporter every 5_000 ms.
      - name: periodic
        args:
          # Periodic metric reader takes exporter as an arg, which is composed of a name and args used to configure it.
          exporter: otlp/exporterA # references identifier from exporters section
          # Configure periodic metric reader interval.
          interval_millis: 5_000
    # List of views. Each view consists of a selector defining criteria for which instruments are selected, and a view defining the resulting metric.
    views:
      # Add a "kitchen sink" view, using all selector fields, and using all configurable aspects of the view.
      - selector:
          # Select instruments with this name, including wildcard matching.
          instrument_name: my-instrument
          # Select instruments with this type.
          instrument_type: COUNTER
          # Select instruments with this meter name.
          meter_name: my-meter
          # Select instruments with this meter version.
          meter_version: 1.0.0
          # Select instruments with this meter schema url.
          meter_schema_url: http://example.com
        view:
          # Change the metric name.
          name: new-instrument-name
          # Change the metric description.
          description: new-description
          # Change the aggregation. In this case, use an explicit bucket histogram with bucket boundaries [1.0, 2.0, 5.0].
          aggregation:
            name: explicit_bucket_histogram
            args:
              bucket_boundaries: [1.0, 2.0, 5.0]
          # List of attribute keys to retain. Keys included on measurements and not in this list will be ignored.
          attribute_keys:
            - foo
            - bar
      # Add a simpler view, which configures the drop aggregation for instruments whose name matches "*.server.duration".
      - selector:
Collaborator

Why isn't selector under view? Or, I suppose, if it were moved under view there would also be no need for the view key, since everything under views would be a view.

- selector:
    instrument_name: "*.server.duration"
  aggregation:
    name: drop

          instrument_name: "*.server.duration"
        view:
          aggregation:
            name: drop
  # Logger provider configuration.
  logger_provider:
    resource: resource/resourceA # references identifier from resources section
    # List of log processors, to be added sequentially. Each log processor has a name and args used to configure it.
    log_record_processors:
      # Add batch log processor configured to export with the OTLP exporter.
      - name: batch
        args:
          # Batch log processor takes exporter as an arg
          exporter: otlp/exporterA # references identifier from exporters section
          # Configure batch log processor batch size and interval settings.
          max_queue_size: 50
          scheduled_delay_millis: 1_000
          export_timeout_millis: 30_000
          max_export_batch_size: 200
  # List of context propagators. Each propagator has a name and args used to configure it. None of the propagators here have configurable options so args are not demonstrated.
  propagators:
    - name: tracecontext
    - name: baggage
# Using examples provided in https://github.com/MrAlias/otel-schema/issues/8
instrumentation:
  java:

JMX Metric Insight has a fairly complex configuration which ideally should be merged into this config file as well:

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/instrumentation/jmx-metrics/javaagent/README.md

cc @PeterF778, @trask
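
One hypothetical way to fold that in, with the rule shape loosely modeled on the linked README (untested; the jmx_metrics key and the metric name are made up for illustration):

instrumentation:
  java:
    jmx_metrics:
      rules:
        - bean: java.lang:type=Threading
          mapping:
            ThreadCount:
              metric: my.app.jvm.thread.count
              type: updowncounter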

    http:

This is the same across Python and Java, and could be implemented in other languages as well, so having a language-agnostic section in "instrumentation" is something I would like to see.
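
For example, a hypothetical language-agnostic block (the common key name is made up for illustration; the capture_headers shape is the one used below):

instrumentation:
  common:
    http:
      capture_headers:
        client:
          request:
            enabled: true
            headers:
              - "User-Agent"
  java: {}    # language-specific overrides would still live here
  python: {}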

      capture_headers:
        client:
          request: # otel.instrumentation.http.capture-headers.client.request
            enabled: true
            headers:
              - "User-Agent"
          response: # otel.instrumentation.http.capture-headers.client.response
            enabled: false
        server:
          request: # otel.instrumentation.http.capture-headers.server.request
            enabled: false
          response: # otel.instrumentation.http.capture-headers.server.response
            enabled: true
            headers:
              - "Content-Type"
  python:
    falcon:
      enabled: false # OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
      excluded_urls: # OTEL_PYTHON_FALCON_EXCLUDED_URLS
        - http://falcon.com/healthcheck
    fastapi:
      excluded_urls: # OTEL_PYTHON_FASTAPI_EXCLUDED_URLS
        - http://fastapi.com/healthcheck
    django:
      request_attributes: # OTEL_PYTHON_DJANGO_TRACED_REQUEST_ATTRS
        - "content_type"
        - "path_info"
# language specific options
python:
  context: "contextvars" # OTEL_PYTHON_CONTEXT
  id_generator:
    name: "random" # OTEL_PYTHON_ID_GENERATOR
    # possibly an id generator could require arguments

  # what would setting these mean for SDK
  # configuration? would it be parsed by an alternate
  # SDK?
  tracer_provider: # OTEL_PYTHON_TRACER_PROVIDER
  meter_provider: # OTEL_PYTHON_METER_PROVIDER
  logger_provider: # OTEL_PYTHON_LOGGER_PROVIDER
Collaborator

Hm... what do these mean?

Collaborator

Oh I see... you'd use these to provide an alternative implementation of the SDK?

Configuring an alternative SDK implementation should probably be out of scope for a file-based config scheme.


  logging_auto_instrumentation_enabled: false # OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED

  excluded_url: # OTEL_PYTHON_EXCLUDED_URLS
    - http://excludeddomain.com/healthcheck
java:
  resource_providers: # otel.java.enabled.resource-providers
    enabled:
      - providerA
    disabled:
      - providerB
...