-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add example configuration #6
Conversation
Rethinking if I like the name references or not. Why does the collector use this instead of yaml anchors? Is it to allow the feature to be used when writing json for configuration? Feels like unnecessary work on the side of each parser if yaml and/or cue is used and there is built in support to reuse blocks of config. |
schema_url: http://schema.com | ||
|
||
# Limits | ||
limits: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you imagine resources / limits applying to configuration outside the sdk block? If not, shouldn't they be nested under SDK? Alternatively, maybe all the top level fields of the sdk block can be promoted to the top level.
For some context, I originally nested everything under the sdk block because I imagined the scope expanding to configuration of instrumentation later. We currently don't have any examples of instrumentation configuration that is cross-language so its hard to imagine what that might entail, but those options felt distinct from SDK configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, maybe all the top level fields of the sdk block can be promoted to the top level
I would be ok with this idea. It would probably be a good idea to capture some example instrumentation/language specific options in here to get a sense for what that looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example that could be used is in Erlang has a set of options for the "span sweeper" we have in the SDK:
{sweeper, #{interval => integer(),
strategy => end_span | drop,
span_ttl => integer(),
storage_size => integer()}}
Been meaning to update the environment variables to OTEL_ERLANG_*
, but then have to support both for backwards compatibility :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #8 to track. I'm going to go through the otel java agent and try to compile some examples.
ratio: 0.01 | ||
# Meter provider configuration. | ||
meter_provider: | ||
resource: resource/resourceA # references identifier from resources section |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the resource be different in the TracerProvider/MeterProvider/LoggerProvider?
If not (in Erlang I know it is the same between all three) then I think its an argument for making resource top level and maybe not actually defining "providers" in the config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's possible to configure multiple [Tracer|Meter|Logger]Provider for a single application. Will this not be supported by the configuration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize that this example does not account for this currently, maybe this is a documented limitation of configuration usage?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yea, sorry, this was discussed briefly in slack and @jack-berg agreed we should not support that. But if we do support it then yea, need to have provider sections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think configuring multiple [Tracer|Meter|Logger]Providers from a single configuration file is not right. If multiple can exist, then they need to be assigned some sort of identifier and be accessible to application code by that identifier. That concept doesn't exist in the SDK and I don't think we should invent it.
Talked about this in slack, but if someone wants multiple providers, they can create multiple configuration files, and have them parsed programatically:
OpenTelemetry openTelemetry1 = OpenTelemetryConfigurator.configure("./sdk1.yaml");
OpenTelemetry openTelemetry2 = OpenTelemetryConfigurator.configure("./sdk2.yaml");
Then the caller is responsible for wiring the multiple OpenTelemetry instances into their app as they see fit.
One config file = one configured SDK is a good assumption.
As for having a different resource per provider, I'm inclined to not support that in the file config, and therefore make resource a top level option. OpenTelemetry is a unified approach to traces / metrics / logs - the various telemetry signals are supposed to complement each other (examples include trace context being propagated to exemplars and logs). Resource is supposed to be a bag of identifying and descriptive attributes defining the source of the telemetry. Supporting different resources makes it easier to break the notion of complementary signals, which I think we should discourage.
Some additional prior art to consider from the OpenTelemetry Operator:
|
Related to my comment about removing |
I guess that actually follows the environment variable pattern where there is |
This limits the configuration scenarios that can be expressed. How do I say that I want to export to zipkin and otlp for traces, with zipkin every 30 second and otlp every 60 seconds? IMO the configuration should fall out from the programatic configuration of the SDK. Deviating from this means being more likely to run into things that can be expressed programmatically but not in a file. Applied to this topic, it means supporting an array of span processors, where the built in simple and batch span processors accept a mandatory exporter as an argument. |
# SDK? | ||
tracer_provider: # OTEL_PYTHON_TRACER_PROVIDER | ||
meter_provider: # OTEL_PYTHON_METER_PROVIDER | ||
logger_provider: # OTEL_PYTHON_LOGGER_PROVIDER |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... what do these mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see... you'd use these to provide an alternative implementation of the SDK?
Configuring an alternative SDK implementation should probably be outside of scope for a file based config scheme.
@jack-berg is your issue that you'd have to duplicate the otlp configuration if its a case that you want to use the sae OTLP exporter config for traces, metrics and logs but also a zipkin exporter for traces? |
Not sure whether or not you're in favor of the operator's config, but I'm trying to say that it's not a good one to model after since it doesn't reflect how SDKs are actually configured. Whether a single top level exporter:
Or signal specific exporters:
Neither of these reflect the reality that SDKs aren't configured with exporters. SDKs are configured with processors, where some of the built in processors require an exporter as an argument. Means that you can configure an SDK with custom processors that don't export at all, or with multiple processors each with a different exporter attached. We need to preserve these options in a file based config. |
I don't necessarily disagree with this but do think we'll wind up being considered "hard to configure". Maybe that is just how it has to be. In Erlang for the config file I have it so a simple set of top level options are available, so you could for example still configure the interval of trace export and the protocol used, without knowing how this relates to a tracer providers array of span processors. But the ability to configure each processor in the array is preserved by taking precedence over any processor that would be generated based on top level configuration. It is hairy trying to support both simple and detailed configuration, so I can see only choosing one and going with detailed so nothing is lost. |
The idea of supporting simple and detailed is interesting. Suppose we started by being able to express relatively complicated config. We could always add additional simpler options which effectively are just syntactic sugar for the more complex representation they translate too. An argument for targeting complex config is that the environment based config already has a lot of the simple options covered, so restricting the config to those simple options has less utility. |
The ability to configure a different exporter(s) per signal has to be part of the configuration file as it already is covered by environment variables here: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/sdk-environment-variables.md#exporter-selection |
@jack-berg they sort of are though. The env var is But I think I can get behind the strategy that the simple options are covered by the environment variables and the config file essentially maps directly to the SDK. |
- foo | ||
- bar | ||
# Add a simpler view, which configures the drop aggregation for instruments whose name matches "*.server.duration". | ||
- selector: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't selector
under view
. Or, I suppose if it were moved to view
there would also be no need for the view
key since everything under views
would be a view
.
- selector:
instrument_name: "*.server.duration"
aggregation:
name: drop
# Using examples provided in https://github.com/MrAlias/otel-schema/issues/8 | ||
instrumentation: | ||
java: | ||
http: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the same across python&java and could be implemented in other languages as well, so having a language-agnostic section in "instrumentation" would be something I would like seeing.
resources: | ||
resource/resourceA: | ||
# List of enabled resource detectors. Each detector has a name (FQCN for java), and an optional list of excluded keys. Detector name supports wildcard syntax. Detectors are invoked and merged sequentially. Static attributes and schema_url comprise a resource which is merged last. | ||
detectors: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sanketmehta28, @srikanthccv, please give this a look
# TODO: Replace with environment variable when parsing to avoid storing secret in plain text | ||
api-key: 1234 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# TODO: Replace with environment variable when parsing to avoid storing secret in plain text | |
api-key: 1234 | |
# TODO: Replace with environment variable when parsing to avoid storing secret in plain text | |
api-key: ${VENDOR_API_KEY} |
Would this be a possibility, like the collector config allows it today? The only concern I have is that there are already environment variables for configuration, so this might get confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want that because it would mean each implementation checking values for the ${...}
pattern and then replacing it -- and would they then also have to implement defaults, ${KEY:-default}
? However, if Cue is used for the schema then users can, if they choose, use cue
to generate their yaml config from a Cue file and replace env variables.
- name: baggage | ||
# Using examples provided in https://github.com/MrAlias/otel-schema/issues/8 | ||
instrumentation: | ||
java: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java Metric Insight has a fairly complex complication which ideally should be merged into this config file as well:
cc @PeterF778, @trask
@jack-berg yup! |
The propose configuration is a combination of the proposal in the description of #5 and of the kitchen-sink yaml. I've pulled limits, exporters, and resources out of the
sdk
block. I kept the rest of the configuration mostly intact.Fix #5