Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[service/extensions] extension lifecycle order #8733

Closed

Conversation

atoulme
Copy link
Contributor

@atoulme atoulme commented Oct 24, 2023

Description:
Enforce order of start and shutdown of extensions according to configuration

Link to tracking Issue:
Fixes issue #8732

Testing:
Unit tests

@atoulme atoulme requested review from a team and bogdandrutu October 24, 2023 03:53
Copy link
Member

@yurishkuro yurishkuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@codecov
Copy link

codecov bot commented Oct 24, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
service/extensions/extensions.go 83.89% <100.00%> (+1.17%) ⬆️

... and 6 files with indirect coverage changes

📢 Thoughts on this report? Let us know!.

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to hold off until discussion in #8732 is resolved

@atoulme
Copy link
Contributor Author

atoulme commented Oct 25, 2023

I received a comment that I should stop extensions in reverse order that they are started, I'll push that. After that, we can discuss the merits of this PR.

@yurishkuro
Copy link
Member

Another implementation option discussed on the issue is to build support for extension dependencies in the graph/ module. Then an extension can programmatically declare a dependency on another extension and the graph would guarantee correct execution order. This would be a larger change, but preferable over explicit config order where users need to worry about listing extensions in the right order.

@dmitryax
Copy link
Member

Another implementation option discussed on the issue is to build support for extension dependencies in the graph/ module. Then an extension can programmatically declare a dependency on another extension and the graph would guarantee correct execution order. This would be a larger change, but preferable over explicit config order where users need to worry about listing extensions in the right order.

Makes sense, but it would require another predefined way to specify dependencies on the extensions in the configuration interface. The current dependencies topology is constructed directly from the pipeline YAML definition. Introducing an additional source for building the graph seems like an excessive complication of the user interface. And some extensions naturally do not provide any capabilities to other components, only to the end-user. I think it's ok to keep it simple like implemented in this PR going forward.

@yurishkuro
Copy link
Member

Introducing an additional source for building the graph seems like an excessive complication of the user interface.

Don't get me wrong, I am happy with this PR. But when we talk about complicating user interface, I think it's important to remember which users are exposed to such interface. With dependencies expressed programmatically, a very small number of users who write their own extensions are affected. With dependencies expressed via config order, all users of the collector are exposed to the notion that there are dependencies between extensions, even though there's very little they can actually do about it. E.g. jaeger_query extension simply will never work if it's placed ahead of jaeger_storage in the configuration. So in my documentation I will have to mention "always put it last", even though users didn't really need to be exposed to this accidental complexity.

@dmitryax
Copy link
Member

@yurishkuro, how do you envision a programmatic way to define the dependencies? Something like an optional interface for extensions to define a dependency on another extension interface? I believe that doesn't require integration with the pipeline graph. We would still start all the extensions before the pipelines.

@yurishkuro
Copy link
Member

how do you envision a programmatic way to define the dependencies?

I'd add an optional argument dependsOn []component.ID here:

func NewFactory(
cfgType component.Type,
createDefaultConfig component.CreateDefaultConfigFunc,
createServiceExtension CreateFunc,
sl component.StabilityLevel) Factory {

I'm not sure why these constructor APIs are currently not made extensible by taking args as a struct instead of fixed positional args.

@dmitryax
Copy link
Member

I'd add an optional argument dependsOn []component.ID here

In that case, you can only define a hardcoded dependency on a particular component. There should be a way to define a dependency on a class of extensions like storage or encoding, which is defined by an interface an extension implements.

@yurishkuro
Copy link
Member

In that case, you can only define a hardcoded dependency on a particular component. There should be a way to define a dependency on a class of extensions like storage or encoding, which is defined by an interface an extension implements.

I personally would avoid duck-typing dependencies, it's better to have explicit ones by ID. There is only one pool of extensions, so what if I have some components depending on one kind of encoding extension and other components depending on another? When they refer to those by ID there is no ambiguity, but duck-typing only works when there is exactly one instance defined for a given interface.

@dmitryax
Copy link
Member

Extensions should provide the functionality to components via interfaces like storage. Components dependent on that interface can choose any storage extension even if it's implemented outside of OTel core/contrib repos. Choosing a particular storage implementation is exposed to the user with a config option. Having a hardcoded set of IDs means that custom extensions of particular classes cannot be implemented outside, and every extension dependent on a particular class of extension has to maintain a set of all instances, e.g., each time a new storage/encoding extension is added, all dependent components have to be updated.

There is only one pool of extensions, so what if I have some components depending on one kind of encoding extension and other components depending on another?

They should be two classes of extensions that implement two different (or additional) interfaces.

@yurishkuro
Copy link
Member

@dmitryax I didn't mean that the dependency has to be hardcoded: for some extensions it may be sufficient, others may need to express their dependencies after reading their own config. That means the dependencies should not be exposed via NewFactory, but as part of CreateExtension constructor. You're right, in this case a runtime interface check is a good solution - here's a prototype #8768 (to be combined with this PR for the actual ordered start/shutdown)

@@ -77,7 +82,8 @@ func (bes *Extensions) NotifyPipelineReady() error {
func (bes *Extensions) NotifyPipelineNotReady() error {
// Notify extensions in reverse order.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this accurate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's not. @atoulme can you please address this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you like me to address this? It feels like this comment needs to be removed.

@dmitryax
Copy link
Member

@yurishkuro, I think we can merge this one and rebase #8768 on top of it since it won't cause any breaking changes, right?

@yurishkuro
Copy link
Member

I think we can merge this one and rebase #8768 on top of it since it won't cause any breaking changes, right?

@dmitryax my PR is already based on this one (it includes the original commits). I don't have a strong objection to merging each one separately, but it means the 1st PR will introduce one way of ordering (based purely on config order) and 2nd will override that in favor of the order defined by the DependentExtension interface (with other order still being random). In theory it's possible to support both at the same time: by default start with config-based order as documentation implies, but if DependentExtension interfaces are detected in some extensions move them to the end and apply topological sort (seems a bit complicated). I think just the 2nd approach is cleaner - no order guarantees unless explicitly requested via interface.

@dmitryax
Copy link
Member

dmitryax commented Nov 1, 2023

I think just the 2nd approach is cleaner - no order guarantees unless explicitly requested via interface.

Ok, makes sense to me. Especially if that simplifies the implementation

dmitryax pushed a commit that referenced this pull request Nov 2, 2023
**Description**:
Enforce order of start and shutdown of extensions according to their
internally declared dependencies

**Link to tracking Issue**:
Resolves #8732

**Motivation**:
This is an alternative approach to #8733 which uses declaration order in
the config to start extensions. That approach (a) enforces order when
it's not always necessary to enforce, and (b) exposes unnecessary
complexity to the user by making them responsible for the order.

This PR instead derives the desired order of extensions based on the
dependencies they declare by implementing a `DependentExtension`
interface. That means that extensions that must depend on others can
expose this interface and be guaranteed to start after their
dependencies, while other extensions can be started in arbitrary order
(same as happens today because of iterating over a map).

The extensions that have dependencies have two options to expose them:
1. if the dependency is always static (e.g. `jaeger_query` extension
depending on `jaeger_storage` as in the OP), the extension can express
this statically as well, by returning a predefined ID of the dependent
extension
2. in cases where dependencies are dynamic, the extension can read the
names of the dependencies from its configuration.

The 2nd scenario is illustrated by the following configuration. Here
each complex extension knows that it needs dependencies that implement
`storage` and `encoding` interfaces (both existing APIs in collector &
contrib), but does not know statically which instances of those, the
actual names are supplied by the user in the configuration.

```yaml
extensions:
  complex_extension_1:
    storage: filestorage
    encoding: otlpencoding
  complex_extension_2:
    storage: dbstorage
    encoding: jsonencoding
  filestorage:
    ...
  dbstorage:
    ...
  otlpencoding:
  jsonencoding:
```

**Changes**:
* Introduce `DependentExtension` optional interface 
* Change `Extensions` constructor to derive the required order using a
directed graph (similar to pipelines)
* Inherited from #8733 - use new ordered list of IDs to
start/stop/notify extensions in the desired order (previously a map was
used to iterate over, which resulted in random order).
* Tests

**Testing**:
Unit tests

---------

Signed-off-by: Yuri Shkuro <[email protected]>
Co-authored-by: Antoine Toulme <[email protected]>
@dmitryax
Copy link
Member

dmitryax commented Nov 2, 2023

Superseded by #8768

@dmitryax dmitryax closed this Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants