
feat: in-process provider #149

Merged (29 commits, Feb 13, 2024)

Conversation

bacherfl (Contributor)

Part of #139

This PR adds support for an in-process provider, using the flagd.sync API to connect to a sync stream serving the flag configurations specified by the selector (determined by the FLAGD_SOURCE_SELECTOR environment variable).

Currently only the built-in JSONLogic operators are supported, i.e. the custom operators specified here are yet to be implemented; however, it might be better to implement those in a follow-up PR.
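
For context, a rough sketch of how the in-process mode would be selected and the provider constructed. The environment-variable names follow the ones used elsewhere in this thread; the selector value and the registration comment are illustrative assumptions, not the final API:

// Sketch only: selecting the in-process resolver via environment variables.
// "my-selector" is a hypothetical selector value.
using System;
using OpenFeature.Contrib.Providers.Flagd;

Environment.SetEnvironmentVariable("FLAGD_RESOLVER_TYPE", "IN_PROCESS");
Environment.SetEnvironmentVariable("FLAGD_SOURCE_SELECTOR", "my-selector");

// The provider reads its configuration from the environment on construction,
// and is then registered with the OpenFeature SDK as usual
// (e.g. OpenFeature.Api.Instance.SetProvider(provider)).
var provider = new FlagdProvider();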

@bacherfl (Contributor Author) commented Jan 31, 2024

This should be ready for a first round of reviews. @toddbaert, as discussed, we can either leave this PR open and create a separate PR on top of it to implement the custom JSONLogic evaluators, or merge this one without the custom evaluators and deactivate the in-process mode until the additional evaluators have been implemented - whatever you prefer :)

bacherfl marked this pull request as ready for review January 31, 2024 07:51
bacherfl requested a review from a team as a code owner January 31, 2024 07:51
@toddbaert (Member)

Thanks @bacherfl! I will thoroughly review this in the next couple of days!

@toddbaert (Member) commented Feb 1, 2024

Hey @bacherfl, I decided that a nice way to test this would be to implement the Gherkin e2e test suite here for both the RPC and in-process settings. I opened a draft PR here, built on your branch (only test projects added). I was able to get the RPC provider working, but not the in-process one.

You can run in-process e2e tests on this branch with:

docker run -p 8013:9090 ghcr.io/open-feature/sync-testbed:latest

and then in a new terminal:

export FLAGD_RESOLVER_TYPE=IN_PROCESS && dotnet test test/OpenFeature.Contrib.Providers.Flagd.E2e.ProcessTest/

If you want, you can run the RPC tests (which pass) with:

docker run -p 8013:8013 ghcr.io/open-feature/flagd-testbed:latest 

and then in a new terminal:

export FLAGD_RESOLVER_TYPE=RPC && dotnet test test/OpenFeature.Contrib.Providers.Flagd.E2e.RpcTest/

I only got the suites working toward the end of the day, so it could be something I'm missing with my config, but based on my debugging it seems like the correct port/host/resolver is set to match the testbed, and the provider seems to become ready, but evaluations fail (returning the default).

private readonly ICache<string, object> _cache;
private int _eventStreamRetries;
private int _eventStreamRetryBackoff = EventStreamRetryBaseBackoff;
private readonly Metadata _providerMetadata = new Metadata(ProviderName);
Member

It's a really good idea to refactor a lot of code out of this file, nice job. There are a lot of unused imports at the top now, though.

Comment on lines 74 to 85
internal ResolverType ResolverType
{
get => _resolverType;
set => _resolverType = value;
}

internal string SourceSelector
{
get { return _sourceSelector; }
set { _sourceSelector = value; }
}

@toddbaert (Member) commented Feb 1, 2024

In Kubernetes, the ENV_VAR based configs make a lot of sense, but I think we need a way to use the in-process provider configured in code as well, so users who aren't in K8s can use whatever configuration tools they want to configure flagd directly.

I think we should either allow configuration through FlagdConfig (which would mean exposing a public constructor on it, and public fields) or add another constructor on the provider itself which takes a URL and resolver type. Whatever is easier, I think (though I should point out that we also have a public FlagdProvider constructor which takes a FlagdConfig, which is sort of useless now, I think).


+1 for exposing a configuration-based constructor. This could be easily done through a builder. And I can point to the Java [1] & Go flagd [2] providers, which allow both env vars and constructor parameters.

[1] - https://github.com/open-feature/java-sdk-contrib/tree/main/providers/flagd#in-process-resolver
[2] - https://github.com/open-feature/go-sdk-contrib/tree/main/providers/flagd#in-process-resolver
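
For illustration, a rough sketch of what such a code-first configuration could look like, mirroring the Java/Go providers linked above. The builder type and its methods (FlagdConfigBuilder, WithHost, WithResolverType, WithSourceSelector) are hypothetical names, not the API of this PR:

// Hypothetical sketch of a builder-based configuration; none of these builder
// members exist in this PR, they only illustrate the suggested shape.
var config = new FlagdConfigBuilder()              // hypothetical builder type
    .WithHost("localhost")
    .WithPort(8015)
    .WithResolverType(ResolverType.InProcess)      // in-process instead of RPC
    .WithSourceSelector("my-selector")             // hypothetical selector value
    .Build();

// The existing FlagdConfig-based provider constructor could then be reused.
var provider = new FlagdProvider(config);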

using Grpc.Core;
using Value = OpenFeature.Model.Value;

namespace OpenFeature.Contrib.Providers.Flagd


Minor - maybe it's good to namespace in-process vs RPC so that they can have proper access modifiers. In the long run this should help with organizing, especially with the upcoming JSON evaluators, offline-mode file readers, and that sort of logic.


private async void HandleEvents()
{
while (_eventStreamRetries < _config.MaxEventStreamRetries)


I think this needs to be improved. The in-process provider is special, as it only has a single flag sync mechanism (currently the gRPC stream). If we stop reconnecting after a few retries, the flag store will go stale, which is not a desired state. Similar to what we do in other implementations (for example Java [1]), we should use an exponential back-off (up to a limit) and continuously re-attempt to connect until we get a shutdown request. This should also help with debugging, as there will be continuous errors if something is wrong with connectivity (e.g. an API address change), and the provider will reconnect and serve fresh flags if the problem goes away.

[1] - https://github.com/open-feature/java-sdk-contrib/blob/main/providers/flagd/src/main/java/dev/openfeature/contrib/providers/flagd/resolver/process/storage/FlagStore.java#L88
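
A minimal sketch of the suggested behaviour - reconnect indefinitely with a capped exponential back-off, stopping only on shutdown. The stream-consuming delegate is a placeholder for the actual flagd sync call, not this PR's implementation:

// Sketch only: indefinite reconnection with capped exponential back-off.
// consumeSyncStreamAsync stands in for opening the flagd sync stream and
// applying flag updates until the stream breaks.
using System;
using System.Threading;
using System.Threading.Tasks;

async Task RunSyncLoopAsync(Func<CancellationToken, Task> consumeSyncStreamAsync, CancellationToken shutdown)
{
    var backoff = TimeSpan.FromSeconds(1);
    var maxBackoff = TimeSpan.FromMinutes(2);

    while (!shutdown.IsCancellationRequested)
    {
        try
        {
            await consumeSyncStreamAsync(shutdown);   // runs while the stream is healthy
            backoff = TimeSpan.FromSeconds(1);        // stream ended cleanly: reset back-off
        }
        catch (OperationCanceledException) when (shutdown.IsCancellationRequested)
        {
            break;                                    // shutdown requested: stop retrying
        }
        catch (Exception)
        {
            // Stream broke: wait, then retry with a doubled (capped) back-off.
            try { await Task.Delay(backoff, shutdown); } catch (OperationCanceledException) { break; }
            backoff = TimeSpan.FromTicks(Math.Min(backoff.Ticks * 2, maxBackoff.Ticks));
        }
    }
}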

Member

Also worth noting that some gRPC implementations provide connection-state listeners, so you can listen to the underlying connection state and restore the stream when the connection is restored. With this model, the stream retry options etc. are used to control the parameters of the underlying connection retries, and the stream just reconnects accordingly.

@toddbaert (Member) commented Feb 2, 2024

It looks like channel.WaitForStateChangedAsync could perhaps help us here, but it's not available in all our targets 😩


@toddbaert yeah, that's a good suggestion. And I hope that, unlike Java, .NET exposes configuration to control the internal gRPC reconnect behavior.
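
For reference, a sketch of the channel-state idea discussed above, using Grpc.Core's channel.State and WaitForStateChangedAsync (availability depends on the target framework, as noted; the stream-restart delegate is a placeholder, not code from this PR):

// Sketch only: watch the underlying Grpc.Core channel state and reopen the
// sync stream whenever the connection becomes READY again.
using System;
using System.Threading;
using System.Threading.Tasks;
using Grpc.Core;

async Task WatchChannelStateAsync(Channel channel, Func<Task> restartSyncStreamAsync, CancellationToken shutdown)
{
    var lastState = channel.State;
    while (!shutdown.IsCancellationRequested && lastState != ChannelState.Shutdown)
    {
        await channel.WaitForStateChangedAsync(lastState);  // completes on any state change
        lastState = channel.State;

        if (lastState == ChannelState.Ready)
        {
            await restartSyncStreamAsync();                 // connection restored: reopen the stream
        }
    }
}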

var reason = Reason.Default;
if (_flags.TryGetValue(flagKey, out var flagConfiguration))
{
reason = Reason.Static;



private async void HandleEvents()
{
while (_eventStreamRetries < _config.MaxEventStreamRetries)
@toddbaert (Member) commented Feb 6, 2024

By default, while (_eventStreamRetries < _config.MaxEventStreamRetries) evaluates to False, since MaxEventStreamRetries is zero (this means nothing is ever read). Additionally, the retries ENV VAR seems to be ignored without the cache.

EDIT: looking into it more, including the Java impl and the flagd spec, I don't think we should use this configuration var at all for the in-process impl. We should infinitely try to reconnect, as @Kavindu-Dodan says. I will add this point to the flagd spec.

Comment on lines 48 to 52
public void Init()
{
var handleEvents = new Thread(HandleEvents);
handleEvents.Start();
}
Member

This method needs to block until the connection is established, or else SetProvider returns and the READY event fires before the provider is actually READY (and indeed it might never become ready). I would use some kind of latch or callback or something to block until we connect.

As a hacky fix you can add a Thread.Sleep here for 1000 ms and things "work".
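
A rough sketch of the latch idea, as class-member fragments of the resolver: the _connected field and the timeout value are assumptions, and HandleEvents would call _connected.Set() once the stream is established and the initial flag set has been applied:

// Sketch only: block Init() until the first successful sync instead of sleeping.
private readonly ManualResetEventSlim _connected = new ManualResetEventSlim(false);

public void Init()
{
    var handleEvents = new Thread(HandleEvents);
    handleEvents.Start();

    // Wait until HandleEvents signals readiness (with a safety timeout).
    if (!_connected.Wait(TimeSpan.FromSeconds(30)))
    {
        throw new InvalidOperationException("timed out waiting for the initial flag sync");
    }
}

// Inside HandleEvents, after the first flag configuration has been applied:
//     _connected.Set();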


public void Shutdown()
{
throw new NotImplementedException();
@toddbaert (Member) commented Feb 6, 2024

We need to stop the threads and kill the connections here. As it is, with every SetProvider for this provider we have resource leaks (threads and connections).

We need to do the same thing in the RPC version.
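
A rough sketch of what that could look like, assuming a CancellationTokenSource observed by the event/sync loops and a _channel field holding the gRPC connection (both are assumptions about the surrounding class, not code from this PR):

// Sketch only: cooperative shutdown so repeated SetProvider calls don't leak
// threads or connections. _shutdownCts and _channel are assumed fields.
public void Shutdown()
{
    _shutdownCts.Cancel();                              // stop HandleEvents / sync loops
    _channel.ShutdownAsync().GetAwaiter().GetResult();  // close the underlying gRPC connection
    _shutdownCts.Dispose();
}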

}
}

internal RpcResolver(Service.ServiceClient client, FlagdConfig config, ICache<string, object> cache = null)


Minor - from a quick look at the usage, this seems to be unused. If it is used, however, I think it's best to chain to this constructor from the config-based constructor :)

Contributor Author

Good catch - it is used in the unit tests, but I adapted the code to use a constructor chain :)
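
For reference, a minimal sketch of the chaining pattern (BuildClient is a hypothetical helper standing in for however the gRPC client is actually created from the config):

// Sketch only: the config-based constructor chains to the client-based one,
// so the shared initialization lives in a single place.
internal RpcResolver(FlagdConfig config)
    : this(BuildClient(config), config)
{
}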

_evaluator = new JsonEvaluator(config.SourceSelector);
}

internal InProcessResolver(FlagSyncService.FlagSyncServiceClient client, FlagdConfig config)


Minor - please check whether this is used (it seems not, from my view). And if it is, let's use constructor chaining to avoid duplicate code :)

Contributor Author

Good catch - it is used in the unit tests, but I adapted the code to use a constructor chain :)

using OpenFeature.Error;
using OpenFeature.Model;

namespace OpenFeature.Contrib.Providers.Flagd


Minor - the packaging (namespace) could be set to Resolver.InProcess

using Grpc.Core;
using Value = OpenFeature.Model.Value;

namespace OpenFeature.Contrib.Providers.Flagd.RResolver.InProcess
@Kavindu-Dodan commented Feb 8, 2024

Minor - RResolver could be a typo

bacherfl and others added 2 commits February 9, 2024 07:56
Comment on lines +100 to +101
var response = call.ResponseStream.Current;
_evaluator.Sync(FlagConfigurationUpdateType.ALL, response.FlagConfiguration);
Member

I think this was a race - we were signaling the latch before we read, which caused some issues.

Comment on lines 115 to 127
public override Task Initialize(EvaluationContext context)
{
return Task.Run(() =>
{
this._resolver.Init();
this._status = ProviderStatus.Ready;

}).ContinueWith((t) =>
{
this._status = ProviderStatus.Error;
if (t.IsFaulted) throw t.Exception;
});
}
Member

@bacherfl @Kavindu-Dodan I added an initialize function; the old provider was built before this was implemented, so it's no longer done in the constructor.

toddbaert self-requested a review February 9, 2024 21:18
@toddbaert (Member) left a comment

e2e tests are working with some minor noted changes. Thanks @bacherfl


public void Shutdown()
{
throw new System.NotImplementedException();


I think we should handle this, as we now support init and shutdown.

Member

I've implemented this now, see commit here.

cc @bacherfl

Comment on lines 57 to 61
if (_config.CacheEnabled)
{
var handleEvents = new Thread(HandleEvents);
handleEvents.Start();
}


I think this is something interesting to consider for other implementations 🤔

@Kavindu-Dodan left a comment

Left a few improvement comments, but this already looks promising :)
