Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow users to pass credentials through environment variables #1909

Closed
merelcht opened this issue Oct 6, 2022 · 6 comments · Fixed by #2178 or #2256
Closed

Allow users to pass credentials through environment variables #1909

merelcht opened this issue Oct 6, 2022 · 6 comments · Fixed by #2178 or #2256
Assignees

Comments

@merelcht
Copy link
Member

merelcht commented Oct 6, 2022

Description

In the new config loader (#1868), allow users to pass credentials through environment variables using the OmegaConf resolver oc.env: https://omegaconf.readthedocs.io/en/latest/custom_resolvers.html#oc-env

Context

The "default" way for Kedro to deal with credentials is through the credentials.yml file. @AntonyMilneQB has explained very thoroughly how this works and why it's problematic here: #1646

Over time many users have flagged that they want to be able to pass credentials through environment variables. Currently, this is possible if users use the TemplatedConfigLoader, but ideally this functionality would be available by default.

Insight 11 of the configuration research also showed that users are currently storing credentials in environment variables: #1847

Implementation

  • The new configloader will use OmegaConf instead of anyconf Add a new configloader with OmegaConf as replacement for anyconf  #1868

  • The built-in resolvers will be turned off

  • In this task add logic to load credentials using OmegaConf with the oc.env resolver turned on. This might mean separating the credentials loading logic so that the resolver for that loading process hasn't been turned off. If this is impossible or really difficult, we'll allow env variables for all config, but that should be discussed.

  • Document how should users pass credentials - as current way to do it is via `TemplatedConfigLoader

@merelcht merelcht added this to the Configuration overhaul milestone Oct 6, 2022
@noklam
Copy link
Contributor

noklam commented Oct 7, 2022

One question I have it, why TemplatedConfigLoader isn't the default already? If this ability to use resolver (using environment variable like a templated variable) is enabled in the basic ConfigLoader, it feels weird that a user had to switch to another class to use a template variable that isn't an environment variable.

Effectively, people can then just use the environment variable as a template variable and bypass the TemplatedConfigLoader most of the time.

Would this still work with Jinja2?

@merelcht
Copy link
Member Author

merelcht commented Oct 7, 2022

That's a great question @noklam and my preference would be to make the TemplatedConfigLoader the default (with some adjustments) for the 0.19 series. I think the main reason for it not being the default is that it was a contribution from outside of the core team. See the history of config loader classes:

Screenshot 2022-08-02 at 12 18 07

My opinion with regards to Jinja2 is that this should not be part of the default functionality, but we should allow users to make use of Jinja2 if they need to (see also insight 5 from the research #1847). The way I envision it is that the default config loader allows users to do (basic) templating and let them use environment variables for credentials. They can then extend the default (or we provide an extended version) that has full Jinja2 support, because currently TemplatedConfigLoader allows for some Jinja stuff but not all.

I will make sure to talk about this in the config workshop in a couple of weeks.

@szczeles
Copy link
Contributor

@MerelTheisenQB That's a great idea to move to the config framework that supports env variables! In kedro-kubeflow plugin we provide custom EnvTemplatedConfigLoader, for example, to fulfill a requirement of adjusting the docker image name to the current commit when running the pipeline in Kubeflow Pipelines environment (see https://github.com/getindata/kedro-kubeflow/blob/0.7.3/kedro_kubeflow/context_helper.py#L13)

Having this solved out-of-the-box by OmegaConf looks really neat!

@merelcht
Copy link
Member Author

It's great to hear you like this idea @szczeles 😄 I'll check out the custom config loader you made! Any other features you'd like to see with regards to configuration?

@szczeles
Copy link
Contributor

@MerelTheisenQB Thanks for asking! I've added 2 more comments with challenges that we faced with the current configuration parsing logic during Kedro deployments:

@merelcht
Copy link
Member Author

Left to do: add documentation to explain this functionality.

@merelcht merelcht reopened this Jan 20, 2023
@merelcht merelcht mentioned this issue Jan 23, 2023
7 tasks
@merelcht merelcht assigned merelcht and unassigned jmholzer Jan 23, 2023
@merelcht merelcht linked a pull request Jan 24, 2023 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment