[Configuration Management] Allow config in config or config in code. #3808
Replies: 8 comments 1 reply
-
Prototype from @limdauto a few years ago along the same lines: https://web.archive.org/web/20210921071139/https://kedrozerotohero.com/experiments/define-data-catalog-using-python
-
@benhorsburgh also had a similar idea about marshaling / unmarshaling parameters with Pydantic
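For readers unfamiliar with the idea, here is a minimal sketch of unmarshaling a parameters dictionary into a typed object and marshaling it back. It uses the standard-library dataclasses module as a stand-in for Pydantic, and ModelParams, unmarshal, and marshal are hypothetical names, not taken from Kedro or from the idea's original author.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any


@dataclass(frozen=True)
class ModelParams:
    """Hypothetical parameter group; field names are illustrative only."""

    learning_rate: float
    n_estimators: int = 100


def unmarshal(raw: dict[str, Any]) -> ModelParams:
    """Build a typed object from a plain dict, rejecting unknown keys."""
    known = {f.name for f in fields(ModelParams)}
    unknown = set(raw) - known
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return ModelParams(**raw)


def marshal(params: ModelParams) -> dict[str, Any]:
    """Convert the typed object back to the dict form a YAML file would hold."""
    return asdict(params)


# Missing optional fields fall back to the defaults declared on the dataclass.
params = unmarshal({"learning_rate": 0.1})
```

Unlike Pydantic, plain dataclasses do not validate or coerce field types, so this only catches misspelled or unexpected keys.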
-
Yes, this could also at the same time be used to unpack parameters into a kedro node so users have a way to avoid writing:
-
Thanks for opening this, @sheldontsen-qb! This has come up a few times, so it's good to start collecting some use cases. I'm moving this to Discussions to also link it to a similar one we had not so long ago, #3788.
-
Also reposting this from @datajoely: https://sre.google/workbook/configuration-specifics/
-
Hello folks, any update on this?
-
Maybe also worth looking into OmegaConf's structured configs, although I'm not sure how usable they are in this case.
-
This isn't directly related to the topic, but I think it's interesting. Essentially, it tries to merge config in YAML and config in code. You could probably extend this via a hook or something similar that converts a dataclass into the dictionary form.

    from typing import Any, Callable

    from kedro.config import OmegaConfigLoader

    # CATALOG is assumed to be a dict of catalog entries defined in code.


    class CustomConfigLoader(OmegaConfigLoader):
        def __init__(
            self,
            conf_source: str,
            env: str | None = None,
            runtime_params: dict[str, Any] | None = None,
            *,
            config_patterns: dict[str, list[str]] | None = None,
            base_env: str | None = None,
            default_run_env: str | None = None,
            custom_resolvers: dict[str, Callable] | None = None,
            merge_strategy: dict[str, str] | None = None,
        ):
            super().__init__(
                conf_source=conf_source,
                env=env,
                runtime_params=runtime_params,
                config_patterns=config_patterns,
                base_env=base_env,
                default_run_env=default_run_env,
                custom_resolvers=custom_resolvers,
                merge_strategy=merge_strategy,
            )
            # Entries from CATALOG override YAML entries with the same name.
            self["catalog"] = {**self["catalog"], **CATALOG}
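The dataclass-to-dictionary conversion mentioned above could look roughly like the following. This is a sketch only: CsvDatasetConfig, its fields, CODE_CATALOG, and YAML_CATALOG are hypothetical stand-ins, and the merge mirrors the {**self["catalog"], **CATALOG} line rather than any Kedro hook API.

```python
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class CsvDatasetConfig:
    """Hypothetical typed description of one catalog entry."""

    filepath: str
    type: str = "pandas.CSVDataset"  # illustrative dataset type string
    load_args: dict[str, Any] = field(default_factory=dict)


# Entries defined in code, keyed by dataset name.
CODE_CATALOG = {
    "companies": CsvDatasetConfig(filepath="data/01_raw/companies.csv"),
}

# Stand-in for what a config loader would return from YAML.
YAML_CATALOG = {
    "reviews": {"type": "pandas.CSVDataset", "filepath": "data/01_raw/reviews.csv"},
}

# Convert each dataclass to plain-dict form, then merge.
# Code-defined entries win when a name appears in both sources.
CATALOG = {name: asdict(cfg) for name, cfg in CODE_CATALOG.items()}
merged = {**YAML_CATALOG, **CATALOG}
```

The IDE can then autocomplete and type-check the code-defined entries, while the loader still sees plain dictionaries.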
-
Description
Was just thinking that it would be great to let users pick between config in code vs config in yaml. What I mean by this is:
It's not too many lines of code changed (I monkeypatched a few files just to check), but it would add another dimension of flexibility in how to use kedro, giving the option back to users to decide how they would like to manage their configuration.
Context
Right now I cannot use dataclasses to manage my configuration, which is a reasonable pattern to want to use. I've seen some people prefer to fully leverage the IDE, while others hold that there should be a clear separation between parameters and code. I believe kedro should offer this type of flexibility to end users while still prescribing a preferred default.
Possible Implementation
I did monkeypatch a few files and got a small prototype working a while back, basically at the point where kedro does a catalog.load for the string. For starters, we can allow the following:
Changes can be made here: kedro/kedro/runner/runner.py, line 494 in 44817b8
Note the last elif basically means anything goes through, so I can define any object in any function used by a node and it cleanly gets passed through. Limiting to dataclasses also seems incomplete.
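A minimal sketch of the dispatch described above, written as a type-to-loader mapping. Everything here is hypothetical: FakeCatalog stands in for a data catalog with a load method, and resolve_input, LOADERS, and TrainConfig are illustrative names, not part of kedro's runner.

```python
from dataclasses import dataclass
from typing import Any, Callable


class FakeCatalog:
    """Hypothetical stand-in for a data catalog; only load() is modeled."""

    def __init__(self, data: dict[str, Any]) -> None:
        self._data = data

    def load(self, name: str) -> Any:
        return self._data[name]


@dataclass
class TrainConfig:
    """Hypothetical config object defined in code."""

    epochs: int = 10


# Ordered (type, loader) pairs; the first matching type wins.
LOADERS: list[tuple[type, Callable[[FakeCatalog, Any], Any]]] = [
    (str, lambda catalog, name: catalog.load(name)),  # dataset name: load it
    (object, lambda catalog, obj: obj),  # catch-all: pass the object through
]


def resolve_input(catalog: FakeCatalog, node_input: Any) -> Any:
    """Return the value a node function would receive for one input."""
    for expected_type, loader in LOADERS:
        if isinstance(node_input, expected_type):
            return loader(catalog, node_input)
    # Only reachable if the catch-all entry is removed from LOADERS.
    raise TypeError(f"No loader for {type(node_input).__name__}")


catalog = FakeCatalog({"raw_data": [1, 2, 3]})
loaded = resolve_input(catalog, "raw_data")
passed = resolve_input(catalog, TrainConfig(epochs=5))
```

Because the pairs live in a plain list, a project could insert its own (type, loader) entry ahead of the catch-all without touching the dispatch function.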
To make it flexible, you could move this mapping of object to load_func to settings.py so users can always handle this themselves instead of putting if..elif in the kedro codebase.
Possible Alternatives
N/A