Add docs on difference between OmegaConf and OmegaConfigLoader #3352

Merged
merged 8 commits into from
Nov 30, 2023
25 changes: 25 additions & 0 deletions docs/source/configuration/advanced_configuration.md
@@ -10,6 +10,7 @@ This page also contains a set of guidance for advanced configuration requirement
* [How to ensure non default configuration files get loaded](#how-to-ensure-non-default-configuration-files-get-loaded)
* [How to bypass the configuration loading rules](#how-to-bypass-the-configuration-loading-rules)
* [How to do templating with the `OmegaConfigLoader`](#how-to-do-templating-with-the-omegaconfigloader)
* [How to load a data catalog with templating in code?](#how-to-load-a-data-catalog-with-templating-in-code)
* [How to use global variables with the `OmegaConfigLoader`](#how-to-use-global-variables-with-the-omegaconfigloader)
* [How to override configuration with runtime parameters with the `OmegaConfigLoader`](#how-to-override-configuration-with-runtime-parameters-with-the-omegaconfigloader)
* [How to use resolvers in the `OmegaConfigLoader`](#how-to-use-resolvers-in-the-omegaconfigloader)
@@ -133,6 +134,30 @@ Since both of the file names (`catalog.yml` and `catalog_globals.yml`) match the
#### Other configuration files
It's also possible to use variable interpolation in configuration files other than parameters and catalog, such as custom Spark or MLflow configuration. This works in the same way as variable interpolation in parameter files. You can still prefix the templated values with an underscore if you want, but it's not mandatory as it is for catalog files.

### How to load a data catalog with templating in code?
You can use the `OmegaConfigLoader` in code to load a data catalog that contains templating. The `OmegaConfigLoader` resolves any templates when it loads the catalog, so no further steps are required to load catalog entries properly.
```yaml
# Example catalog with templating
companies:
  type: ${_dataset_type}
  filepath: data/01_raw/companies.csv

_dataset_type: pandas.CSVDataset
```

```python
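from pathlib import Path

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

# Assumption: this code runs from the project root; otherwise point `project_path` at your project.
project_path = Path.cwd()

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_path = str(project_path / settings.CONF_SOURCE)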
conf_loader = OmegaConfigLoader(conf_source=conf_path)

conf_catalog = conf_loader["catalog"]
# conf_catalog["companies"]
# Will result in: {'type': 'pandas.CSVDataset', 'filepath': 'data/01_raw/companies.csv'}
```

### How to use global variables with the `OmegaConfigLoader`
From Kedro `0.18.13`, you can use variable interpolation in your configurations using "globals" with `OmegaConfigLoader`.
The benefit of using globals over regular variable interpolation is that the global variables are shared across different configuration types, such as catalog and parameters.
48 changes: 48 additions & 0 deletions docs/source/configuration/configuration_basics.md
@@ -16,6 +16,28 @@

`OmegaConfigLoader` can load `YAML` and `JSON` files. Acceptable file extensions are `.yml`, `.yaml`, and `.json`. By default, any configuration files used by the config loaders in Kedro are `.yml` files.

### `OmegaConf` vs. Kedro's `OmegaConfigLoader`
`OmegaConf` is a configuration management library in Python that allows you to manage hierarchical configurations. Kedro's `OmegaConfigLoader` uses `OmegaConf` for handling configurations.
This means that when you work with `OmegaConfigLoader` in Kedro, you are using the capabilities of `OmegaConf` without directly interacting with it.

`OmegaConfigLoader` in Kedro is designed to handle more complex configuration setups commonly used in Kedro projects. It automates the process of merging configuration files, such as those for catalogs, and accounts for different environments to make it convenient to manage configurations in a structured way.

When you need to load configurations manually, such as for exploration in a notebook, you have two options:
1. Use the `OmegaConfigLoader` class provided by Kedro.
2. Directly use the `OmegaConf` library.

Kedro's `OmegaConfigLoader` is designed to handle complex project environments. If your use case is straightforward and involves a single configuration file, it may be simpler to use `OmegaConf` directly.

```python
from omegaconf import OmegaConf

parameters = OmegaConf.load("/path/to/parameters.yml")
```

When your configuration files are complex and contain credentials or templating, Kedro's `OmegaConfigLoader` is more suitable. For more detail, see [How to load a data catalog with credentials in code?](#how-to-load-a-data-catalog-with-credentials-in-code) and [How to load a data catalog with templating in code?](advanced_configuration.md#how-to-load-a-data-catalog-with-templating-in-code).

In summary, both `OmegaConf` and Kedro's `OmegaConfigLoader` provide ways to manage configurations. Your choice depends on the complexity of your configuration and whether you are working within the Kedro framework.

## Configuration source
The configuration source folder is [`conf`](../get_started/kedro_concepts.md#conf) by default. We recommend that you keep all configuration files in the default `conf` folder of a Kedro project.

@@ -86,6 +108,7 @@
* [How to change the configuration source folder at runtime](#how-to-change-the-configuration-source-folder-at-runtime)
* [How to read configuration from a compressed file](#how-to-read-configuration-from-a-compressed-file)
* [How to access configuration in code](#how-to-access-configuration-in-code)
* [How to load a data catalog with credentials in code?](#how-to-load-a-data-catalog-with-credentials-in-code)
* [How to specify additional configuration environments](#how-to-specify-additional-configuration-environments)
* [How to change the default overriding environment](#how-to-change-the-default-overriding-environment)
* [How to use only one configuration environment](#how-to-use-only-one-configuration-environment)
@@ -159,6 +182,31 @@
conf_catalog = conf_loader["catalog"]
```

### How to load a data catalog with credentials in code?
```{note}
We do not recommend that you load and manipulate a data catalog directly in a Kedro node. Nodes are designed to be pure functions and thus should remain agnostic of I/O.
```

Assume your project contains a catalog file in the `base` environment and a credentials file in the `local` environment. You can use the `OmegaConfigLoader` to load both configurations and pass them to a `DataCatalog` object. The resulting catalog gives you access to the catalog entries with resolved credentials.

```python
from pathlib import Path

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog

# Assumption: this code runs from the project root; otherwise point `project_path` at your project.
project_path = Path.cwd()

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_path = str(project_path / settings.CONF_SOURCE)
conf_loader = OmegaConfigLoader(
    conf_source=conf_path, base_env="base", default_run_env="local"
)

# These lines show how to access the catalog and credentials configurations.
conf_catalog = conf_loader["catalog"]
conf_credentials = conf_loader["credentials"]

# Fetch the catalog with resolved credentials from the configuration.
catalog = DataCatalog.from_config(catalog=conf_catalog, credentials=conf_credentials)
```

### How to specify additional configuration environments
In addition to the two built-in `local` and `base` configuration environments, you can create your own. Your project loads `conf/base/` as the bottom-level configuration environment but allows you to overwrite it with any other environments that you create, such as `conf/server/` or `conf/test/`. To use additional configuration environments, run the following command:
