kedro-org/kedro#3094 lists a number of pain points experienced by users while deploying their Kedro projects to MLOps platforms. In kedro-airflow, each Kedro node is currently mapped to an Airflow task 1:1.
#241 added the --group-by-memory flag to make it possible to group nodes that share MemoryDatasets between them into one airflow task.
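To make the idea concrete, here is a minimal sketch of grouping nodes that are linked through in-memory datasets. It is not the implementation from #241: the `NODES` mapping, the `MEMORY_DATASETS` set and the `group_by_memory` function are hypothetical names chosen for illustration, and the pipeline is reduced to plain dictionaries instead of Kedro objects.

```python
from collections import defaultdict

# Hypothetical pipeline description: node name -> (inputs, outputs).
NODES = {
    "preprocess": (["raw"], ["clean"]),
    "featurize": (["clean"], ["features"]),
    "train": (["features"], ["model"]),
    "report": (["model"], ["metrics"]),
}
# Assumption: these datasets are not persisted, so they only live in memory.
MEMORY_DATASETS = {"clean", "metrics"}


def group_by_memory(nodes, memory_datasets):
    """Merge nodes that read or write the same in-memory dataset."""
    parent = {name: name for name in nodes}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Collect every node that touches each in-memory dataset.
    touching = defaultdict(list)
    for name, (inputs, outputs) in nodes.items():
        for dataset in list(inputs) + list(outputs):
            if dataset in memory_datasets:
                touching[dataset].append(name)

    # Nodes touching the same in-memory dataset end up in one group.
    for names in touching.values():
        for other in names[1:]:
            parent[find(other)] = find(names[0])

    groups = defaultdict(list)
    for name in nodes:
        groups[find(name)].append(name)
    return sorted(sorted(group) for group in groups.values())


print(group_by_memory(NODES, MEMORY_DATASETS))
```

Here `preprocess` and `featurize` would share one task because `clean` is never persisted, while `train` and `report` stay separate. The sketch does not check that the grouped graph remains acyclic, which a real implementation would need to handle.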
This ticket is to propose extending the grouping strategies offered by kedro-airflow.
There are some strategies we can consider:
Change the design of --group-by-memory to something like --grouping-strategy=<nodes/pipeline/memory> or --group-by=<> that takes a value. This will make it easy for us to add grouping strategies in the future, depending on what users actually want or need.
Gather user input on what grouping strategies would be useful
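One way an option that takes a value could stay easy to extend is a lookup table from the option value to a grouping function. The sketch below is only an illustration of that design: `STRATEGIES`, `by_nodes`, `by_pipeline` and `group_nodes` are hypothetical names, not part of kedro-airflow, and the CLI wiring itself is left out.

```python
# Hypothetical node names standing in for a Kedro pipeline.
NODES = ["load", "clean", "train", "report"]


def by_nodes(nodes):
    """One task per node (the current 1:1 mapping)."""
    return {name: [name] for name in nodes}


def by_pipeline(nodes):
    """A single task that runs every node of the pipeline."""
    return {"pipeline": list(nodes)}


# Hypothetical registry keyed by the value passed to the grouping option.
# Supporting a new strategy means adding one entry here.
STRATEGIES = {"nodes": by_nodes, "pipeline": by_pipeline}


def group_nodes(nodes, group_by="nodes"):
    """Return a mapping of task name -> node names for the chosen strategy."""
    try:
        strategy = STRATEGIES[group_by]
    except KeyError:
        raise ValueError(f"Unknown grouping strategy: {group_by!r}") from None
    return strategy(nodes)


print(group_nodes(NODES, "pipeline"))
```

With this shape, a `memory` strategy or any strategy requested by users later would be another function registered under its own key, without adding a new boolean flag each time.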
Grouping by non-persistent datasets is not the ideal way to determine how nodes are grouped into tasks. Generating DAGs based on the state of catalog.yml has some downsides:
The catalog at the time of creating DAGs is not necessarily the catalog used during deployment.
Whether a dataset is a MemoryDataset can also differ depending on which configuration environment is in use.
The recommended grouping strategy should be namespaces, but they are not widely adopted.
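As a rough illustration of why namespaces avoid the catalog problem, the sketch below groups nodes using only information attached to the nodes themselves. The `(name, namespace)` tuples and the `group_by_namespace` function are hypothetical stand-ins for Kedro nodes, and leaving nodes without a namespace as individual tasks is an assumption, not a decided behaviour.

```python
from collections import defaultdict

# Hypothetical node records: (node name, namespace or None).
NODES = [
    ("load", "ingest"),
    ("clean", "ingest"),
    ("train", "modelling"),
    ("report", None),
]


def group_by_namespace(nodes):
    """Group nodes by namespace; nodes without one keep their own task."""
    groups = defaultdict(list)
    for name, namespace in nodes:
        # Assumption: a node with no namespace becomes a task named after itself.
        key = namespace if namespace else name
        groups[key].append(name)
    return dict(groups)


print(group_by_namespace(NODES))
```

Because the grouping reads nothing from catalog.yml, the same DAG would be produced regardless of the catalog or configuration environment used at generation time.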
Follow up actions:
Release the feature introduced in #241 (feat: kedro-airflow group in memory nodes) as it is now. If we decide on a different way forward, we can always make a breaking change to kedro-airflow.
Investigate the adoption of namespaces and how to simplify their usage.