Decide new name for "reused pipelines with namespaces" #4016

astrojuanlu · 2024-07-17T09:55:50Z

Description

Come up with a name for

pipeline(
    base_data_science,
    namespace="ds_2",
    parameters={"params:model_options": "params:model_options_2"},
    inputs={"model_input_table"},
)

(taken from https://docs.kedro.org/en/latest/nodes_and_pipelines/namespaces.html#what-is-a-namespace)
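To make the naming discussion more concrete, here is a toy model of what the call above does to names. It is plain Python and not Kedro's implementation: `resolve_name` and the dataset names `X_train` and `regressor` are hypothetical, and the prefixing rules reflect my understanding of the documented behaviour (explicit mappings win, declared inputs stay as they are, everything else gets the namespace as a prefix).

```python
def resolve_name(name, namespace, keep=frozenset(), remap=None):
    """Toy model of how a dataset or parameter name changes under a namespace."""
    remap = remap or {}
    if name in remap:
        # An explicit mapping (e.g. the `parameters` argument) takes priority.
        return remap[name]
    if name in keep:
        # Names declared in `inputs`/`outputs` are left untouched.
        return name
    if name.startswith("params:"):
        # Assumption: unmapped parameters are prefixed after the "params:" marker.
        return f"params:{namespace}.{name[len('params:'):]}"
    # Every other dataset is prefixed with the namespace.
    return f"{namespace}.{name}"


# Hypothetical names used by the base pipeline.
names = ["model_input_table", "params:model_options", "X_train", "regressor"]

resolved = [
    resolve_name(
        n,
        "ds_2",
        keep={"model_input_table"},
        remap={"params:model_options": "params:model_options_2"},
    )
    for n in names
]
print(resolved)
```

Under these assumptions, only `model_input_table` is shared with the rest of the project; the intermediate datasets live under `ds_2.`.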

This is a child task of #2723. After coming up with the right name, we should adjust our documentation and training materials.

Context

People often refer to the above as "modular pipelines", even though we have already established in #2723 that this is an abuse of terminology. In fact, one can have pipelines with namespaces that aren't modular, and modular pipelines that don't use namespaces.

In #3948 we reworked the docs and we already got signals from users that they discovered namespaces thanks to it! https://linen-slack.kedro.org/t/22686809/this-is-exciting-that-s-all-https-docs-kedro-org-en-latest-n#c2f45225-8c17-4153-97b7-ae974a779c27

This shows the importance of properly naming things so that they can be described and taught properly.

Also notice that users can reuse pipelines without specifying namespaces, as demonstrated in the first code snippet of https://docs.kedro.org/en/latest/nodes_and_pipelines/namespaces.html

from kedro.pipeline import Pipeline, pipeline

def create_pipeline(**kwargs) -> Pipeline:
    return pipeline(
        existing_pipeline,  # The existing Pipeline object to reuse
        inputs={"old_input_df_name": "new_input_df_name"},  # Map an existing pipeline input to a new name
        outputs={"old_output_df_name": "new_output_df_name"},  # Map an existing pipeline output to a new name
        parameters={"params:model_options": "params:new_model_options"},  # Point to a different parameters entry
    )

The docs clarify that this alone is of limited use, though, because "In Kedro, you cannot run pipelines with the same node names". So it is this "pipeline inheritance" (?) combined with namespaces that enables actual pipeline reuse.
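A small sketch of why the node names matter, again as a plain-Python toy rather than Kedro code. The helpers `duplicate_node_names` and `with_namespace` and the node names are hypothetical; the only assumption carried over from the docs is that a namespace prefixes node names.

```python
from collections import Counter


def duplicate_node_names(*pipelines):
    """Return the node names that appear in more than one pipeline."""
    counts = Counter(name for nodes in pipelines for name in nodes)
    return sorted(name for name, count in counts.items() if count > 1)


def with_namespace(node_names, namespace):
    """Toy version of namespacing: prefix every node name."""
    return [f"{namespace}.{name}" for name in node_names]


# Hypothetical node names of a base pipeline.
base = ["split_data_node", "train_model_node"]

# Reuse with remapped datasets only: the node names are unchanged, so they clash.
clashes_without_namespace = duplicate_node_names(base, list(base))

# Reuse under two namespaces: every node name becomes unique.
clashes_with_namespace = duplicate_node_names(
    with_namespace(base, "ds_1"),
    with_namespace(base, "ds_2"),
)

print(clashes_without_namespace)
print(clashes_with_namespace)
```

In this model, remapping inputs and outputs changes what the copy reads and writes, but only the namespace changes what its nodes are called.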

Possible Implementation

"Nested pipelines" (inheritance + namespace = nesting, and it also makes it visually clear what happens)

Possible Alternatives

"Namespaced pipelines"
"Namespace pipelines"
"Sub pipelines"
"Reused pipelines"
Just "namespaces" (although I consider this an abuse of terminology too)
❓

DimedS · 2024-07-17T13:42:37Z

Thanks for the issue, @astrojuanlu. I agree that we should continue to clarify in our docs how to better use namespaces. For that, we should take into account what @idanov mentioned in the recent demo: namespaces are not only a way to reuse pipelines (as clarified in the recent docs update) but also a way to better structure pipelines. This aspect is currently not well covered, but it is valuable for users in terms of visualisation and deployment. So it's good that we currently have a dedicated page for namespaces. I believe the page name should focus on how namespaces help reuse and better structure pipelines.

yury-fedotov · 2024-07-19T01:24:53Z

This is just how I see it; everything below is my own mental model.

I like to think that there are 3 fundamental archetypes of Pipeline objects I'm creating. They are all instances of Pipeline, but they are obtained very differently:

1. Abstract templates. These are never registered themselves, never point to actual catalog entries, and only serve to be used with the pipeline() wrapper.
2. Namespaced instances of abstract templates. These are the results of applying the pipeline() wrapper to abstract templates. They are registered, and the whole idea is to reuse the abstract template but point it to actual catalog entries.
3. Nonmodular pipelines. A good example is this. These point to specific catalog entries right away (in the node definitions) and aren't intended to leverage namespaces. They operate at the root namespace of the project.

astrojuanlu added the Issue: Feature Request New feature or improvement to existing feature label Jul 17, 2024

astrojuanlu mentioned this issue Jul 17, 2024

Rectify "modular pipelines" terminology #2723

Open

github-actions bot mentioned this issue Aug 1, 2024

Monthly issue metrics report #4049

Open
