Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TEP-0023: Implicit Parameter and Resource Mapping in the Pipeline specification #222

Closed
wants to merge 1 commit into from

Conversation

Peaorl
Copy link
Contributor

@Peaorl Peaorl commented Oct 1, 2020

TEP-0023 introduces implicit parameter and resource mapping in the Pipeline specification.
As per requests #3050 and #1484.

Please provided feedback on the proposal.

The algorithms in the Design section may need some tweaking as I implement it in the code.

Also see the [DISCUSSION] section.
I think it would be nice to support omitting the from parameter for PipelineTask input resources too.

@tekton-robot tekton-robot requested review from PuneetPunamiya and a user October 1, 2020 18:35
@tekton-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign sthaha
You can assign the PR to them by writing /assign @sthaha in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Oct 1, 2020
@Peaorl
Copy link
Contributor Author

Peaorl commented Oct 1, 2020

/cc @afrittoli

runAfter: [cleanup]
taskRef:
name: test-results
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice visualization of the change.


Allow for an implicit specification of a `Pipeline` wherein parameters and resources for a `PipelineTask` can be omitted.

<!-- #### User Story
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the User Story could be:

As a Pipeline author I want the parameters and resources in a Pipeline Spec to be automatically handed to Tasks that share the parameter or resource names so that my Pipelines are easier to consume and it becomes harder to make mistakes editing the YAML in future.

|1| A `Pipeline` author can omit the explicit specification of parameters and resources for some, or any, `Task` in the `Pipeline.spec.tasks` field if:<br><br> <ul><li> The pipeline is annotated for implicit mapping <i><br>AND<br></i> <li> The omitted parameters and resources can be resolved according to Req. 2 </li></ul>|
|2| Tekton tries to resolve omitted parameters and resources for a pipeline `Task` in the following order: <br><br> <ol><li> If any preceding `Task` has a `Task` result whose name matches an implicit parameter name of the `Task`, resolve the implicit parameter value with the last produced `Task` result that matches. <li> If a parameter or resource name in the `Pipeline.spec` field matches a `Task's` implicit parameter or resource name, resolve the parameter and/or resource values. </li> <li> Tekton produces an error if parameters or resources can't be matched. <br> *\*Explicitly specified params and resources always take precedence.*

[DISCUSSION]: Additionally, omitting input resources that use the output resource of a previous task by using the [`from`](https://github.com/tektoncd/pipeline/blob/master/docs/pipelines.md#using-the-from-parameter) parameter could be supported.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add even more possible features... we could implicitly map named workspaces in a Pipeline to named workspaces in Tasks:

# PipelineSpec
spec:
  workspaces:
  - name: source
  - name: output
  tasks:
  - taskRef: name: lint
    workspaces:
    - name: source
      workspace: source
  - taskRef: name: unit-test
    workspaces:
    - name: source
      workspace: source
  - taskRef: name: build
    workspaces:
    - name: source
      workspace: source
    - name: output
      workspace: output
  # below just shows one without a matching workspace name for reference
  - taskRef: name: integration-test
    workspaces:
    - name: binary-to-test
      workspace: output

Would become...

# PipelineSpec
spec:
  workspaces:
  - name: source
  - name: output
  tasks:
  - taskRef:
      name: lint
  - taskRef:
      name: unit-test
  - taskRef:
      name: build
  - taskRef:
      name: integration-test
    workspaces:
    - name: binary-to-test
      workspace: output

Copy link
Member

@vdemeester vdemeester left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few questions:

  • Any reason for supporting PipelineResources in this proposal ? We aren't decide yet what we should do with them, skipping them initially would reduce the scope of work, especially if this works ends up being removed (if we remove PipelineResources as we know them now).
  • Is it a on or off feature ? aka can I pass some parameters using the params: field and except the rest to be passed implicitly ?
  • How do we handle same argument "name" with different types ?
  • How will apply handle this ? Let's assume I have an "implicit" pipeline. I create it and it get mutated to be explicit (so more parameters, …). I change some pipeline parameter and re-apply it. apply will merge the two (be default), and thus will append PipelineTask parameter to the one that got added by mutation — and will likely fail because some parameters are not given (if the do not have a default value).
  • Would this be enable through a feature-flag ?

I am all for ease the pain of users when writing pipeline and tasks. But I wonder where should this be tackled still — I don't have strong opinion on this though. The pipeline type definitions have been very explicit by design, and I wonder if it's a good or bad thing if we change that.
As of today, the proposed behaviour could be done by an external component – an external mutating webhook, a tool (generator, cli, ui, …), a higher level construct (https://github.com/shipwright-io/ at some level, …).
I am not against exploring this though, but I would rather wait to do this after the v1 API — and/or have this behaviour opt-in (aka the user have to flip a switch in the feature flag configmap).

Comment on lines 116 to 119
### Notes

- The VSCode Tekton plugin need not be updated to support implicit `Pipeline` specifications
- Potentially, the Tekton Dashboard needs to be updated to support implicit `Pipeline` specifications
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any consumer need to support this indeed, the one we "know" (tkn, dashboard, triggers, vscode tekton plugin, dashboard, openshift console, …) and the one we don't even know 🙃

Comment on lines 122 to 127
#### Common parameter and resource names
Certain parameter or resource names may be relatively common and therefore carry a different meaning between different `Tasks`.
`Tasks` with the same parameter and resource names that require different input could therefore inadvertently acquire the wrong value.
We consider it the `Pipeline` author's responsibility to guard against these errors.

#### Non-standardized parameter and resource names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are big caveats imo.


#### Non-standardized parameter and resource names
Since parameter and resource names are not standardized, it may occur that a `Pipeline` parameter or resource should be passed to differently named parameters and resources across `Tasks`.
To combat this, we could propose an alias for `Pipeline` parameters and resources in order to match parameters and resources across multiple `Tasks`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We would need more detail on this aliasing of parameters. Would this be another field in the API ? Or a field on the ParamSpec ? Is this expected that users would map the param they know are different ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add this, it would make sense to add it in the Pipeline's spec.params field and thus altering the ParamSpec. Modifying the ParamSpec would mean that both Pipeline and Task authors can specify aliases for parameters.

To restrict this feature to Pipeline authors, who'd have to know the different parameter names, we would have to separate the ParamSpec type for Pipeline and Task types.
Alternatively, we could avoid separating the ParamSpec if the webhook rejects Tasks with alias fields.

Comment on lines 133 to 134
- Tekton's validating/mutating webhook will become somewhat slower with implicit mapping because it will perform `Pipeline` specification transformations.
Nevertheless, the slowdown is expected to be negligible.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be slower because it would need to resolve the task reference before being able to validate.
Resolving means querying the kubernetes API or (when tekton bundle is in) pull an OCI image (if we use bundle in referencing the task(s)).

I definitely do not expect those slowdown to be negligible, or at least, I only have my intuiton that it might or might not be nigligible. See here for potential "slowdown" when getting the task definition (using bundles).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If extracting data for the task requires an OCI image to be pulled, it will require non-negligible amounts of time and will fail for reasons outside of Tekton's control.

Comment on lines 635 to 639
- If a *taskRef* is used, the empty *taskSpec* field could be populated with the fetched *taskRef*.
Instead of iterating over an implicit parameter or resource hashmap/array, Tekton can iterate over the parameters and resources in the *taskSpec*.
Additionally, since the webhook has updated the *taskSpec* field, the reconciler would not need to fetch the *taskRef* on every reconcile loop.
This could be considered an attractive option that would require a separate PR.
Partially because this can happen without implicit mapping as well.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would most likely be a problem on the validation side (at least compared to the current behavior). Applying the same pipeline definition (kubectl apply -f mypipeline.yaml) will result in an error because both taskRef and taskSpec are specified.

  • If we remove that validation, which one of taskSpec or taskRef has precedence ?
  • Do we overidde the taskSpec with the content of the Task referenced ? But then, the object is not what the user sent, and will always be updated on apply.
  • Do we take taskSpec and ignore taskRef ? But then, if the Task referenced changed, it will be ignored.

Partially because this can happen without implicit mapping as well.
- The mutating webhook could validate an implicit `Pipeline` specification without altering the `Pipeline` spec.
This would still require resolving parameters and resources which would additionally need to be performed by the `PipelineRun` reconciler again.
Considering the amount of duplicate work performed by Tekton this is not considered an attractive option.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean ? What "duplication" are we talking about ? (in code, in execution, …)

Small quote, sometimes, duplication is better than the wrong abstraction 😛


## Drawbacks

- Implicit `Pipeline` specifications could be more error prone
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think drawbacks need more details.
In addition, the potential slowdown in the webhook should be considered as a drawback, imo.

@vdemeester vdemeester added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Oct 6, 2020
@bobcatfish
Copy link
Contributor

Spoke with @Peaorl today and this is on hold for now. Some feedback came up in API working group to address; closing this in the meantime!

@wlynch
Copy link
Member

wlynch commented Mar 19, 2021

I'm curious about exploring this proposal again - I ended up coming to a similar idea this week.

I think we may want to scope this to just parameters for the time being - unlike PipelineResources and Workspaces, parameters have a useful property where unused definitions can't affect the behavior of the Task/Pipeline.

Instead of trying to match parameters, what if we overlay them and created a merged view (e.g. don't error if there isn't a 1:1 mapping - let extra params pass through)? As in this proposal, innermost parameter definitions take precedence if there are params with the same name (i.e. plumb through anything that is missing, else use the existing values/behavior). Only embedded resources should be able to use this - you shouldn't be able to create a standalone Task in the catalog / OCI registry that relies on implicit parameters, since that would be ambiguous.

Example

User Input:

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: hello-
spec:
  params:
    - name: message
      value: "hello world!"
    - name: a
      value: asdf
  taskSpec:
    # There are no explicit params defined here. They are derived from the TaskRun above.
    steps:
    - name: default
      image: ubuntu
      script: |
        echo $(params.message)

Post-Webhook Resolution (i.e. what the controller sees):

spec:
  params:
  - name: message
    value: hello world!
  - name: a
    value: asdf
  serviceAccountName: default
  taskSpec:
    params:
    - name: message
      type: string
    # It doesn't matter that there is an extra param here -
    # this won't be replaced / resolved to anything in the steps.
    - name: a
      type: string
    steps:
    - image: ubuntu
      name: default
      resources: {}
      script: |
        echo $(params.message)

To the pipeline controller this would still look like the explicitly defined, typed schema it has always been using, but users can take advantage the shortcut to help make authoring a bit more streamlined.

I think this would address some of the ambiguity concerns @vdemeester raised, and can be done in a way that does not change the behavior of the existing API and configs.

@wlynch wlynch mentioned this pull request Jun 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants