
TEP-0060: Propose remote resolution experimental controller #493

Merged: 1 commit merged into tektoncd:main on Aug 31, 2021

Conversation

@ghost commented Aug 4, 2021:

This commit updates the proposed remote resolution feature with a suggestion to
develop a controller in our experimental repo to try out some of the
Alternatives.

In order to support such an experimental controller two changes are also
proposed for Tekton Pipelines that would make such a controller much easier to
develop and test.

@tekton-robot tekton-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 4, 2021
@ghost (Author) commented Aug 4, 2021:

/kind tep

@tekton-robot tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Aug 4, 2021
@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 5, 2021
@pritidesai (Member):

Keeping this proposal open for a week, giving folks a chance to review if anyone is interested 🧑‍💻

@bobcatfish (Contributor):

/assign

@tekton-robot tekton-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 12, 2021
> In order to support an experimental controller independently performing
> resolution we could add an arg to Tekton Pipelines' controller (e.g.
> `--experimental-disable-in-tree-resolution`) that completely disables the
> built-in resolution logic.
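The gating behaviour the quoted flag proposes could look roughly like the sketch below. This is a hedged illustration only: the flag name comes from the quote, but the function shape and the `taskRef`/`pipelineRef` field checks are assumptions about how a reconciler might branch, not actual Tekton Pipelines code.

```python
# Hypothetical sketch: decide whether the Pipelines controller should resolve
# a run's ref itself, or leave it to an external experimental controller.

def should_resolve_in_tree(run_spec: dict, disable_in_tree_resolution: bool) -> bool:
    """Return True if the built-in controller should perform resolution."""
    if disable_in_tree_resolution:
        # --experimental-disable-in-tree-resolution is set: an external
        # controller owns resolution; the run waits until it embeds the spec.
        return False
    # Only runs that reference something by name or bundle need resolving;
    # fully inlined specs are already resolved.
    return "taskRef" in run_spec or "pipelineRef" in run_spec
```
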
Contributor:

I wonder if there's an alternative where we could leverage the "pending" state - e.g. if you want to use the experimental controller, you create "pending" PipelineRuns, then the experimental resolution controller can do its thing (embedding the spec) and remove the "pending" status

Author:

Yeah agree this is an option for PipelineRuns and I considered it. We don't have that Pending feature for TaskRuns though. Should we gate this work on that being decided on and delivered first?

Contributor:

ah right - yeah that's kind of a pain 🤔

At the least, I think the "deciding" part of adding "pending" for TaskRuns would be very brief: the PR that was accepted for the pending TEP was actually called "TEP-0015 - Add a pending setting to Tekton PipelineRun and TaskRuns" (#203) (more discussion in the PR that adds the feature, tektoncd/pipeline#3522 (comment)) - so since we accepted adding pending PipelineRuns, I'm not sure why we wouldn't accept pending TaskRuns.

I agree with you though: making this TEP depend on that happening would be putting yet another obstacle in its path, and I'd rather accept the lesser evil of an extra config option than make it take even longer to start experimenting with this.

(And as a side note, I've had this awful/wonderful idea recently that we'd be better off making the pending setting more like a semaphore - i.e. what if you have multiple controllers that need to sign off before executing a pending run? Maybe there's a better way than that, e.g. a list of explicit controllers you want to process a run before it executes? LONG STORY SHORT, I think you're making the right call.)
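The "pending as semaphore" aside could be sketched as below. This is purely speculative illustration of the idea, not a proposal from the TEP: both annotation keys and the comma-separated encoding are invented for the example.

```python
# Invented annotation keys for illustration only.
REQUIRED_KEY = "example.dev/required-signoffs"
DONE_KEY = "example.dev/completed-signoffs"

def is_still_pending(annotations: dict) -> bool:
    """True while any controller listed as required has not yet signed off."""
    required = {c for c in annotations.get(REQUIRED_KEY, "").split(",") if c}
    done = {c for c in annotations.get(DONE_KEY, "").split(",") if c}
    # The run stays pending until every required controller appears in done.
    return not required.issubset(done)
```

A run annotated with `required-signoffs: resolver,policy` would remain pending until both the resolver and the policy controller recorded their sign-off.
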

Author:

> (and as a side note, ive had this awful/wonderful idea recently that we'd be better off making the pending setting more like a semaphore - i.e. what if you have multiple controllers that need to sign off before executing a pending run? maybe there's a better way than that, e.g. a list of explicit controllers you want to process a run before it executes? LONG STORY SHORT i think you're making the right call)

YES. This is an excellent point. Very tricky edge case if a user wants both resolution and has their own pending processing logic.

Author:

(I've also updated the doc to include mention of this alternative approach)

> also other Tekton projects. Having the metadata copy performed by an external
> controller doesn't make a ton of sense in that case: metadata copying is a
> feature/requirement of Tekton Pipelines, not a general requirement for
> resolution.
Contributor:

Could you explain a bit more about this? I'm not totally following because:

  1. AFAIK the experimental resolution controllers know enough about PipelineRuns + TaskRuns to modify them directly, so in that sense they're not generic enough to use for anything else (maybe I'm missing something though)
  2. metadata like labels and annotations is pretty universal across k8s resources, so in that sense having some logic around this even in a very generic tool seems reasonable

Author:

The resolution controller I'm proposing for the experimental repo would aim to deliver the functionality that we've been discussing through TEP-0060, which has now largely boiled down to: (1) It should be a separate project (2) It should process the resources before Pipelines sees them and (3) it shouldn't be Pipelines-specific.

It's true that metadata is a universal feature of objects in k8s. But the behaviour where metadata is copied verbatim from Tasks to TaskRuns and from Pipelines to PipelineRuns is quite specific to Tekton Pipelines, I think? Let's say we open the system up in such a way that anyone can write a Resolver. Should we require their resolver to also perform this Pipelines-specific metadata copying action? I guess I'm not convinced that this is an action we want left to external resolvers to perform. Those labels / annotations can trip up the pipelines controller when they're missing, and they're part of our backwards-compat story I think.

Contributor:

> But the behaviour where metadata is copied verbatim from Tasks to TaskRuns and Pipelines to PipelineRuns is quite specifically Tekton Pipelines I think?

I wonder if one of the reasons we're copying around metadata like this is because we needed a better way to query PipelineRuns and TaskRuns - which maybe is the role that Tekton Results fills now? Might be something we want to consider for v1.

> I guess I'm not convinced that this is an action we want left to external resolvers to perform.

In that case I'm leaning more toward the approach where we embed the entire resolved yaml document (or as much of it as we know the controller needs) 🤔

Author:

Yeah, possibly the biggest reason for the metadata duplication is about querying the cluster for PipelineRuns associated to Pipelines (and TaskRuns to Tasks). I definitely agree that Results might be a better host for this kind of relational data.

Just to throw in an additional wrinkle (because I'm also generally in favor of embedding the entire yaml doc): if we go the route of embedding the entire resolved yaml, I imagine there will be some amount of time where we need to store both pipelineSpec and (hypothetical) pipelineYAML in a single PipelineRun's status for compatibility's sake. In turn I wonder if this would push PipelineRuns much more quickly toward etcd's size limits when very large Pipelines are involved. wdyt?

Author (@ghost, Aug 17, 2021):

Sorry, didn't quite finish my thought re: metadata duplication. While querying associations is one very big use case, I think there are probably others too, and it would behoove us to enumerate those and make sure they're accounted for before we decide to drop the copying behaviour.

Contributor:

> I imagine there will be some amount of time where we need to store both pipelineSpec and (hypothetical) pipelineYAML in single PipelineRuns' status for compatibility's sake.

Hmm, I'm trying to think through the scenario where we'd need to do this - all I can think of is if the controller is downgraded during PipelineRun execution?

The current controller expects to find a pipeline spec in the PipelineRun status - the updated version would expect to find the entire pipelineYAML but could also support the pipeline spec if it found it (we could wait until we're outside the compatibility window to remove that support) - but most cases wouldn't require actually storing both, I don't think?

Scenarios I can think of:

  1. upgrade from "pipeline spec" controller to new version:
    • completed PipelineRuns would contain "pipeline spec" in status; doesn't matter since they're done
    • in-progress PipelineRuns would be using "pipeline spec", which would be okay as long as the new version still supported it
    • new PipelineRuns would start using "pipelineYAML"
  2. downgrade from new version of the controller to "pipeline spec" controller:
    • completed PipelineRuns contain "pipelineYAML"; doesn't matter since they're done (possible that on controller startup it gets upset about unexpected fields?)
    • in-progress PipelineRuns would be in trouble - the old controller wouldn't know how to read "pipelineYAML" out of their specs
    • new PipelineRuns would use "pipeline spec"

Do we need to support the downgrade case in (2)? (specifically: support runs which execute during the update?)
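The compatibility window described in scenario (1) can be sketched as a simple fallback read. This is an illustrative assumption about how a newer controller might behave, not actual Tekton code; `pipelineYAML` is the hypothetical field from the discussion and `pipelineSpec` is the existing status field.

```python
# Sketch: a newer controller prefers the (hypothetical) pipelineYAML status
# field, but still understands the older pipelineSpec field so in-progress
# runs survive an upgrade.

def resolved_pipeline_spec(status: dict):
    doc = status.get("pipelineYAML")
    if doc is not None:
        # New format: the whole resolved document (metadata + spec) is stored.
        return doc.get("spec")
    # Legacy format written by the "pipeline spec" controller.
    return status.get("pipelineSpec")
```

The downgrade case in scenario (2) is exactly the branch this sketch cannot help with: an old controller has no code path for `pipelineYAML` at all.
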

Author:

I was specifically thinking of users who have written code that expects the pipelineSpec to be embedded in the status rather than scenarios where the system itself might be caught in an intermediate state. But it sounds like you don't think this is a situation we need to worry about?

Author:

OK, I've updated the proposal to pass the entire resolved yaml via a taskYaml field for taskruns and a pipelineYaml field for pipelineruns.

FYI @vdemeester and @dibyom - this is a change to the proposal so probably worth giving it one more look to confirm you're happy to keep approving.

Member:

Resolving both metadata and spec under a taskYAML or pipelineYAML field sounds good to me.
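Resolving metadata and spec together under one field could look like the sketch below. The `taskYAML`/`pipelineYAML` field names follow the discussion; the function shape and `kind` parameter are invented for illustration.

```python
# Sketch: an external resolver embeds the whole resolved document
# (metadata + spec) into the run's status under a single field.

def embed_resolved_document(run_status: dict, resolved: dict, kind: str) -> dict:
    field = "taskYAML" if kind == "TaskRun" else "pipelineYAML"
    run_status[field] = {
        "metadata": resolved.get("metadata", {}),
        "spec": resolved.get("spec", {}),
    }
    return run_status
```

With this shape, the reviewer's earlier correction follows naturally: the resolved spec lives at `status.taskYAML.spec` / `status.pipelineYAML.spec` rather than directly at `status.taskSpec`.
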

Contributor:

> users who have written code that expects the pipelineSpec to be embedded in the status

Ah, that's a good point - I was only thinking about this from the controller's perspective; I always forget about status being part of the API 😅

You're probably right, it's probably best to support both - maybe we can use (even more) config options to control whether we do that or not?

@bobcatfish (Contributor):

/assign vdemeester
/assign dibyom

@ghost (Author) commented Aug 17, 2021:

The refactoring I'd started in pipelines#4108 ended up being far too invasive to deliver in a single PR. I've instead added the new resolving reconcilers behind flags in a way that's similar to what's described by this proposal. This protects pipelines' users from being exposed to the new changes until we're ready to commit to them. It also clears the way for external processes (like the experimental controller proposed here) to perform remote resolution.

@jerop (Member) commented Aug 23, 2021:

needs re-approvals with the TEP changes, please take a look @tektoncd/core-maintainers

> When the proposed experimental resolution controller sees an unresolved
> `TaskRun` or `PipelineRun` it will read the resource and then resolve whatever
> ref, bundle or inline document is included. The resolved `spec` will then be
> embedded in the `status.taskSpec` field for `TaskRuns` or the
Member:

Should this be status.taskYAML.spec and status.pipelineYAML.spec?

Author:

Well spotted, thank you! Fixed and pushed.

> also other Tekton projects. Having the metadata copy performed by an external
> controller doesn't make a ton of sense in that case: metadata copying is a
> feature/requirement of Tekton Pipelines, not a general requirement for
> resolution.
Member:

Resolving both metadata and spec under a taskYAML or pipelineYAML field sounds good to me.

@bobcatfish (Contributor) left a comment:

Looking great to me, I really like your thorough incremental approach.

/approve


> There are two approaches that I've considered for solutions to this:
>
> 1. Change the contract such that the _entire_ resolved YAML document is passed
Contributor:

At the risk of making this too complicated and unreliable, I wonder if we should identify some exceptions to this right off the bat.

I was just looking at a Task in the dogfooding cluster and the metadata looks like this:
[screenshot: Task metadata from the dogfooding cluster]

I'm wondering if we want to identify a list of metadata fields and specific annotations we don't copy, e.g.:

  • the annotation kubectl.kubernetes.io/last-applied-configuration
  • managedFields

@_@ OR we start with only labels? we could pave the way for potentially embedding the entire thing by making the field able to store the entire yaml, but initially only store a certain subset that we know we need?
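The filtered copy suggested above can be sketched as follows. The two exclusions come straight from the discussion (`managedFields` and the `kubectl.kubernetes.io/last-applied-configuration` annotation); the function shape is an assumption, and the precise exclusion lists would be decided during the experimental phase.

```python
# Sketch: copy Task/Pipeline metadata onto its run, skipping fields that
# shouldn't travel (server-side apply bookkeeping, kubectl's applied config).

SKIPPED_FIELDS = {"managedFields"}
SKIPPED_ANNOTATIONS = {"kubectl.kubernetes.io/last-applied-configuration"}

def copy_metadata(source_meta: dict) -> dict:
    copied = {}
    for field, value in source_meta.items():
        if field in SKIPPED_FIELDS:
            continue
        if field == "annotations":
            value = {k: v for k, v in value.items()
                     if k not in SKIPPED_ANNOTATIONS}
        copied[field] = value
    return copied
```

The "start with only labels" variant would simply replace the loop with `{"labels": source_meta.get("labels", {})}`.
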

Author:

Updated with a note to decide precisely how we filter the copied metadata as part of this experimental phase.

Member:

Indeed, we should exclude storing managedFields and last-applied-configuration.

Author:

Updated with mention of managedFields in addition to last-applied-configuration.

@tekton-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bobcatfish, dibyom, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2021
@tekton-robot tekton-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 31, 2021
@bobcatfish (Contributor):

Looking at the working group notes, it looks like the only reason this wasn't LGTM-ed in the meeting was because it needed a rebase, so:

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Aug 31, 2021
@tekton-robot tekton-robot merged commit bb8b823 into tektoncd:main Aug 31, 2021