TEP-0060: Update proposal and mark Remote Resolution implementable #547

ghost · 2021-10-21T18:15:22Z

Remote resolution is currently in proposed state while work's been
going on to try and validate some of the alternative approaches being
put forward.

This commit moves TEP-0060 to implementable, updates the proposed
approach and adds notes about the experimental controller built to
validate ideas from the TEP.

There is an accompanying PR
in experimental with a proof-of-concept controller implementing a
couple of the alternatives presented here.

ghost · 2021-10-21T18:27:29Z

/kind tep

teps/0060-remote-resource-resolution.md

bobcatfish · 2021-10-25T16:15:53Z

/assign @afrittoli
/assign @vdemeester
/assign @bobcatfish
/assign @wlynch

bobcatfish

I really like the new syntax and the proposal overall! The best part of it imo is that if we are in agreement about the syntax, we can get this into the hands of users to get feedback - and still change the implementation (e.g. maybe we decide CRD based isn't the way to go) without being disruptive

i'm sure others will have thoughts but for my part 💯 :

/approve

bobcatfish · 2021-10-26T17:52:50Z

teps/0060-remote-resource-resolution.md

+
+Add new fields to `taskRef` and `pipelineRef` to set the resolver and
+its parameters. Access to these new fields will be locked behind the
+`alpha` feature gate.


nice, i like the new syntax!

Nice!

Storage of task and pipeline definitions could be considered as a runtime concern - a TaskRun or a PipelineRun would ask to execute the definition specified by a certain Ref.

When it comes to a Pipeline though, the definition of the Task used by the Pipeline can be considered an author time concern, and that is not well defined unless we specify a taskRef. Today the taskRef is only resolvable (unless bundles are used) - when a Pipeline is installed on a cluster - which caused a lot of debate when defining how to store Pipelines in the catalog.

Having tasks stored in git or container registries makes the definition available independently of the running cluster, and thus Pipeline definitions more self-consistent.
It may reduce Pipeline reusability, however I would argue that if someone wants to run a Pipeline from the catalog using same Tasks from a different source, that would actually be a different different pipeline, as we have no way today to identify a Task as itself globally, independent of the storage location.

We could still over time add the ability to override the taskRef resolver at runtime?

@afrittoli yeah I agree - I don't think anything we're proposing here gets in the way of us implementing a feature like taskRef rewriting down the road. Designing and implementing this in a Resolver first may help us figure out the edge cases before migrating that logic into Pipelines itself.

I've been thinking some more about this, and I think there is an issue with this syntax unless we introduce a concept of bundle and bundle resolution.

Let's consider a repo that contains contains a Pipeline that users some Tasks, defined alongside the Pipeline.
If we want to use SHAs (as opposed to tags) to references to resources, it's impossible to release / publish the Pipeline alongside the Tasks in a consistent version, since the Pipeline definition must include the version of the Tasks that are being published, and that version (in SHA format) is only known after the publishing.

This is true when the target is a container registry but it could also apply to git. If we wanted to create a git tag, we would have to make a PR to update the pipeline definition.

Possible solutions:

always release tasks and pipelines independently - however that would make the releasing workflow much more complex

always define the task spec embedded into the Pipeline, however we might not want to do that since we cannot define the same metadata with embedded specs and we might want to share the Task with other pipelines

The other option is to have a concept of bundle: if a Pipeline from a remote resource includes Tasks from the same resource, it may refer to them just by name, and the controller would look them up in the same "bundle" i.e. from the same source / version.

We might even consider decoupling the concept of "bundle" from the "container registry" storage, so that we could have bundles with different backends, such as git, object storage or else

Yep, that makes total sense! What's the process from here? Do you want to propose a new TEP to start the ball rolling on standardizing that approach?

It would also be possible for a resolver to "figure out" how to rewrite a Pipeline's YAML based on similar rules to the ones you described. E.g. consider the following pipelineRef submitted in a PipelineRun:

pipelineRef: resolver: git resource: repo: github.com/sbwsg/stuff.git path: /foo/bar/pipeline.yaml commit: abc123

The git Resolver clones the repo, checks out the commit, and reads /foo/bar/pipeline.yaml. Then it "transforms" the PipelineTask taskRefs, rewriting any that match the same repo. So this is the pipeline yaml before the transform:

kind: Pipeline # ... metadata etc ... spec: tasks: - name: get-code taskRef: resolver: git resource: repo: github.com/sbwsg/stuff.git path: /foo/baz/git-clone.yaml

And this is the pipeline yaml after the transform:

kind: Pipeline # ... metadata etc ... spec: tasks: - name: get-code taskRef: resolver: git resource: repo: github.com/sbwsg/stuff.git path: /foo/baz/git-clone.yaml commit: abc123 # <-- This has been copied from the pipelineRef to the taskRef by the git resolver

And finally the resolver returns this transformed YAML.

These kinds of features and rules can be implemented per-Resolver until Tekton Pipelines stabilizes on a common syntax for non-OCI bundles per your suggestion.

+1

We don't have to make it a mandatory feature of a resolver, but we should at least define how a resolver that supports "bundles" should behave.

bobcatfish · 2021-10-26T18:01:09Z

teps/0060-remote-resource-resolution.md

+  `TaskRun` that issued this request. If the runs are deleted, their
+  requests are cleaned up. In future this may also be useful for
+  de-duplicating requests in the same namespace for the same remote
+  Tekton resource.


bobcatfish · 2021-10-26T18:06:51Z

teps/0060-remote-resource-resolution.md

+repositories. This is new functionality that is not supported by Tekton
+Pipelines currently. In addition to basic support for fetching files
+from git this resolver will also need to support quite rich
+configuration (which can be developed over time):


👍 makes sense! i guess we can choose whether it makes sense to provide this in the request, or configure this on the resolver itself (or maybe both, with one overriding the other)

bobcatfish · 2021-10-26T18:10:52Z

teps/0060-remote-resource-resolution.md

- How will it be tested in isolation vs with other components?
+Eventually the resolution project may reach a point of maturity that
+Tekton Pipelines opts to use it for all `taskRef` and `pipelineRef`
+resolution. At this point Pipelines will need additional test coverage


at this point i'm thinking we would also start packaging some or all of the resolvers by default with Tekton Pipelines? pointing this out b/c i couldnt find a section of the TEP that mentioned this, might be worth adding somewhere (np if it's already here and i missed it haha)

Good call, I've updated with an explicit mention of bundling resolvers with pipelines if the project reaches an agreed level of maturity. This is in the Design Details section.

bobcatfish · 2021-10-26T18:11:36Z

teps/0060-remote-resource-resolution.md

+- Correctly translating all `taskRefs` and `pipelineRefs` to
+  `ResourceRequests`.
+- End-to-end behaviour of all the resolvers supported by Pipelines "out
+  of the box".


maybe it doesnt need to be mentioned but using this in our dogfooding would be a great way of testing it as well - and i think we could really use it to reduce some of our current duplication

Cheers! I've added a call out to dogfooding in the Test Plan.

bobcatfish · 2021-10-26T18:13:47Z

teps/0060-remote-resource-resolution.md

@@ -887,6 +1127,7 @@ Pros:

 Cons:
 - We'd still need to decide precisely what the protocol for resolution is.
+- `PipelineRunPending` is documented for users / third-party tools, not intended for Pipelines' internal use-cases.


another potential con is that someone who is already using this status for some other purpose wouldn't be able to also use the feature

but i think a mitigation of that (maybe something we might want to do anyway?) would be to make PipelineRunPending more like a semaphore - i.e. make it a count so you could set it to "2" to use it for both remote resolution and some other purpose; both controllers would decrement it after they've done whatever they need to do. maybe that's crazy tho haha

Agreed, I've added this as an additional con with the assumption that any decisions we make vis-a-vis synchronization primitives like semaphores would happen in a dedicated TEP.

wlynch

Super excited about this! :D

wlynch · 2021-10-26T17:31:23Z

teps/0060-remote-resource-resolution.md

+  the Pipelines codebase.
+- New approaches to resolution and resource handling can be built,
+  tested and productionized without burdening the Tekton Pipelines
+  project directly.

 ## Drawbacks


A drawback I noticed with custom tasks that also applies here is that RBAC doesn't have good support for resource subtypes. e.g. there's no way to restrict a Git Resource resolver to only have access to git types - we're relying on implementors to do the right thing and only respond to ResourceRequests of their corresponding types.

I don't think this is necessarily blocking - this is a drawback we see all over the k8s ecosystem (e.g. the Tekton Pipeline controller has full Pod management capabilities - users trust that we only ever operate on Tekton managed pods), but I wish we could do better. :(

Good point, I've added this as an additional drawback. I think there might be some interesting ways to help mitigate this problem but nothing springs to mind at the moment which would allow finer-grained RBAC to manage precisely which resolvers have access to specific types of ResourceRequest.

I wonder if some sort of duck typing could be used here 🤔

(e.g. the controller creates un/semi-structured requests based on a common schema, but these get resolved to individual apiVersion+kinds)

wlynch · 2021-10-26T17:41:24Z

teps/0060-remote-resource-resolution.md

+    type: Succeeded
+```
+
+### Resolver specifics


How are we expecting auth to be handled for ResourceRequests? My assumption is that the Resolver will have its own secrets and use those to interact with external sources, but how do we know whether a given namespace / service account / user / etc is allowed to have access to those Resources?

This will depend a bit on the requirements of each resolver. Taking Git as an example, during initial development my plan is to support configuration via a config file mounted e.g. from a ConfigMap.

The config for git resolution will allow an operator to specify global and per-repository settings that can limit (a) which repos are accessible at all ¹ and (b) which repos are allowed from specific namespaces. Here's a bit of hand-waved config to center the discussion:

allowed-repositories: - name: tekton-catalog repository_url: https://github.com/tektoncd/catalog.git - name: app-team-1-private-repo repository_url: [email protected]:app-team-1 allowed-namespaces: - app-team-1

In this example config the operator has cut off access to all repos except for the public tekton catalog and an app team's private repo. Access to the app team's private repo is limited to ResourceRequest objects created in the app-team-1 namespace. A git resolver can enforce this by virtue of the fact that ResourceRequests are namespace-scoped.

Looking a bit further forward, as part of the Workflows WG we've been discussing a Repository CRD object that could expose some or all of this configuration for use across multiple applications, not just by a git resolver but also, for example, by a controller polling a repo for changes).

Footnotes

While the scope of this discussion is around repo access ideally the available config for git resolution is going to be more granular than that. E.g. let operators disallow resources pulled from any but a limited set of subdirectories in a repo, exact file paths within a repo, pulled from any but specific commit digests, or only from signed commits (with a specific set of authors' public keys). ↩

SGTM - I'd like to dig into the config syntax a bit more (I think we would benefit by leveraging a similar syntax as https://git-scm.com/docs/gitcredentials#_credential_contexts), but happy to punt this to another discussion to keep the TEP moving.

wlynch · 2021-10-26T17:43:46Z

teps/0060-remote-resource-resolution.md

+`apply`ing or `delete`ing them.
+
+A resolver observes `ResourceRequest` objects, filtering on the
+`resolution.tekton.dev/resolver` label to find only those it is


How will users know what resolvers are available to them?
Would it be worth having a simple ResourceResolver CRD that exposes metadata of the supported resolver types on the cluster?

I like this idea a lot. It reminds me of the use of ClusterInterceptor in Triggers.

I am wondering if we can sneak up on this requirement during alpha rather than introduce an additional CRD in this TEP though. For example: what if we added a specific label or annotation on resolver deployments in the tekton-remote-resolution namespace? This way an interested party (like the Pipelines controller) who has read access to tekton-remote-resolution can query for the available resolvers but we don't need to pay the cost of specifying an additional custom resource type until we know it's useful and necessary?

wlynch · 2021-10-26T18:28:15Z

teps/0060-remote-resource-resolution.md

+The parameters for a "git" `ResourceRequest` will initially look as
+follows:


I'd highly recommend adding provider data into the params. It might even be easier to recommend separate github, gitlab, bitbucket, etc resolvers instead of a centralized git resolver (at least to start).

Defaulting to provider specific resolvers has some benefits:

Allows you to target provider specific file fetching methods (e.g. GitHub, GitLab) - this avoids needing to clone the repo entirely.

Makes self-hosted instances easier to identify and work with - i.e. how do you know what provider / auth method to use if the URL is git.example.com? Even if you're planning to use standard git protocols, small details like the username used differ per-provider.

Generally seems easier to handle provider edge cases (e.g. if users need to provide GitHub App specific identifiers)

I think there are several distinct ideas here that are worth separating out for discussion purposes:

Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).

Create provider-specific resolvers.

Allow users to pick which provider is used in their taskRef or pipelineRef.

Let's look at each of these individually:

Support provider-specific optimizations

I agree with this one but am not sure it necessarily implies per-provider resolvers or exposing the provider as user-facing config (vs operator- or admin-facing).

Users should specify the provider in their taskRef or pipelineRef

My assumption here is that the git repo provider is likely to be extremely static information - I don't think a repository is going to hop from being a github repo to being a gitlab repo to being a gerrit repo within a short amount of time. Given the static nature of the information I think allowing users to specify the provider of a given git repo with every taskrun or pipelinerun would end up being redundant.

In addition to that I'm not totally convinced that this is configuration you want to put in the hands of end-users. Operators or administrators, definitely, but not necessarily end-users. That's not to discount the idea entirely - just that I think we should figure out some use cases for making this configurable by "everybody".

So instead what I'd suggest is that we make this operator-level configuration supplied to the git resolver:

allowed-repositories: - name: tekton-catalog repository_url: https://github.com/tektoncd/catalog.git provider: github - name: app-team-1-private-repo repository_url: [email protected]:app-team-1 provider: github-enterprise

By default the git resolver can treat every git repo the same. When a repository is attributed to a specific provider though there can be code paths in the git resolver that can use provider-specific optimizations to fetch a resource.

But! We might not have code paths for every git repo provider under the sun which leads on to...

Provider-Specific Resolvers

I think one of the nice things about the syntax / design as proposed is that we can provide a general purpose git resolver out of the box with optimizations for the most popular providers. Someone else can in turn implement resolvers for their own niche-git-provider which use specific optimizations.

Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).

Note that using APIs instead of git directly (GitHub API) has its set of limitation too (rate limits, …). Also are we making the assumption that you need to do a git clone with all path and all history to get a specific commit and path from a git repository ? Because I don't think this assumption is completely valid, there is way to "clone" only the thing you are interested into (without grabing the full history of the repo nor the full content of it at a given point)

Right, I think this is also a fair stance to take! A lot of this is going to have to shake out at implementation time I think.

In terms of the TEP I think we can look at this from the perspectives of different user profiles:

Users who submit TaskRuns and PipelineRuns shouldn't have to worry about any of the nuanced detail of interacting with the repo.

Operators maybe care about some of the specifics so they can configure those things once.

Developers of git Resolvers have to really care about it and figure it all out in code, test cases and abstractions.

afrittoli · 2021-10-25T13:00:57Z

teps/0060-remote-resource-resolution.md

-### Non-Goals
+- Integrate mechanisms to verify remote tasks and pipelines
+  before they are executed by Pipelines' reconcilers. See
+  [TEP-0091](https://github.com/tektoncd/community/pull/537) on this.


teps/0060-remote-resource-resolution.md

afrittoli · 2021-11-01T15:53:21Z

teps/0060-remote-resource-resolution.md

+
+Add new fields to `taskRef` and `pipelineRef` to set the resolver and
+its parameters. Access to these new fields will be locked behind the
+`alpha` feature gate.


Nice!

Storage of task and pipeline definitions could be considered as a runtime concern - a TaskRun or a PipelineRun would ask to execute the definition specified by a certain Ref.

When it comes to a Pipeline though, the definition of the Task used by the Pipeline can be considered an author time concern, and that is not well defined unless we specify a taskRef. Today the taskRef is only resolvable (unless bundles are used) - when a Pipeline is installed on a cluster - which caused a lot of debate when defining how to store Pipelines in the catalog.

Having tasks stored in git or container registries makes the definition available independently of the running cluster, and thus Pipeline definitions more self-consistent.
It may reduce Pipeline reusability, however I would argue that if someone wants to run a Pipeline from the catalog using same Tasks from a different source, that would actually be a different different pipeline, as we have no way today to identify a Task as itself globally, independent of the storage location.

afrittoli · 2021-11-01T15:54:04Z

teps/0060-remote-resource-resolution.md

+
+Add new fields to `taskRef` and `pipelineRef` to set the resolver and
+its parameters. Access to these new fields will be locked behind the
+`alpha` feature gate.


We could still over time add the ability to override the taskRef resolver at runtime?

pritidesai · 2021-11-01T16:06:49Z

in progress, responding to feedback

wlynch · 2021-11-03T15:30:03Z

teps/0060-remote-resource-resolution.md

+    type: Succeeded
+```
+
+### Resolver specifics


SGTM - I'd like to dig into the config syntax a bit more (I think we would benefit by leveraging a similar syntax as https://git-scm.com/docs/gitcredentials#_credential_contexts), but happy to punt this to another discussion to keep the TEP moving.

wlynch · 2021-11-03T15:51:22Z

teps/0060-remote-resource-resolution.md

+  the Pipelines codebase.
+- New approaches to resolution and resource handling can be built,
+  tested and productionized without burdening the Tekton Pipelines
+  project directly.

 ## Drawbacks


I wonder if some sort of duck typing could be used here 🤔

(e.g. the controller creates un/semi-structured requests based on a common schema, but these get resolved to individual apiVersion+kinds)

wlynch · 2021-11-03T15:52:14Z

teps/0060-remote-resource-resolution.md

+`apply`ing or `delete`ing them.
+
+A resolver observes `ResourceRequest` objects, filtering on the
+`resolution.tekton.dev/resolver` label to find only those it is


afrittoli · 2021-11-15T15:35:01Z

teps/0060-remote-resource-resolution.md

+machine-readable `reason` and human-readable `message` explaining the
+failure.
+
+### 3. Create a new Tekton Resolution project


Is this now going to be a non-experimental project? If so, is the intention to keep this out of tektoncd/pipeline longterm?
I think it's fine to experiment with this a bit, but eventually I would see this living in tektoncd/pipeline directly.

Ah - good catch, this is really intended to be a production-ready project, definitely not experimental anymore, with its own project in the tektoncd org. I'll update the TEP with more detail, cheers!

I'm a bit unsure how to proceed wrt this living in tektoncd/pipeline. Would it be possible to clarify in more precise terms what you want to live in the Pipelines project? The reason I request this is that previous review cycles on this TEP (and in discussions before it was created) have really pushed back on that idea a lot, so I've intentionally directed the design away from it over the past six months. Here are a number of instances of feedback I am thinking about:

"To that end, perhaps the one missing requirement and use case from the provided list is that the remote resolution process should no longer be embedded into Tekton. Instead, the goal is to offer Tekton operators and by extension users, a pluggable framework for resolving tasks and pipelines with two reference implementations, local and Tekton Bundles."

"...in triggers for example, we refer to a set of interceptor. The mechanism used in tektoncd/pipeline could be the same/re-used in triggers."

A very detailed comment from @vdemeester about trade-offs and benefits of implementing as a new Tekton component.

The conversation thread in pipelines issue 2298 (Add support for referencing Tasks in git) that originally started a lot of this design work. The core of the question related to "where" this logic should live starts with this comment I think.

I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

I've updated the TEP with the production-readiness note about this new project.

I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

My usual answer here is : if we can do it in a separate project let's do it, and see in the future if it make sense to bake it into pipeline. The more tektoncd/pipeline mature / gets adopted, the more causious we have to be into getting things in 😅

I also tend to like the idea of keeping the responsibilities separated along project boundaries here, for a couple of reasons:

Faster iteration times bootstrapping a new project vs dumping giant PRs on the Pipelines team.

Reduce the chance of affecting other parts of Pipeline's codebase.

The whole "yes is forever" thing. We can always revisit this decision.

(edit): One of the goals is also to let this project be used by other projects. Baking into Pipelines doesn't make that goal easier I don't think.

I'll raise this in the API WG when I'm back from holiday (Nov 29). I am hopeful we can figure out how to proceed on this one with just a bit more discussion.

Feedback from the API WG just now: will this be embedded in Tekton Pipelines directly eventually? Maybe - might make sense or might not. Tekton Operator can be configured to deploy this, Pipelines itself could embed it, or it could be separate forever. In the short term keeping it separate would work, or placing it in Pipelines and extracting it out later if we decide it's a bad fit might work as well.

@afrittoli - do you have any thoughts what you'd like to see here to close this discussion item? I think this is the last open point that needs to be resolved and I'd like to keep the TEP moving if possible.

my 2 cents, either option seems fine for now, doesnt seem like it would be too hard to change later

separate project: cons: more overhead now (creating, releasing and maintaining a separate project, not clear how to bundle this easily for users

inside tekton pipelines: cons: moving into a separate project later = impacts folks who rely on it (but if ppl are relying on it, maybe we wouldnt want to move it out?

afrittoli · 2021-11-15T15:44:38Z

teps/0060-remote-resource-resolution.md

+At that point Pipelines' maintainers would need to decide how best to
+provide default resolvers "out of the box". One possible approach would be
+to deploy the `ResourceRequest` lifecycle reconciler and a set of resolvers
+as part of Pipelines' `release.yaml`.

 ## Test Plan


Until now we have not provided users with a recommended way to manage their Tasks.
I think it would be great to provide some "typical task management" examples, one per task provider, and focus some of our tests around them. We could setup different part of dogfooding with different approaches.

Awesome, I've added an additional paragraph describing this to the Dogfooding portion of the Test Plan section:

+ Once Remote Resolution is supported in Dogfooding we can also implement + testing for recommended "Task Management" approaches. Documenting these + will also provide guidelines to help the community structure their own + repos and registries of `Pipelines` and `Tasks`.

afrittoli

Thank you for this!
It looks mostly good, I have once concern still with how pipeline can reference to tasks from the same "bundle".

vdemeester · 2021-11-17T16:12:55Z

teps/0060-remote-resource-resolution.md

+machine-readable `reason` and human-readable `message` explaining the
+failure.
+
+### 3. Create a new Tekton Resolution project


I guess my question becomes: how do we resolve this design question once and for all? (...pun intended). Should this be a separate project or should we bake this functionality directly into Tekton Pipelines?

My usual answer here is : if we can do it in a separate project let's do it, and see in the future if it make sense to bake it into pipeline. The more tektoncd/pipeline mature / gets adopted, the more causious we have to be into getting things in 😅

vdemeester · 2021-11-17T16:14:35Z

teps/0060-remote-resource-resolution.md

+if any are written. See the [Design Details](#design-details) section
+for more on specific resolvers.
+
+The `ResourceRequest` Lifecycle Controller will be responsible for the


How would it be used by tektoncd/pipeline controller/machinery ? (might be documented already elsewhere right ?)

Not sure if this answers your question exactly but the plan is for Pipelines to interact with the resolution machinery through helper libraries that the tektoncd/resolver project publishes. The helpers are going to hide as much of the specifics of requesting as possible, but Pipelines may need to thread through something like a ResourceRequest client/lister. But the client / lister would also be imported from the tektoncd/resolver project - not new codegen in Pipelines.

This is described a little bit above with "Function and struct helpers for other projects to use" but I'll expand on that in the Design Details section.

To summarize my thinking very quickly, the Pipelines controllers will interact with the libraries through intermediary structs rather than "raw" CRD or HTTP request objects. So, hand-waving, the reconciler code might end up looking something like:

var errTaskStillResolving = errors.New("taskRef still resolving") // resolveTask wraps helpers from the tektoncd/resolver project to implement async // taskRef resolution in the TaskRun reconciler. func (r *reconciler) resolveTask(ctx context.Context, ref *v1beta1.TaskRef) (*v1beta1.Task, error) { if alphaFlagEnabled(ctx) { task := &v1beta1.Task{} // resp here is a struct like resolutionhelpers.Resolved{ Data: []byte, Annotations: map[string]string } or similar. if resp, err := resolutionhelpers.Resolve(ref.Resolver, ref.Resource); err != nil { if errors.Is(err, &resolutionhelpers.ErrResolutionInProgress{}) { return nil, errTaskStillResolving // non-fatal: a resource request has just been created or still in progress. } else { return nil, controller.NewPermanentError(err) } } else { if err := resolutionhelpers.Unmarshal(resp, task); err != nil { return nil, controller.NewPermanentError(err) } task.SetDefaults(ctx) // + do any other task initialization stuff return task, nil } } }

I've updated the design details section with this info (although not the hyper-specific hand-waved code example 😄 )

vdemeester · 2021-11-17T16:15:51Z

teps/0060-remote-resource-resolution.md

+_Risk_: Relying on a CRD may introduce scaling problems that couldn't be
+discovered during proof-of-concept testing. Task and Pipeline
+definitions may only get larger until a CRD no longer provides enough
+space. In busy CI/CD clusters many rapidly created large
+`ResourceRequests` may cause API server or database performance to
+degrade.


It is also, possibly, adding one more CRD for each "resource". Depending on the usage of tektoncd/pipeline, it may pressurize etcd at twice the speed 😅

I've changed the text here from "may cause API server or database performance to degrade" to call out etcd specifically: "may cause API server or etcd performance to degrade". I've also updated the potential mitigation here to indicate that any solution which is strictly better than a ResourceRequest CRD is a good alternative so long as it meets the same goals that a CRD does:

It doesn't block Pipelines' reconcilers while potentially slow resource fetches are happening.

It supports the same approach to parameters.

It has a way to re-trigger reconciles of TaskRun and PipelineRun objects when their taskRef or pipelineRefs have been successfully retrieved.

vdemeester · 2021-11-17T16:17:43Z

teps/0060-remote-resource-resolution.md

+  taskRef:
+    resolver: git
+    resource:
+      repository_url: https://github.com/tektoncd/


nit: wrong repository url 😝, it ain't a repo 👼🏼

Well spotted, fixed cheers!

vdemeester · 2021-11-17T16:36:32Z

teps/0060-remote-resource-resolution.md

+The parameters for a "git" `ResourceRequest` will initially look as
+follows:


Support provider-specific optimizations when requesting resources (e.g. use github's api to read a file if possible rather than cloning an entire repo).

Note that using APIs instead of git directly (GitHub API) has its set of limitation too (rate limits, …). Also are we making the assumption that you need to do a git clone with all path and all history to get a specific commit and path from a git repository ? Because I don't think this assumption is completely valid, there is way to "clone" only the thing you are interested into (without grabing the full history of the repo nor the full content of it at a given point)

Remote resolution is currently in proposed state while work's been going on to try and validate some of the alternative approaches being put forward. This commit moves TEP-0060 to implementable, updates the proposed approach and adds notes about the experimental controller built to validate ideas from the TEP.

ghost · 2021-12-07T18:30:56Z

Updated with an additional Risk related to data integrity and another alternative implementation. Here's the diff:

diff --git a/teps/0060-remote-resource-resolution.md b/teps/0060-remote-resource-resolution.md
index 0b10fd5..ddbad54 100644
--- a/teps/0060-remote-resource-resolution.md
+++ b/teps/0060-remote-resource-resolution.md
@@ -25,6 +25,7 @@ authors:
   - [Risks and Mitigations](#risks-and-mitigations)
     - [Relying on a CRD as storage for in-lined resolved data](#relying-on-a-crd-as-storage-for-in-lined-resolved-data)
     - [Changing the way it works means potentially rewriting multiple resolvers](#changing-the-way-it-works-means-potentially-rewriting-multiple-resolvers)
+  - [Data Integrity](#data-integrity)
   - [User Experience \(optional\)](#user-experience-optional)
     - [Simple flow: user submits `TaskRun` using public catalog `Task`](#simple-flow-user-submits-taskrun-using-public-catalog-task)
   - [Performance](#performance)
@@ -66,6 +67,8 @@ authors:
     - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-6)
   - [Use an Admission Controller to Perform Resolution](#use-an-admission-controller-to-perform-resolution)
     - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-7)
+  - [Sync Task and Pipeline objects directly into the cluster](#sync-task-and-pipeline-objects-directly-into-the-cluster)
+    - [Applicability For Other Tekton Projects](#applicability-for-other-tekton-projects-8)
 - [Open Questions](#open-questions)
 - [Future Extensions](#future-extensions)
 - [References](#references)
@@ -421,6 +424,28 @@ written to be agnostic about the "protocol". All interaction with
 clients / CRDs will be kept in shared helpers so that a rewrite only
 impacts that shared code.
 
+#### Data Integrity
+
+_Risk_: `ResourceRequest` objects have no data integrity mechanism yet, so
+a motivated actor with access to the cluster and write permissions on
+`ResourceRequest` objects can modify them without detection. This
+becomes a more notable concern when thinking about task verification
+occurring in Resolvers, as is planned in
+[TEP-0091](https://github.com/tektoncd/community/pull/537). A user with
+the necessary permissions could change a `ResourceRequest` object
+containing a Task _after_ Task verification occurred.
+
+_Possible Mitigation_: Tekton already has solutions undergoing design to address
+this problem on two fronts, and so it would make sense to integrate directly
+with one of them:
+1. [TEP-0089 SPIRE support](https://github.com/tektoncd/community/pull/529)
+where Tekton's objects (i.e. a `ResourceRequest`) can be signed by authorized
+workloads (i.e. a `ResourceRequest` Reconciler).
+2. The solution under design in TEP-0086 ([available to read
+here](https://hackmd.io/a6Kl4oS0SaOyBqBPTirzaQ)) which includes content
+addressability as a desirable property of the storage subsystem (OCI
+Registry being a good candidate).
+
 ### User Experience (optional)
 
 #### Simple flow: user submits `TaskRun` using public catalog `Task`
@@ -1221,6 +1246,34 @@ Admission controllers are a Kubernetes-wide concept. It seems reasonable to assu
 other Tekton projects could also leverage this mechanism for their own resource resolution
 needs. Possible opportunity to share controller or libraries to achieve this?
 
+### Sync Task and Pipeline objects directly into the cluster
+
+Rather than using an intermediary data format like `ResourceRequest`
+objects Resolvers could instead pull Tasks and Pipelines out of
+storage and `kubectl apply` them to the cluster to keep them in sync.
+
+Pros:
+- Syncing the contents of a repo into the cluster can happen
+  totally independently of Tekton Pipelines' knowledge.
+- Quite a bit easier to reason about than other options.
+
+Cons:
+- Unclear how this would work for pipelines-as-code use-cases where a
+  user's Pull Request could include Pipeline or Task changes that should
+  be used in testing. On-demand sync as a fallback?
+- Unclear how tasks and pipelines with the same name from multiple
+  branches or commits could co-exist in the same namespace. Potential
+  risk for confusing outcomes here.
+
+#### Applicability For Other Tekton Projects
+
+This solution would be quite specific to Tekton Pipelines if coded to
+only submit Tasks and Pipelines to the cluster. Alternatively resolvers
+could operate with zero understanding of the resources and simply
+`kubectl apply` whatever appears in the repository but this too has
+drawbacks with regard to deciding which resources should be applied and
+which shouldn't.
+
 ## Open Questions

ghost · 2021-12-13T16:44:19Z

Currently looking for a non-Googler to further review and/or approve 🙏

afrittoli

Thanks for all the replies.
I'm looking forward to these feature.
There are still things that needs to be defined, like:

support for bundles
support for validation / defaulting
support for signature validation (which can be done via admission controller for instance if those are supported)

But we can define all these as part of the new project.

/approve

ghost · 2021-12-13T16:47:10Z

Amazing, just as I was requesting the review @afrittoli nailed it, thanks a lot Andrea!

vdemeester

/lgtm

tekton-robot · 2021-12-14T09:55:52Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afrittoli, bobcatfish, vdemeester, wlynch

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~teps/OWNERS~~ [afrittoli,bobcatfish,vdemeester,wlynch]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ghost · 2021-12-14T14:00:52Z

Hm, is there a way to bump the EasyCLA bot?

lbernick · 2021-12-14T14:35:04Z

Hm, is there a way to bump the EasyCLA bot?

this happened to me, you can try closing and reopening the pr

ghost · 2021-12-14T14:35:44Z

/close

tekton-robot · 2021-12-14T14:35:46Z

@sbwsg: Closed this PR.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ghost · 2021-12-14T14:35:58Z

/reopen

tekton-robot · 2021-12-14T14:36:00Z

@sbwsg: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

ghost · 2021-12-14T14:39:04Z

Thanks a lot for reviewing everybody - really appreciate the amount of time you have all spent looking at this and giving feedback!

tekton-robot requested review from piyush-garg and pratap0007 October 21, 2021 18:15

tekton-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 21, 2021

tekton-robot added the kind/tep Categorizes issue or PR as related to a TEP (or needs a TEP). label Oct 21, 2021

afrittoli reviewed Oct 25, 2021

View reviewed changes

teps/0060-remote-resource-resolution.md Show resolved Hide resolved

tekton-robot assigned afrittoli, bobcatfish, vdemeester and wlynch Oct 25, 2021

bobcatfish reviewed Oct 26, 2021

View reviewed changes

tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 26, 2021

wlynch reviewed Oct 26, 2021

View reviewed changes

afrittoli reviewed Nov 1, 2021

View reviewed changes

wlynch approved these changes Nov 3, 2021

View reviewed changes

afrittoli reviewed Nov 15, 2021

View reviewed changes

bobcatfish mentioned this pull request Nov 16, 2021

TEP-0091: Verified Remote Resources #537

Closed

vdemeester reviewed Nov 17, 2021

View reviewed changes

afrittoli reviewed Dec 13, 2021

View reviewed changes

vdemeester approved these changes Dec 14, 2021

View reviewed changes

tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 14, 2021

tekton-robot closed this Dec 14, 2021

tekton-robot reopened this Dec 14, 2021

tekton-robot merged commit 701070e into tektoncd:main Dec 14, 2021

ghost mentioned this pull request Jan 26, 2022

Create new repo for tekton resolution #611

Closed

		The parameters for a "git" `ResourceRequest` will initially look as
		follows:

TEP-0060: Update proposal and mark Remote Resolution implementable #547

TEP-0060: Update proposal and mark Remote Resolution implementable #547

Conversation

ghost commented Oct 21, 2021

ghost commented Oct 21, 2021

bobcatfish commented Oct 25, 2021

bobcatfish left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost Nov 16, 2021 • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

wlynch left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Footnotes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Support provider-specific optimizations

Users should specify the provider in their taskRef or pipelineRef

Provider-Specific Resolvers

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pritidesai commented Nov 1, 2021

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost Nov 19, 2021 • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

afrittoli left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost Nov 19, 2021 • edited by ghost Loading

Choose a reason for hiding this comment

ghost Nov 19, 2021 • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost commented Dec 7, 2021

ghost commented Dec 13, 2021

afrittoli left a comment

Choose a reason for hiding this comment

ghost commented Dec 13, 2021

vdemeester left a comment

Choose a reason for hiding this comment

ghost Nov 16, 2021 •

edited by ghost

Loading

Users should specify the provider in their `taskRef` or `pipelineRef`

ghost Nov 19, 2021 •

edited by ghost

Loading

ghost Nov 19, 2021 •

edited by ghost

Loading

ghost Nov 19, 2021 •

edited by ghost

Loading