Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Put Tekton OCI bundles behind a feature flags 🎏 #3492

Merged
merged 1 commit into from
Nov 6, 2020

Conversation

vdemeester
Copy link
Member

Changes

In order to expose Tekton OCI bundle less and mark them as "alpha"
still, let's put their usage under a feature-flag. This will allow us
to experiment, refactor and enhance those without having to support
them nor expose them too much to the end-user.

Enabling the feature gates is a explicit choice for the user and is
documented to make sure users understand this is subject to changes.

This adds a new feature-flags field called "enable-tekton-oci-bundles"
that defaults to false.
If the feature-flag is off, Tekton OCI bundle are not usable. The
admission controller will disallow its usage, and the controller will
not take the bundle field into account.

Note: the e2e tests will be skip on the CI temporarly because we do
not have the framework in place to switch feature-flags during
tests. This will be worked out in parallel.

/cc @tektoncd/core-maintainers

To add a bit more context, I am experiencing pipelinerun completely stuck at resolving tasks when using Tekton bundles, that's why I feel we should put this under a feature flag. This also takes inspiration from "Support Kubernetes style feature-gates for new API fields" #3459.

Signed-off-by: Vincent Demeester [email protected]

/kind misc

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing)
  • Commit messages follow commit message best practices
  • Release notes block has been filled in or deleted (only if no user facing changes)

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

Introduce a feature-flag `enable-tekton-oci-bundle` to be able to use Tekton Bundle through the pipeline API

@tekton-robot tekton-robot requested a review from a team November 4, 2020 14:50
@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/misc Categorizes issue or PR as a miscellaneuous one. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 4, 2020
@@ -95,10 +95,13 @@ spec:
- /kaniko/executor
args:
- --destination=gcr.io/my-project/gohelloworld
```
``**
Copy link
Member

@chmouel chmouel Nov 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, this needs to be fixed

in [Pipelines](pipelines.md#tekton-bundles) or [TaskRuns](taskruns.md#tekton-bundles).

`Tekton Bundles` may be constructed with any toolsets that produce valid OCI image artifacts
so long as the artifact adheres to the [contract](tekton-bundle-contracts.md).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe udpate the tekton-bundle-contracts file to add it needs to have the feature flag enabled?

@bobcatfish
Copy link
Collaborator

Quick question: since using the OCI bundles is optional, I want to understand the advantage of feature flagging them? I mean if someone doesn't want to use this feature, they can simply not use it 🤔

It sounds like you mostly want to use this to indicate that the feature is alpha? I'm not sure I agree with this way of doing it - however also if we don't add the feature flag before our next release we'll be stuck, so I don't want to block it.

Feature flagging won't stop the fields from showing up in the API, it'll just make them not work - just not really sure how that helps users.

@vdemeester
Copy link
Member Author

It sounds like you mostly want to use this to indicate that the feature is alpha? I'm not sure I agree with this way of doing it - however also if we don't add the feature flag before our next release we'll be stuck, so I don't want to block it.

Yes, that's exactly it, indicate the feature is alpha.

Feature flagging won't stop the fields from showing up in the API, it'll just make them not work - just not really sure how that helps users.

Nothing can really stop the field to show up in the API but we can mark those fields as "alpha" (document them, etc..) See #3459 for more context about this (and how it's done in kubernetes too)

@vdemeester
Copy link
Member Author

From #3459 :

Some of the Tekton objects are now "beta" and hopeful "stable" in the not too distant future. When doing work in the API WG we should not be immediately graduating any new fields added to "beta" or whatever degree of stability the CRD currently is. Instead we should provide a mechanism for them to go through some sort of Alpha state until we're sure they work well.

(just to put some context here too)

@bobcatfish
Copy link
Collaborator

@vdemeester do you think it might make sense to have one flag for all of these "alpha" features?

Basically I'm worried that this feature flag will create an even worse user experience:

  • folks cant try alpha features without enabling them
  • since the features are still IN the API, they will just not work

I think I'm not totally sold on the user benefit of making this part of our API do nothing by default

@bobcatfish
Copy link
Collaborator

Maybe we can discuss in the working group today! I think there are kinda 2 different discussions:

  1. What to do about "alpha" features in beta/v1 APIs in general (which I totally agree is something we do not handle well and I also don't want to rush into a solution)
  2. Whether or not this particular feature needs to be behind a feature flag specifically (cuz it's not the first feature we've added directly to our beta API - @vdemeester i think you ran into some performance problems with the feature? i think it's a slightly different story if our reason for adding this flag is "there is a known problem with this feature we are trying to fix" vs "generally b/c it is alpha")

@vdemeester
Copy link
Member Author

@vdemeester do you think it might make sense to have one flag for all of these "alpha" features?

I am not sure nor decided, but I feel, we shouldn't hide them all under one flag, but instead have them all use their own flag (or group them by flag that make sense instead of their "status").

Basically I'm worried that this feature flag will create an even worse user experience:
* folks cant try alpha features without enabling them
* since the features are still IN the API, they will just not work
I think I'm not totally sold on the user benefit of making this part of our API do nothing by default

Well this is debatable, what is a worse use experience between having to explicitly enable a feature to be able to use a field or having a field that could be changed, removed later on. By forcing the user to explicitly enable it, we ensure that the user has the knowledge on what it implies to enable and use this feature (aka alpha, not supported, can be removed later on, …) — it allows us gather feedback on features without having to commit to it too much or too long. It allows the user to use feature in preview and have an impact on what the future holds. All of this while working on the stability of the API 🙃.

I am open to alternative though. One of it would be to have a vXalphaX API living along with the beta/stable API. In our case (and at the time of this comment), this would be having an active v1alpha2 API that replicates v1beta1 with additionnal fields (note that this is harder to implement this cleanly with only CRDs, easier to implement with a full apiserver).

On "folks cant try alpha features without enabling them", this is indeed by design. To test an alpha feature in a time where we ship a beta or stable api, the user should have to do something explicitly, whether it is by enable a feature-flag, use a vXalphaX API, …

Also, note that this is not proposing anything new, it's just following what k8s does (as explained in #3459 and here at length) – although in the case of k8s, setting feature-flags is harder (because it is directly on the apiserver command line arguments).

Maybe we can discuss in the working group today! I think there are kinda 2 different discussions:
1. What to do about "alpha" features in beta/v1 APIs in general (which I totally agree is something we do not handle well and I also don't want to rush into a solution)
2. Whether or not this particular feature needs to be behind a feature flag specifically (cuz it's not the first feature we've added directly to our beta API - @vdemeester i think you ran into some performance problems with the feature? i think it's a slightly different story if our reason for adding this flag is "there is a known problem with this feature we are trying to fix" vs "generally b/c it is alpha")

On 1., we need to decide this rather sooner than later 😅 Otherwise, anything we merge in without a feature-flag (or something to "hide it") is considered as beta (and stable once in v1 api) with the support that comes with it (deprecation, etc…).
On 2., I am not sure I agree nor disagree. I'm having issues (more than performance, some pipelinerun are just stuck forever and will only end with "timeout" or cancel) while using this feature and in general, it should also be considered as an alpha feature.

@pritidesai
Copy link
Member

On 2., I am not sure I agree nor disagree. I'm having issues (more than performance, some pipelinerun are just stuck forever and will only end with "timeout" or cancel) while using this feature and in general, it should also be considered as an alpha feature.

🙋‍♀️ I have ran into similar issues (stuck pipelinerun) while reviewing the bundle PR, havent got chance to play with it after.

@bobcatfish
Copy link
Collaborator

From working group today: thanks for discussing this! I'm happy to add this feature flag for this particular feature as part of the 0.18 release and then continue to discuss our overall strategy separately ( @vdemeester I'm thinking maybe we can try to aim to agree on a policy before 0.19? )

@vdemeester
Copy link
Member Author

From working group today: thanks for discussing this! I'm happy to add this feature flag for this particular feature as part of the 0.18 release and then continue to discuss our overall strategy separately ( @vdemeester I'm thinking maybe we can try to aim to agree on a policy before 0.19? )

Yes definitely. @afrittoli and I are gonna work on a TEP for this 😉

@@ -156,6 +140,27 @@ spec:
...
```

#### Tekton Bundles
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we might want to design a page dedicated to alpha features eventually, or at least one that contains a link to all alpha feature docs, but not necessarily in this patch.

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried running the example on a cluster that runs this PR and I get:

status:
  completionTime: "2020-11-05T18:06:17Z"
  conditions:
  - lastTransitionTime: "2020-11-05T18:06:17Z"
    message: 'error when listing tasks for taskRun remote-task-reference: tasks.tekton.dev
      "hello-world" not found'
    reason: TaskRunResolutionFailed
    status: "False"
    type: Succeeded
  podName: ""
  startTime: "2020-11-05T18:06:17Z"

This is the taskrun:

apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: remote-task-reference
spec:
  taskRef:
    name: hello-world
    bundle: docker.io/ptasci67/example-oci@sha256:053a6cb9f3711d4527dd0d37ac610e8727ec0288a898d5dfbd79b25bcaa29828

I was expecting a validation error, but I think we normally ignore unknown fields, so I guess this should be ok?

@@ -1,9 +0,0 @@
# TODO: Move the example image to a tekton owned repo.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we could have moved this to a skip folder, or perhaps verify that this doesn't work when the flag is not switched, but I think the machinery to run YAML tests does not support that yet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I moved it to no-ci, forgot to commit the file 😓
Should be fixed now.

@@ -31,12 +31,14 @@ const (
disableCredsInitKey = "disable-creds-init"
runningInEnvWithInjectedSidecarsKey = "running-in-environment-with-injected-sidecars"
requireGitSSHSecretKnownHostsKey = "require-git-ssh-secret-known-hosts" // nolint: gosec
enableTektonOCIBundles = "enable-tekton-oci-bundles"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to update the feature_flags_test.go and related test files too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, actually I think that the changes to taskrun_validation.go are missing?

@vdemeester
Copy link
Member Author

Uhm, actually I think that the changes to taskrun_validation.go are missing?

OMG, I forgot those, thanks, I'll update tomorrow morning.

Copy link
Collaborator

@bobcatfish bobcatfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pritidesai we'll need to make sure to call this out clearly in the 0.18 release notes: that you have to explicitly enable this feature to use it

@@ -156,6 +140,27 @@ spec:
...
```

#### Tekton Bundles
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a link to this new section in the TOC at the top?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 😉

@pritidesai
Copy link
Member

@pritidesai we'll need to make sure to call this out clearly in the 0.18 release notes: that you have to explicitly enable this feature to use it

I have updated the release notes in the PR #3142 which introduced this feature. Also one of the highlights of 0.18: "bundles behind feature flags"

@vdemeester vdemeester force-pushed the feature-flag-bundles branch 4 times, most recently from a5978ef to 9dc5abe Compare November 6, 2020 16:02
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1beta1/pipeline_validation.go 100.0% 97.2% -2.8

In order to expose Tekton OCI bundle less and mark them as "alpha"
still, let's put their usage under a feature-flag. This will allow us
to experiment, refactor and enhance those without having to support
them nor expose them too much to the end-user.

Enabling the feature gates is a explicit choice for the user and is
documented to make sure users understand this is subject to changes.

This adds a new feature-flags field called "enable-tekton-oci-bundles"
that defaults to false.
If the feature-flag is off, Tekton OCI bundle are not usable. The
admission controller will disallow its usage, and the controller will
not take the bundle field into account.

Note: the e2e tests will be skip on the CI temporarly because we do
not have the *framework* in place to switch feature-flags during
tests. This will be worked out in parallel.

Signed-off-by: Vincent Demeester <[email protected]>
@tekton-robot
Copy link
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1beta1/pipeline_validation.go 100.0% 99.5% -0.5

@vdemeester
Copy link
Member Author

/retest

Copy link
Member

@afrittoli afrittoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together!
The code looks good, it works fine on my kind cluster, so
/approve

@tekton-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: afrittoli, chmouel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2020
@vdemeester
Copy link
Member Author

/retest

@pritidesai
Copy link
Member

pritidesai commented Nov 6, 2020

Disabled by default:

-> kubectl apply -f pipeline-bundle.yaml
Error from server (BadRequest): error when creating "pipeline-bundle.yaml": admission webhook "validation.webhook.pipeline.tekton.dev" denied the request: validation failed: must not set the field(s): spec.pipelinespec.tasks[0].taskref.bundle

-> kubectl apply -f examples/v1beta1/taskruns/no-ci/tekton-bundles.yaml
Error from server (BadRequest): error when creating "examples/v1beta1/taskruns/no-ci/tekton-bundles.yaml": admission webhook "validation.webhook.pipeline.tekton.dev" denied the request: validation failed: must not set the field(s): spec.taskref.bundle

@pritidesai
Copy link
Member

Enabled with feature-flag but stuck which will be fixed in 0.19:

kubectl get tr remote-task-reference
NAME                    SUCCEEDED   REASON   STARTTIME   COMPLETIONTIME
remote-task-reference

@@ -9,6 +9,7 @@ weight: 4
- [Overview](#overview)
- [Configuring a `PipelineRun`](#configuring-a-pipelinerun)
- [Specifying the target `Pipeline`](#specifying-the-target-pipeline)
- [Tekton Bundles](#tekton-bundles)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: "Specifying Tekton Bundles" 😆

@pritidesai
Copy link
Member

thanks a bunch @vdemeester for putting this together 😻
/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 6, 2020
@pritidesai
Copy link
Member

/test pull-tekton-pipeline-build-tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/misc Categorizes issue or PR as a miscellaneuous one. lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants