TEP-0089: Non-falsifiable provenance support #529
Conversation
/assign
I left a bunch of questions that, I think, mostly stem from my lack of knowledge about how SPIFFE/SPIRE works.
/assign @pritidesai
Thanks for adding more detail! I've left some initial comments @priyawadhwa - would really like to dig into some of the requirements in more detail and get more clarity on exactly what assurances this will give us.
Thanks for adding all the additional detail @priyawadhwa, I think the overall picture is becoming a lot clearer for me!
A couple thoughts compared to the initial POC and proposal:
- Sounds like we're shifting from securing specifically the results in the TaskRun to securing the entire TaskRun spec and status, is that right?
- It sounds like we no longer need the individual pods to be connecting to SPIRE - or maybe that's for solving a different problem? (In this proposal, could someone still modify the termination message of the pod? Maybe it's actually not possible to modify the termination message unless you're the kubelet...)
teps/0089-spire-support.md (Outdated)
## Proposal
As mentioned above, the basic design looks like this:
1. Tekton Pipelines receives a TaskRun config, and generates the Pod for it
As an extra detail, related to my question above about modifications during execution: AFAIK the pipelines controller will be updating the TaskRun status as the TaskRun executes - I'm wondering if we want to be signing the TaskRun with each modification (and maybe even verifying that nothing has changed since the last modification?)
I'm not sure if I'm going too far, but it seems like if we don't do that, something else could modify the TaskRun status during execution, and the pipelines controller would just incrementally add to those changes, only signing at the end?
That makes sense, and would probably be good to have. I'll add some thoughts to the proposal!
Instead of signing the TaskRun with each modification, do you think it could make sense to store the current TaskRun in memory and compare it to the actual thing? Maybe something like this (see the sketch after this list):
- store the initial TaskRun in memory
- when we want to modify the running TaskRun, compare the running TaskRun to the one in memory.
- if it doesn't match, then we never request an SVID and skip any SPIRE signing altogether
- if it does match, then we modify the TaskRun and store the new modified version in memory, and repeat steps 2-4 until the TaskRun completes
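A rough sketch of what that in-memory comparison could look like in the controller (hypothetical Go; the `trustCache` type and `checkAndStore` helper are made up for illustration, and only the spec is compared here for brevity - the real check would presumably cover status too):

```go
package taskrunguard

import (
	"sync"

	v1beta1 "github.com/tektoncd/pipeline/pkg/apis/pipeline/v1beta1"
	"k8s.io/apimachinery/pkg/api/equality"
)

// trustCache remembers the last TaskRun spec the controller itself wrote,
// keyed by namespace/name, so later reconciles can detect outside modification.
// Note: this cache is lost if the controller restarts, as pointed out later
// in this thread.
type trustCache struct {
	mu   sync.Mutex
	seen map[string]*v1beta1.TaskRunSpec
}

func newTrustCache() *trustCache {
	return &trustCache{seen: map[string]*v1beta1.TaskRunSpec{}}
}

// checkAndStore returns false when the running TaskRun no longer matches the
// stored copy; in that case the controller would never request an SVID and
// would skip SPIRE signing altogether. Otherwise it stores the (possibly
// updated) spec and returns true.
func (c *trustCache) checkAndStore(tr *v1beta1.TaskRun) bool {
	c.mu.Lock()
	defer c.mu.Unlock()

	key := tr.Namespace + "/" + tr.Name
	if prev, ok := c.seen[key]; ok && !equality.Semantic.DeepEqual(prev, &tr.Spec) {
		return false // something else modified the TaskRun during execution
	}
	c.seen[key] = tr.Spec.DeepCopy()
	return true
}
```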
Yah, I think the entire spec needs to be secured to guarantee non-falsifiable provenance (since we take the results, steps, parameters and other info from the TaskRun which all ends up in provenance).
That's a good point! If the termination message can be modified then we'll also need pods connected to SPIRE so that results can be secured... from the docs it isn't super clear if anyone other than the kubelet has access to the termination message. If we're not sure, I think we can just play it safe and also secure results.
Made some updates! I put the termination message issue as a Risk for now; I think it should be easy enough to build signing of Results on top of the work in this proposal if it's needed. I'll try to find out if the termination message can be changed by anyone other than the kubelet in the meantime!
Looks like they can. 😢 https://gist.github.com/imjasonh/9011395d1f88af02b5bdc5901b739090 This might be another good signal that we should stop relying on termination messages in general.
Thanks @imjasonh! I'll update the TEP.
Some nits from me, and a couple of bigger things I'm wondering about; specifically, I'm wondering about including the requirement to verify the TaskRun itself.
To fulfill non-falsifiability, do we need to verify the TaskRun, or just the results in the TaskRun - and/or if we DO need to verify the TaskRun, do we need to verify pods and Pipelines instead?
If we do need to verify the TaskRun (and the PipelineRun), I'm wondering about the performance implications - keeping all executing TaskRuns and PipelineRuns in memory has a memory implication, but also I think that wouldn't work because the controller can be restarted during execution. So I think we'd need to sign and verify the entire TaskRun (maybe PipelineRun) content on every reconcile that updates them, which I'm thinking could have a significant performance impact?
Lastly, verifying TaskRuns/PipelineRuns (and/or even pods) will introduce limitations around interoperability with other mutating admission controllers that users might want to be using (e.g. Istio sidecar injection, or the SolarWinds approach to inject tasks into a Pipeline).
/assign @jerop
This looks great to me now; the wording on exactly what this guarantees is much clearer than what I originally came up with :)
One detail to consider is that SPIRE certificates are going to be short-lived (intentionally), so if used for signing persistent data, they may be expired by the time you try to verify them. That shouldn't be a blocker, just something to realize you may need to code around (either by ignoring the expiration, having a grace period, or using a TSA). Overall, I like where this is going, the TEP looks good to me.
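One shape the "code around it" part could take (a sketch, not something the TEP prescribes): persist the signing time next to the signature, then verify the SVID's certificate chain as of that recorded time instead of the current time. Whether that recorded time can be trusted is exactly what a TSA would solve.

```go
package svidverify

import (
	"crypto/x509"
	"time"
)

// verifyChainAsOf validates the SVID leaf against the trust bundle at the
// recorded signing time, so an SVID that has expired since the TaskRun ran
// still verifies. signedAt must itself be trustworthy (e.g. a TSA token).
func verifyChainAsOf(leaf *x509.Certificate, roots, intermediates *x509.CertPool, signedAt time.Time) error {
	_, err := leaf.Verify(x509.VerifyOptions{
		Roots:         roots,
		Intermediates: intermediates,
		CurrentTime:   signedAt, // "now" is replaced by the time the signature was produced
	})
	return err
}
```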
Mostly nits from me, I think we're generally on the same page though, so nothing blocking on my side!! Thanks for all the back and forth on this.
A final couple of thoughts from me, but not blockers:
- What is involved / required to install SPIRE itself in a cluster? I'm wondering if there is anything a user would need to be aware of in order to use this feature as far as the SPIRE requirements go (I don't know what this would be, maybe something like 'requires an image registry to push to' or something)
- It might be clearer to rename this TEP to something like "Non-falsifiable provenance support" vs "SPIRE support" (so it's more obvious to a reader who isn't familiar with SPIRE how this TEP might be relevant to them)
/approve
teps/0089-spire-support.md (Outdated)
1. Enabling the alpha feature for SPIRE in Tekton
1. Requesting an SVID & signature over Results for a TaskRun
1. Verification of SPIRE with Chains
1. Verify that a TaskRun that isn't created by Tekton isn't signed by Chains
Would you be planning to set these up as something that runs regularly, or something that's done manually to verify the behavior? I think these would be valuable as end-to-end tests we keep going forward; not 100% sure if we'd prefer to always have SPIRE enabled in our end-to-end tests (and deployed and configured in the cluster under test) or only do it for a subset of tests.
We'd need to set up a test cluster that has both Chains + Pipelines in it as well (maybe Chains already has this?)
Ideally it would be something that runs regularly, but I'll probably know more once I start actually writing the code 😅
I think we will need to set up a new test cluster; Chains doesn't have one as of now!
### 2. Tekton Pipelines can't verify that the results it reads weren't modified
The solution to this is Signed Results.
We will modify the entrypointer image to sign results with SPIRE once they're available.
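For a rough idea of what that entrypointer change might involve, here is a sketch against the go-spiffe v2 Workload API. The `signResult` helper, the result bytes, and where the signature gets written back are assumptions for illustration, not something this TEP pins down:

```go
package entrypoint

import (
	"context"
	"crypto"
	"crypto/rand"
	"crypto/sha256"

	"github.com/spiffe/go-spiffe/v2/workloadapi"
)

// signResult asks the SPIRE agent (over the Workload API socket mounted into
// the pod) for this workload's SVID and signs the serialized result with the
// SVID's private key. It returns the signature and the SPIFFE ID it belongs to.
func signResult(ctx context.Context, result []byte) (sig []byte, id string, err error) {
	svid, err := workloadapi.FetchX509SVID(ctx) // uses the SPIFFE_ENDPOINT_SOCKET env var by default
	if err != nil {
		return nil, "", err
	}
	digest := sha256.Sum256(result)
	sig, err = svid.PrivateKey.Sign(rand.Reader, digest[:], crypto.SHA256)
	if err != nil {
		return nil, "", err
	}
	return sig, svid.ID.String(), nil
}
```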
Do you know how fine-grained authorization is for the signing keys? E.g. does it allow per-container authorization, or is it only for the Pod? I'm wondering if/how we might be able to reduce access to keys from user steps 🤔
There are a few different selectors you can configure, including the container image and container name, so I believe we should be able to reduce access from user steps.
Overall sounds good, just a few questions:
- This could be an opt-in or opt-out feature, right?
- If that's the case, I would like to explore the possible problems that could arise, like the CSI driver not being present, etc.
- On the "release" problem of whether this is enabled or not (because of the need to have a volume mounted in the controller), the operator could help
Thanks @bobcatfish and @vdemeester! To answer some of your questions:
I might need to look into this a little more, but I definitely think we'll need to add this to any documentation around this feature. In addition to the k8s yaml described in the docs, it looks like users will need to install the k8s node attestor, which requires some setup within the cluster.
SGTM
Yup, it'll be opt-in as an experimental feature.
Right now, I'm just planning to clearly document the required tools users will need to install before using the feature. We can definitely consider better ways of handling potential errors in the future.
That's a good idea! I'll add that into the TEP as a potential option for releasing (@bobcatfish had also suggested doing two separate releases, one with the volume and one without the volume).
SGTM! @jerop and @pritidesai, I think you are currently the outstanding approvers on this one.
Thanks @priyawadhwa, LGTM. We have time to figure things out 😇
Thanks @priyawadhwa!
"Marking a TaskRun as failed" has a specific meaning right now - that we mark ConditionSucceeded
as false - so it'd be great to clarify in the proposal that it means adding an annotation to the TaskRun indicating that the verification of the TaskRun failed (understood this is what was meant from reading other discussions here, that it's not really failing the TaskRun)
Also had a few questions below
(Please squash the commits)
As mentioned above, the basic design looks like this (this is meant to be high level and still needs to be fleshed out a bit):
1. Tekton Pipelines receives a TaskRun config, and generates the Pod for it with SPIRE mounted in
1. The Pod executes, and the entrypointer requests an SVID and signature over the Results
1. Tekton Pipelines verifies the Results
Are we referring to task results here? Task results are part of the TaskRun status. So the entrypointer is requesting an SVID and signature for that particular result? Let's say my task has 10 different results, will I see that many SVIDs and signatures? Are these SVIDs and signatures also stored as annotations in addition to the SVID and signature of the TaskRun itself?
The SVID for a TaskRun could be spiffe://tekton.dev/<taskrun>, and for a task result it could be spiffe://tekton.dev/<taskrun>/result/<result> or spiffe://tekton.dev/<taskrun>/<result> 🤔
Yup, we're referring to task results. Each result will have an associated signature, and all signatures can be verified with one SVID. These will be stored as annotations.
Are these SVIDs and signatures also stored as annotations in addition to the SVIDs and signature of the taskRun itself?
Yup, ultimately we will have signatures and an SVID for the Task results, and we'll also have a signature and an SVID over the entire TaskRun.
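To make the "many signatures, one SVID" shape concrete, here is a hedged sketch of the verification side, assuming ECDSA-keyed SVIDs and that the result values and signatures have already been decoded from the TaskRun annotations (the `verifyResults` helper and its inputs are illustrative, not part of the TEP):

```go
package verify

import (
	"crypto/ecdsa"
	"crypto/sha256"
	"crypto/x509"
	"fmt"
)

// verifyResults checks every result's signature against the single SVID leaf
// certificate attached to the TaskRun. results maps result name -> value,
// sigs maps result name -> ASN.1 DER-encoded ECDSA signature.
func verifyResults(svidLeaf *x509.Certificate, results, sigs map[string][]byte) error {
	pub, ok := svidLeaf.PublicKey.(*ecdsa.PublicKey)
	if !ok {
		return fmt.Errorf("expected an ECDSA public key in the SVID, got %T", svidLeaf.PublicKey)
	}
	for name, value := range results {
		digest := sha256.Sum256(value)
		if !ecdsa.VerifyASN1(pub, digest[:], sigs[name]) {
			return fmt.Errorf("signature for result %q did not verify", name)
		}
	}
	return nil
}
```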
1. Tekton Pipelines receives a TaskRun config, and generates the Pod for it with SPIRE mounted in
1. The Pod executes, and the entrypointer requests an SVID and signature over the Results
1. Tekton Pipelines verifies the Results
1. Meanwhile, Tekton Pipelines has been verifying that the TaskRun hasn't been modified during execution
Please elaborate a little on TaskRun, I think TaskRun here means task specifications (including input params) either through taskRef or taskSpec 🤔 Or does it also include status, since status is modified by the pipeline controller until the execution is over?
I basically meant it to include the entire TaskRun YAML that's being executed, and I'm guessing the status should be included as well.
Thanks @priyawadhwa for this proposal, excited to see this moving forward 🎉 Please squash all the commits, add a little description in the PR, and update the TEP. The proposal is very high level at this point, like you mentioned. I would love to see more details when we move this TEP to implementable.
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: bobcatfish, jerop, pritidesai, vdemeester.
Thanks @jerop and @pritidesai! I just squashed the commits and added some more details based on your comments.
We've got approvals from all the approvers, so I'm going to go ahead and lgtm this so it will be merged as proposed - @priyawadhwa, the next step would be to open a PR to update the status to "implementable" - @pritidesai mentioned including some more examples and visuals as part of that PR as well (#529 (comment)) - maybe an architecture diagram and/or a sequence diagram?
/lgtm
This change fixes the date on TEP-0089 so that the PipelineRun on open TEPs passes. The `pull-community-teps-lint` PipelineRun did not run when we merged TEP-0089 in #529. It got merged with a linting issue - a wrong date - that's causing the same PipelineRun to fail on open TEPs.