use workload identity for all prow integrations #180

Closed
listx opened this issue Feb 13, 2020 · 24 comments

Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.
priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.

Comments

listx commented Feb 13, 2020

This way, we don't have to give service account creds to Prow admins.

listx commented Feb 13, 2020

/cc @fejta

fejta commented Feb 13, 2020

Do you know of anywhere that you are explicitly requiring a secret.json file? If not, then we can just remove it and validate that things work.

fejta commented Feb 13, 2020

ref kubernetes/test-infra#15806

listx commented Feb 14, 2020

Here are the promoter's Prow jobs that use volume-mounted creds:

fejta commented Feb 14, 2020

listx commented Feb 14, 2020

Then the service account creds are never loaded. This means go-containerregistry will have to fall back to some other auth mechanism (presumably).

I am not sure how WI interacts with go-containerregistry's Copy() function, which the promoter uses to push images.

jonjohnsonjr commented:

I am not sure how WI interacts with go-containerregistry's Copy() function, which the promoter uses to push images.

That loads credentials from the docker config file. I don't have much context for this issue, but assuming you're doing something like:

$ gcloud auth configure-docker

... to wire up gcloud as a credential helper, this might work? I'm not sure if gcloud's credential helper actually works with workload identity.

Here's a primer on docker auth: https://github.com/google/go-containerregistry/blob/master/pkg/authn/README.md

If you can get a shell in one of these pods where you haven't activated a secret.json file, you could try:

echo "gcr.io" | docker-credential-gcloud get

to see if that returns valid credentials. You should get an access token as the Secret.

This will check the token you get back:

 curl "https://www.googleapis.com/oauth2/v1/tokeninfo?access_token=$(echo 'gcr.io' | docker-credential-gcloud get | jq .Secret -r)"

If that doesn't work there are definitely things we can do to work around this, let me know.

fejta commented Feb 14, 2020

Can you point me to the code that does this? I've mostly just had to do:

  • Only call gcloud auth activate-service-account with the key file if the env variable is set
  • Always call gcloud auth configure-docker, even when the env variable isn't set

Example: kubernetes/test-infra@966711a#diff-9f51d141e18f12470d41311d7ced5631
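
A minimal sketch of that flow, assuming the key-file path arrives in GOOGLE_APPLICATION_CREDENTIALS (the variable name is an assumption here, not something stated in this thread):

#!/usr/bin/env bash
set -euo pipefail

# Activate explicit service-account creds only when a key file was provided.
# Under Workload Identity no key file is mounted, so this branch is skipped
# and gcloud keeps the pod's ambient identity.
if [[ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]]; then
  gcloud auth activate-service-account --key-file="${GOOGLE_APPLICATION_CREDENTIALS}"
fi

# Always register gcloud as the Docker credential helper for *.gcr.io,
# whether or not a key file was present.
gcloud auth configure-docker --quiet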

jonjohnsonjr commented:

Can you point me to the code that does this?

The crane package uses authn.DefaultKeychain by default: https://github.com/google/go-containerregistry/blob/4336215636f7ace860f1e499cf5033d12073a44b/pkg/crane/options.go#L33

That is just a thin wrapper around github.com/docker/cli/cli/config, which fetches credentials based on your config file as you would expect (i.e. it works with gcloud auth configure-docker).

From the linked diff, things should just work, assuming gcloud has been set up appropriately.
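
Concretely, that lookup chain can be exercised from a shell in the pod (a sketch; it assumes jq and the gcloud SDK's docker-credential-gcloud binary are on PATH):

# DefaultKeychain reads ~/.docker/config.json, finds the credential helper
# registered for the registry, and shells out to it -- roughly:
helper=$(jq -r '.credHelpers["gcr.io"] // empty' ~/.docker/config.json)  # "gcloud" after configure-docker
echo "gcr.io" | "docker-credential-${helper}" get                        # should print an access token as the Secret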

fejta commented Feb 19, 2020

Linus, do you have opinions about the way you want these jobs to run gcloud auth configure-docker?

It isn't obvious to me that google/cloud-sdk comes preconfigured for docker. The most obvious solution would be to add a layer that adds this file as /root/.docker/config.json:

{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud"
  }
}

I can send a PR to do that, or I can do something else if you'd prefer.

fejta commented Feb 19, 2020

fejta commented Feb 19, 2020

/assign

listx commented Feb 20, 2020

Linus, do you have opinions about the way you want these jobs to run gcloud auth configure-docker?

It isn't obvious to me that google/cloud-sdk comes preconfigured for docker. The most obvious solution would be to add a layer that adds this file as /root/.docker/config.json:

{
  "credHelpers": {
    "gcr.io": "gcloud",
    "us.gcr.io": "gcloud",
    "eu.gcr.io": "gcloud",
    "asia.gcr.io": "gcloud",
    "staging-k8s.gcr.io": "gcloud",
    "marketplace.gcr.io": "gcloud"
  }
}

I can send a PR to do that, or I can do something else if you'd prefer.

FWIW I already set up a JSON file like that in the Prow jobs. E.g. https://github.com/kubernetes-sigs/k8s-container-image-promoter/blob/90a75d7781a432cd5e9b56446af4d49747ff986b/test-e2e/cip/e2e-entrypoint-from-container.sh#L31-L34

listx commented Mar 10, 2020

The kubernetes/test-infra#16724 PR reverted kubernetes/test-infra#16463, because we saw permissions issues like this:

E0310 19:54:40.826604      11 inventory.go:1621] Request {{0 gcr.io/k8s-staging-cluster-api eu.gcr.io/k8s-artifacts-prod/cluster-api k8s-infra-gcr-promoter@k8s-artifacts-prod.iam.gserviceaccount.com cluster-api-controller cluster-api-controller sha256:b36ed8334a9d95116eb6dfb84eb54e58319fb8a3b2b4d97e014381e6ce144b2e  v0.3.0} <nil>}: error(s) encountered: [{running writeImage() failed to copy index: GET https://eu.gcr.io/v2/token?scope=repository%3Ak8s-artifacts-prod%2Fcluster-api%2Fcluster-api-controller%3Apush%2Cpull&service=eu.gcr.io: UNAUTHORIZED: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication}]

This was most likely caused by the lack of a WI permission binding that needed to be actuated in the k8s-artifacts-prod project for the Prow bot account, like so:

gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:k8s-prow-builds.svc.id.goog[test-pods/k8s-artifacts-prod]" \
  k8s-infra-gcr-promoter@k8s-artifacts-prod.iam.gserviceaccount.com

I need to add this binding into the infra scripts set up here.
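
As a sanity check once the binding exists, the pod's effective identity can be inspected via the standard GKE metadata endpoints (a sketch run from a shell inside the Prow pod; the expected values are assumptions based on the binding above):

# Which Google service account does this pod's Kubernetes service account map to?
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email"
# Expected (assuming the KSA annotation is in place): k8s-infra-gcr-promoter@k8s-artifacts-prod.iam.gserviceaccount.com

# Can the pod actually mint an access token for that account? This is the part
# that fails without the roles/iam.workloadIdentityUser binding.
curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"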

listx commented Mar 11, 2020

Pasting from kubernetes/k8s.io#655 (comment):

I think we have to actuate this first, then un-revert kubernetes/test-infra#16463 and see if the ci-k8sio-cip job (or post-k8sio-cip job) still works. Once all the jobs have migrated to WI, the JSON keys can be revoked.

/cc @thockin

listx commented Mar 24, 2020

Once kubernetes/test-infra#16917 merges I will have a much better understanding of WI, enabling me to port the remaining Prow jobs to WI systematically. The concept is simple but it's a bit of grunt work to get all the details right.

listx commented Mar 25, 2020

Since kubernetes/k8s.io#695 and kubernetes/test-infra#16948, the ci-k8sio-cip job is working!

I have some other more urgent matters to attend to (need to fix up #199), but I now understand the pattern to apply to workload-identity-ize the rest of the jobs.

fejta commented Apr 10, 2020

/unassign
/assign @listx

Awesome!

k8s-ci-robot assigned listx and unassigned fejta Apr 10, 2020
listx added the priority/important-longterm label Jun 4, 2020
fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Sep 2, 2020
listx commented Sep 3, 2020

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label Sep 3, 2020
fejta-bot commented:

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Jan 14, 2021
fejta-bot commented:

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 13, 2021
fejta-bot commented:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

k8s-ci-robot commented:

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
