Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kops - Don't use computeMetadata for retrieving public IP #16388

Merged
merged 1 commit into from
Feb 20, 2020

Conversation

rifelpet
Copy link
Member

@rifelpet rifelpet commented Feb 20, 2020

Since workload identity was enabled on the e2e clusters, presubmit jobs have been failing because the computeMetadata endpoints are no longer available.

We were relying on it to get our public ip, falling back to ifconfig.co.
The fallback logic wasnt working though: the response body isn't empty, its Not Found.
I'm moving the logic from kops-e2e-runner.sh, which our periodic jobs have already been migrated away from, to kubetest and only using ifconfig.co. This way both our presubmit and periodic e2e jobs use the same logic for restricting access to the clusters. And it gets us one step closer to moving off of kops-e2e-runner.sh entirely :)

I'm not a big fan of relying on this third party service for such a simple use case, so if anyone has better ideas on how to get our public IP any feedback would be appreciated.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 20, 2020
@k8s-ci-robot k8s-ci-robot added area/images area/kubetest sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Feb 20, 2020
@rifelpet
Copy link
Member Author

/assign @BenTheElder

@rifelpet
Copy link
Member Author

/cc @fejta

@BenTheElder
Copy link
Member

/approve
/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Feb 20, 2020
} else {
var b bytes.Buffer
if err := httpRead("https://v4.ifconfig.co", &b); err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps log something about this?

Copy link
Member

@BenTheElder BenTheElder Feb 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er about using this value to set admin access (commented on the wrong line)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done 👍

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, rifelpet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 20, 2020
Since workload identity was enabled on the e2e clusters, jobs have been failing because the computeMetadata endpoints are no longer available.

We were relying on it to get our public ip, falling back to ifconfig.co.
The fallback logic wasnt working though, so I'm moving the logic from kops-e2e-runner.sh (which our periodic jobs have been migrated away from) to kubetest, and only using ifconfig.co.

I'm not a big fan of relying on this third party service for such a simple use case, so if anyone has better ideas on how to get our public IP any feedback would be appreciated.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 20, 2020
@hakman
Copy link
Member

hakman commented Feb 20, 2020

/lgtm
/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 20, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 20, 2020
@k8s-ci-robot k8s-ci-robot merged commit 1a31b6a into kubernetes:master Feb 20, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Feb 20, 2020
@@ -59,16 +59,6 @@ if [[ "${KOPS_DEPLOY_LATEST_KUBE:-}" =~ ^[yY]$ ]]; then
e2e_args+=(--kops-kubernetes-version="${KOPS_KUBE_RELEASE_URL}/${KOPS_KUBE_LATEST}")
fi

EXTERNAL_IP=$(curl -SsL -H 'Metadata-Flavor: Google' 'http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/access-configs/0/external-ip' || true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively add a service account to this job

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does have a service account preset defined here and this log seems to indicate it is being used. Is that not sufficient?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does adding a preset-service-account: "true" label to the pod do here?

By service account I mean the pod needs a kubernetes service account. That will look something like this:

We declare the service accounts that exist for jobs in these two locations, depending on the cluster they run in:

# TODO(fejta): better define these rules and enforce them in presubmit.
iam.gke.io/gcp-service-account: k8s-infra-gcr-promoter@k8s-gcr-audit-test-prod.iam.gserviceaccount.com
name: k8s-gcr-audit-test-prod

annotations:
iam.gke.io/gcp-service-account: [email protected]
name: deployer

The annotation controls which GCP service account metadata server (and thus the k8s service account) authenticates as

If the pod does not define a service account (default behavior) identity for its containers then GKE does not provide those container with a metadata server.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log is showing that the job has explicit credentials in a secret file passed to the job, which it uses for authentication (the secret contents of that file authenticates the user, the metadata server is not involved here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok, so this service account is separate from the serviceAccountName of the pod. That makes sense.

So we would be expected to create a new ServiceAccount resource in build_serviceaccounts.yaml correct? Which gcp-service-account would I want to use? Clearly this doesn't actually require GCP permissions, just access to the metadata server itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use [email protected], which is what the secrets/service-account file authenticates as. Maybe add a kops service account to that build_serviceaccounts.yaml file?

If you want to do this (which would be fantastic) then send me the PR. (Once it goes it you should be able to remove the preset-service-account: true label from your pods and not have to deal with the secret.json file any more too)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/images area/kubetest cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants