[gcp-deployer] Setup oauth2 id & secret; insert service account keys via backend service #1255

kunmingg · 2018-07-20T23:00:44Z

[gcp-deployer]
The web app will integrate with a Go backend service.
All future API calls dealing with GCP/k8s resource management should go through the backend.
Backend service updates will come in separate PRs.

kunmingg · 2018-07-20T23:02:26Z

/hold

jlewi · 2018-07-21T00:02:19Z

/lgtm
/assign @yebrahim

Looks good to me, but let's wait for Yasser.

@kunmingg Why is there a hold on the PR?

yebrahim

Thanks for taking care of this. I do have a few comments though, especially about when we can run the code that gets the cluster endpoint.

yebrahim · 2018-07-21T04:08:54Z

components/gcp-click-to-deploy/src/DeployForm.tsx

+    const token = await Gapi.getToken();
+    if (!token) {
+      this.setState({
+          error: 'You are not signed in',


At this point the user must be signed in, and some other error must have occurred, so we should just say that, something like: "Error occurred getting Google ID token." This error shouldn't happen though.

yebrahim · 2018-07-21T04:11:06Z

components/gcp-click-to-deploy/src/DeployForm.tsx

@@ -402,8 +404,43 @@ export default class DeployForm extends React.Component<any, DeployFormState> {
        });
      });

+    // Step 4: In cluster resource set up
+    const endpoint = Gapi.getClusterEndpoint(project, this.state.zone, deploymentName);
+    k8s.Config.defaultClient();


I don't think you need this line.

yebrahim · 2018-07-21T04:20:22Z

components/gcp-click-to-deploy/src/DeployForm.tsx

@@ -412,6 +449,8 @@ export default class DeployForm extends React.Component<any, DeployFormState> {

    const servicesToEnable = new Set([
      'deploymentmanager.googleapis.com',
+      'servicemanagement.googleapis.com',


I had this in the list initially, but I never saw it when listing services for the project, which is what I'm using to decide if a service is enabled, so the code ends up retrying to enable this service over and over until it quits. Are you sure this would work?

Are we keeping this? I think it'll cause the service enable API to keep looping forever.
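
One way to avoid the endless loop described above is to cap the number of enable attempts. The following is a minimal sketch, not the app's actual code: `enableService` and `listEnabledServices` are hypothetical callbacks standing in for the real API calls, and `maxAttempts` is an assumed parameter.

```typescript
// Hypothetical callbacks; the real app would call the GCP service APIs here.
type EnableFn = (service: string) => Promise<void>;
type ListFn = () => Promise<Set<string>>;

// Tries to enable each wanted service, then re-checks the enabled list.
// Gives up after maxAttempts passes and returns whatever is still missing,
// so a service that never shows up in the listing cannot loop forever.
async function enableServicesBounded(
  wanted: string[],
  enableService: EnableFn,
  listEnabledServices: ListFn,
  maxAttempts = 3,
): Promise<string[]> {
  let missing = wanted;
  for (let attempt = 0; attempt < maxAttempts && missing.length > 0; attempt++) {
    for (const service of missing) {
      await enableService(service);
    }
    const enabled = await listEnabledServices();
    missing = missing.filter(s => !enabled.has(s));
  }
  return missing;
}
```

The caller can then decide whether a non-empty result is a hard error or just a warning for services that do not appear in the project's listing.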

yebrahim · 2018-07-21T04:22:12Z

components/gcp-click-to-deploy/src/Gapi.ts

+            name: `projects/${projectId}/serviceAccounts/${serviceAccounts}`
+        }).then(response => response.result,
+                badResult => {
+          throw new Error('Errors creating key for serviceAccount ${serviceAccounts}');


Let's add the error here (even by JSON stringifying it) in the error message, so that the user at least has an idea what might've gone wrong, even if it's a big ugly error object.
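
A sketch of what this could look like, assuming a hypothetical helper named `describeError` (not part of the PR). It appends the underlying error to the context message and falls back to `String()` when the value cannot be serialized.

```typescript
// Builds a readable message from an arbitrary rejected value.
// describeError is a hypothetical helper, not a function from the PR.
function describeError(context: string, err: unknown): string {
  let detail: string;
  if (err instanceof Error) {
    detail = err.message;
  } else {
    try {
      // JSON.stringify returns undefined for values such as undefined itself.
      const json = JSON.stringify(err);
      detail = json === undefined ? String(err) : json;
    } catch {
      // JSON.stringify throws on circular structures; fall back to String().
      detail = String(err);
    }
  }
  return `${context}: ${detail}`;
}

// Possible use inside the rejection handler (template literal, so the
// service account name is actually interpolated):
//   throw new Error(describeError(
//       `Error creating key for serviceAccount ${serviceAccounts}`, badResult));
```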

yebrahim · 2018-07-21T04:23:42Z

components/gcp-click-to-deploy/src/Gapi.ts

@@ -47,6 +47,26 @@ export default class Gapi {

  }

+  public static iam = class {
+
+    public static async createKey(projectId: string, serviceAccounts: string) {


nit: 2 space indentation all over.

yebrahim · 2018-07-21T04:24:25Z

components/gcp-click-to-deploy/src/Gapi.ts

+      return user ? user.getAuthResponse().id_token : null;
+  }
+
+    public static async getClusterEndpoint(projectId: string, zone: string, clusterId: string) {


indentation.

yebrahim · 2018-07-21T04:25:35Z

components/gcp-click-to-deploy/src/Gapi.ts

+            path: `https://container.googleapis.com/v1/projects/${projectId}/zones/${zone}/clusters/${clusterId}`
+        }).then(response => (response.result as any).endpoint,
+            badResult => {
+                throw new Error('Errors getting cluster endpoint: ' + flattenDeploymentOperationError(badResult.result));


I'm not sure this method would work on this error object structure, if you can verify that, we can use this method but let's rename it, maybe "flattenResponseError" or something.

yebrahim · 2018-07-21T04:27:48Z

components/gcp-click-to-deploy/src/DeployForm.tsx

@@ -402,8 +404,43 @@ export default class DeployForm extends React.Component<any, DeployFormState> {
        });
      });

+    // Step 4: In cluster resource set up
+    const endpoint = Gapi.getClusterEndpoint(project, this.state.zone, deploymentName);


I don't think this will work here. Cluster creation + kubeflow deployment will take several minutes, so we can't just issue this call right away, we'll need some experience for waiting.
If this code is here as a placeholder, can we either comment it out or gate it on a condition? I can work with it as I'm editing the waiting experience.

kunmingg · 2018-07-26T22:52:41Z

@yebrahim @jlewi
Thanks for reviewing. I tested locally and found one blocker:
Chrome won't allow requests to an address with a self-signed cert (like our k8s cluster).
I haven't found a neat workaround...

How about giving the master IP a DNS name through Cloud Endpoints?
Thoughts?

jlewi · 2018-07-27T05:17:21Z

@kunmingg That won't work; we need the secret in order to be able to create the endpoint and get the signed certificate. In Chrome, if you try to navigate to a page with a self-signed cert, it will give a warning and you can click through to accept it. Is it not possible to do that in JS code?

jlewi · 2018-07-27T05:19:23Z

@kunmingg @yebrahim Could we prompt the user to install the self signed cert for the K8s master in their browser?

kunmingg · 2018-07-27T18:17:29Z

Found a workaround, but it needs another service:
#1276

yebrahim · 2018-07-27T19:51:14Z

What kind of auth is currently used for the kubeflow cluster by default? If it's just basic username/password auth, then you don't want to open CORS for anyone to talk to the k8s master.

yebrahim · 2018-07-27T20:02:06Z

@jlewi I don't know if that's doable; I've never come across this before. This seems like just a CORS issue though? If we add localhost and kubeflow.org to the CORS whitelist origins, and use a Bearer token for auth, should this take care of it?

kunmingg · 2018-07-27T20:45:50Z

@yebrahim
To add our hosts to CORS whitelist origins, I plan to host a CORS proxy on kubeflow.org (#1276).
(or let user use chrome extension)
Sounds good?

yebrahim · 2018-07-27T22:36:15Z

If you're planning to add the hosts to the origins whitelist, why do you need the proxy? The browser should just be able to talk directly to the k8s API server.

jlewi · 2018-07-27T22:36:45Z

A proxy is highly undesirable. We don't want users to have to provide their credentials to a backend service they don't control.

@yebrahim We are connecting to the K8s master e.g.
https://XX.XXX.XXX

And we authenticate using a bearer token in the Authorization header. So it's not localhost; it's the IP of the K8s master.
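
For illustration, the bearer-token request described here could be assembled as follows. This is a sketch only: `buildK8sRequest` is a hypothetical helper, and it does nothing about the CORS or self-signed certificate problems discussed in this thread, which the browser would still enforce.

```typescript
// Builds request options carrying a bearer token in the Authorization header.
// buildK8sRequest is a hypothetical name, not code from the PR.
function buildK8sRequest(token: string): { method: string; headers: Record<string, string> } {
  return {
    method: 'GET',
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: 'application/json',
    },
  };
}

// Possible use from the browser, where masterIp is the K8s master address:
//   fetch(`https://${masterIp}/api/v1/namespaces`, buildK8sRequest(token));
```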

kunmingg · 2018-07-27T22:54:13Z

@yebrahim
We don't own the k8s API server, so we cannot edit its config directly.

yebrahim · 2018-07-27T23:30:53Z

@kunmingg I was under the assumption you can specify the origin whitelist while deploying the k8s cluster, is that not correct?

kunmingg · 2018-07-27T23:47:21Z

@yebrahim
I think on GKE we do not have access to the k8s master config.

yebrahim · 2018-07-28T16:44:18Z

If this is the case, that's a huge bummer!
We should make sure of this before we write a server component to proxy requests. From brief googling, I can see there is a CORS annotation for nginx ingress controller, but I don't know if kubeflow is using that.

Otherwise, seems like a server piece is unavoidable, @jlewi. The API server just won't accept requests coming from the browser without the CORS annotation.

jlewi · 2018-07-29T23:54:04Z

@kunmingg @yebrahim My suggestion is to not block this PR on the CORS issue. I believe it's possible to disable CORS in the browser using various extensions.

I think here's one for chrome
https://chrome.google.com/webstore/detail/allow-control-allow-origi/nlfbmbojpeacfghkpbjhddihlkkiljbi?hl=en

So my suggestion is that we use such extensions as a workaround for now so we can continue to make progress.

Thoughts?

kunmingg · 2018-07-30T03:52:51Z

@jlewi @yebrahim
I can deploy k8s components through Deployment Manager,
though we might have to deploy multiple times per click.

In the future, if we change the bootstrapper server to a k8s operator, we can deploy all components without accessing the k8s API directly.

jlewi · 2018-07-30T12:33:08Z

Which K8s resources? I think using DM to create a single resource e.g. the bootstrapper/click to deploy APP might be ok.

Overall though, I think using DM to create K8s resources leads to bad UX because it's unclear whether a user who wants to update those resources should do so using DM or by talking directly to the K8s ApiServer.

yebrahim

LGTM overall, just one or two critical things, the rest is nits.

yebrahim · 2018-08-01T04:02:24Z

components/gcp-click-to-deploy/src/DeployForm.tsx

+              this._appendLine("Cluster endpoint not available yet.")
+            });
+      if (!curStatus) {
+        wait(getTimeout);


You need to await this.
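
A minimal sketch of the polling pattern being discussed, assuming `wait` returns a promise and using a hypothetical `getEndpoint` callback in place of a call like `Gapi.getClusterEndpoint`. The interval and attempt limit are assumed parameters.

```typescript
// Resolves after ms milliseconds.
function wait(ms: number): Promise<void> {
  return new Promise<void>(resolve => setTimeout(resolve, ms));
}

// Polls getEndpoint until it yields a value or the attempts run out.
// getEndpoint is a hypothetical stand-in for the real endpoint lookup.
async function pollForEndpoint(
  getEndpoint: () => Promise<string | null>,
  intervalMs: number,
  maxAttempts: number,
): Promise<string | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // Treat a rejected lookup as "not available yet".
    const endpoint = await getEndpoint().catch(() => null);
    if (endpoint) {
      return endpoint;
    }
    // Without `await`, the loop would run through every attempt immediately.
    await wait(intervalMs);
  }
  return null;
}
```

Since cluster creation takes several minutes, a real caller would likely use an interval of a few seconds and a generous attempt limit.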

yebrahim · 2018-08-01T04:04:39Z

components/gcp-click-to-deploy/src/DeployForm.tsx

-        this._generateServiceAccountSecret(project, deploymentName, 'admin'));
-    k8sApi.createNamespacedSecret('kubeflow',
-        this._generateServiceAccountSecret(project, deploymentName, 'user'));
+    const saKey = await Gapi.iam.createKey(


nit: use a more expressive name? maybe serviceAccountKey?

yebrahim · 2018-08-01T04:05:36Z

components/gcp-click-to-deploy/src/DeployForm.tsx

-  }
+    kubeflowUtil.properties.clusterType = this.state.deploymentName + '-type';
+    kubeflowUtil.properties.saKey = saKey.privateKeyData;
+    kubeflowUtil.properties.clientId = window.btoa(this.state.clientId);


nit: you don't need the window. bit, you can call btoa directly.

yebrahim · 2018-08-01T04:08:24Z

components/gcp-click-to-deploy/src/Gapi.ts

+      return gapi.client.request({
+          method: 'POST',
+          path: `https://iam.googleapis.com/v1/projects/${projectId}/serviceAccounts/${serviceAccounts}/keys`,
+        }).then(response => {


nit: you can simplify this to one line: .then(response => response.result as CreateKeyResponse)
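
As a sketch of the suggested one-liner, with the request abstracted behind a hypothetical `request` callback so the snippet stands alone. The `CreateKeyResponse` shape shown here is assumed from the `privateKeyData` field used elsewhere in this PR.

```typescript
// Assumed shape, based on saKey.privateKeyData in DeployForm.tsx.
interface CreateKeyResponse {
  privateKeyData: string;
}

// Minimal stand-in for the response object returned by the API client.
interface ApiResponse {
  result: unknown;
}

// The success handler is a single expression instead of a block with `return`.
function createKey(request: () => Promise<ApiResponse>): Promise<CreateKeyResponse> {
  return request().then(response => response.result as CreateKeyResponse);
}
```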

yebrahim · 2018-08-01T04:09:21Z

components/gcp-click-to-deploy/src/Gapi.ts

+                return response.result as CreateKeyResponse;
+            },
+          badResult => {
+          throw new Error('Errors creating key for serviceAccount ${serviceAccounts}: '


Use backticks "`" for string formatting.

jlewi · 2018-08-01T13:58:04Z

components/gcp-click-to-deploy/src/configs/cluster-kubeflow-util.jinja

+      name: admin-gcp-sa
+    type: Opaque
+    data:
+      admin-gcp-sa.json: {{ properties["saKey"] }}


This is a smart idea; using DM to insert the service account key into the cluster. Unfortunately this means the secret key would be stored in DM and viewable by anyone with access to the project. I don't think we want that.

An alternative approach would be to just use the service account as the service account for the GKE nodes.
The downside of that approach would be that it means all pods on that node would run with elevated privileges which would be undesirable.

It's possible that we could use RBAC to restrict what can run on a specific node pool so that other processes couldn't run on that node pool.

Currently our code assumes those keys will be stored in the k8s cluster, so anyone with project access can view them from the cluster secrets?

Another option: maybe we can expire those keys once we finish deploying?
Then users would need to rely on their own credentials to access GCS / GCR / BigQuery, which should be better for access control.

By using the VM service account for nodes, we should be able to avoid exposing secrets.
Handled in a separate PR: #1302

I think putting them in DM might be a greater risk because K8s secrets are namespace-scoped resources, so not everyone might have access to the namespace. I'm not even sure project viewers do.

jlewi · 2018-08-02T01:48:21Z

@kunmingg it looks like this PR is doing two things

Inserting Service accounts via DM
Adding OAuthClientID and OAuthSecret for IAP

Per our discussion today I think the plan is to have a go backend. So I think we can just have the go backend create the secrets and insert them in the cluster. The go backend can talk directly to the K8s master.

kunmingg · 2018-08-03T23:09:24Z

Update:
Per our discussion, we will use the backend to deal with k8s resource management.
Separate PRs will take care of the backend part.

kunmingg · 2018-08-04T00:14:13Z

/retest

add client id & secret fields; manage k8s resource through deployment manager;

From now on web app create k8s resources through backend service.

jlewi · 2018-08-05T20:59:25Z

components/gcp-click-to-deploy/src/DeployForm.tsx

@@ -25,6 +26,9 @@ import clusterJinjaPath from './configs/cluster.jinja';
 // selects auto domain then we should automatically supply the suffix
 // <hostname>.endpoints.<Project>.cloud.goog

+// Assume user access app via kubectl proxy
+const BackendAddress = 'http://127.0.0.1:8001/api/v1/namespaces/default/services/kubeflow-controller:8080/proxy/';


Why do we need to make this assumption? Can we parameterize this based on how the user accessed the click to deploy app?

What if we had our go backend serve the click to deploy app? i.e. can we just add a handler to our go server that returns the javascript click to deploy app?

Will make it a parameter in the YAML spec.
Serving from the Go side might be even better. I have other PRs for the backend change, so let's pick it up there later.
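
As one illustration of parameterizing the address, the sketch below reads it from a query parameter and falls back to the kubectl-proxy URL from the diff. This is an assumption for illustration only: the thread settles on a YAML parameter, and both `resolveBackendAddress` and the `backend` query parameter are hypothetical.

```typescript
// Default mirrors the hard-coded kubectl-proxy address from the diff.
const DEFAULT_BACKEND =
  'http://127.0.0.1:8001/api/v1/namespaces/default/services/kubeflow-controller:8080/proxy/';

// Picks the backend address from a hypothetical `backend` query parameter on
// the page URL, falling back to the default when the parameter is absent.
function resolveBackendAddress(pageUrl: string, fallback: string = DEFAULT_BACKEND): string {
  const param = new URL(pageUrl).searchParams.get('backend');
  return param ? param : fallback;
}
```

In the browser this could be called as `resolveBackendAddress(window.location.href)`. If the Go backend ends up serving the app itself, a relative path would remove the need for any parameter.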

jlewi · 2018-08-05T20:59:50Z

Thanks. Can you update the PR description as well please?

jlewi

Reviewable status: 0 of 6 files reviewed, 20 unresolved discussions (waiting on @yebrahim, @jlewi, @kunmingg, @gaocegege, and @ankushagarwal)

components/gcp-click-to-deploy/kf_app.yaml, line 1 at r3 (raw file):

apiVersion: v1

Add a comment explaining what this manifest is for.

components/gcp-click-to-deploy/kf_app.yaml, line 40 at r3 (raw file):

        image: gcr.io/kubeflow-images-public/bootstrapper:latest
        workingDir: /opt/bootstrap
        command: [ "/opt/kubeflow/bootstrapper"]

nit: put args into command; command should just be a list containing the binary to run followed by the command line arguments.

components/gcp-click-to-deploy/kf_app.yaml, line 44 at r3 (raw file):

          "--in-cluster",
          "--namespace=kubeflow",
          "--apply",

What is the --apply argument doing? Shouldn't we be invoking it in daemon mode?

components/gcp-click-to-deploy/src/DeployForm.tsx, line 408 at r1 (raw file):

Previously, yebrahim (Yasser Elsayed) wrote…

I don't think this will work here. Cluster creation + kubeflow deployment will take several minutes, so we can't just issue this call right away, we'll need some experience for waiting.
If this code is here as a placeholder, can we either comment it out or gate it on a condition? I can work with it as I'm editing the waiting experience.

Is this fixed?

components/gcp-click-to-deploy/src/DeployForm.tsx, line 30 at r3 (raw file):

interface DeployFormState {

Using an argument and specifying the YAML seems good to me.

jlewi · 2018-08-06T20:38:51Z

Per our discussion I think all the logic for deploying things should now move into the go backend; but we can do that in a follow on PR.

… yaml per user's need

jlewi · 2018-08-06T23:40:52Z

/lgtm
/approve

k8s-ci-robot · 2018-08-06T23:40:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jlewi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kunmingg · 2018-08-07T17:49:42Z

/hold cancel

…via backend service (kubeflow#1255) * insert service account keys to GKE cluster * handle review feedbacks; add client id & secret fields; manage k8s resource through deployment manager; * Refactor: From now on web app create k8s resources through backend service. * rebase * patch yaml spec missing pieces, make backend URL configurable through yaml per user's need

kunmingg requested review from jlewi and yebrahim July 20, 2018 23:00

k8s-ci-robot requested review from ankushagarwal and gaocegege July 20, 2018 23:00

k8s-ci-robot added the size/M label Jul 20, 2018

kunmingg changed the title ~~insert service account keys to GKE cluster~~ [gcp-deployer] insert service account keys to GKE cluster Jul 20, 2018

k8s-ci-robot added the do-not-merge/hold label Jul 20, 2018

k8s-ci-robot assigned yebrahim and jlewi Jul 21, 2018

k8s-ci-robot added the lgtm label Jul 21, 2018

yebrahim suggested changes Jul 21, 2018

k8s-ci-robot removed the lgtm label Jul 31, 2018

yebrahim reviewed Aug 1, 2018

jlewi reviewed Aug 1, 2018

kunmingg mentioned this pull request Aug 1, 2018

allow IAP setup using VM service account when no service account was set #1302

Closed

kunmingg changed the title ~~[gcp-deployer] insert service account keys to GKE cluster~~ [gcp-deployer] Setup oauth2 id & secret; insert service account keys via backend service Aug 3, 2018

kunmingg added 4 commits August 3, 2018 17:39

insert service account keys to GKE cluster

ed69f23

handle review feedbacks;

e73025b

add client id & secret fields; manage k8s resource through deployment manager;

Refactor:

919b511

From now on web app create k8s resources through backend service.

rebase

aa69203

kunmingg force-pushed the web-app branch from 9febf3b to aa69203 Compare August 4, 2018 00:53

jlewi reviewed Aug 5, 2018

jlewi suggested changes Aug 6, 2018

patch yaml spec missing pieces, make backend URL configurable through…

e81d7c2

… yaml per user's need

googlebot added the cla: yes label Aug 6, 2018

k8s-ci-robot added the lgtm label Aug 6, 2018

k8s-ci-robot added the approved label Aug 6, 2018

jlewi approved these changes Aug 6, 2018

ankushagarwal removed their request for review August 7, 2018 05:16

k8s-ci-robot removed the do-not-merge/hold label Aug 7, 2018

k8s-ci-robot merged commit 239dcc2 into kubeflow:master Aug 7, 2018

kunmingg mentioned this pull request Aug 8, 2018

[gcp-deployer] implement go backend logic handling request from deploy/mangement web app. #1328

Closed

surajkota pushed a commit to surajkota/kubeflow that referenced this pull request Jun 13, 2022

Fixes local tests (kubeflow#1255)

5b7c89b
