feat(frontend&backend): Add UI support for object store customization and prefixes #10787
Conversation
I'll take a look at the failing tests.
/test kubeflow-pipeline-frontend-test
/test kubeflow-pipeline-frontend-test
/test kubeflow-pipeline-frontend-test Flaky tests seem to be failing, I think; retrying a couple of times to confirm (I'm unable to reproduce this in my environment). Edit: yes, that seemed to be the case; they are passing now.
@HumairAK feel free to ping me on Slack or join the security meeting if you are interested in fixing the exploit mentioned above.
I am also investigating Apache Ozone as a minio replacement.
@juliusvonkohout yes, this is something we 100% need to fix. I'm currently limited on bandwidth, but if this is not picked up by someone else in a couple of weeks, I can take it on.
Rebased. I've updated the description with detailed testing instructions and also provided pre-built images from this PR to help speed up review.
* If providerInfo is not provided or, if credentials are sourced fromEnv,
* then, if using aws s3 (via provider chain or instance profile), create a
* minio client backed by aws s3 client.
Minio is the default deployment approach. Can we make sure this continues to be the case?
@zijianjoy yes, the above description covers the AWS case. If AWS is not specified, we default to minio when using the default deployment manifests (since that is what is configured in the KFP UI environment); see the description below this paragraph.
/lgtm Approved from the frontend side. CC @chensun for the backend change in this PR.
cc @chensun
/lgtm
/approve
Thanks!
/retest-required
Note that for runlist, the mock error response only returns one valid run list with error set, the other is undefined, so to support multiple runIds the mock error response will need to be adjusted. Signed-off-by: Humair Khan <[email protected]>
...and clean up imports. Signed-off-by: Humair Khan <[email protected]>
Rebased to pick up the new GitHub CI tests.
/test kubeflow-pipeline-e2e-test
@HumairAK: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: chensun. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
@HumairAK: The following tests failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
@zijianjoy / @chensun all e2e tests are also passing and no new code changes have been introduced since the rebase. Can we get a refresh on the "lgtm" and get this merged?
/lgtm
@HumairAK where exactly did you fix #9889? I see only "Because we don't do namespace verification server side, there continues to be a security concern (#9889) where users can dupe the namespace client side and access object stores from another profile namespace. Because we are storing the provider's creds location in mlmd and querying it from the client side, the client can technically dupe this and attempt to read object stores if their secrets are stored elsewhere on the cluster. However, once we resolve #9889 this should no longer be an issue, because any attempt to access resources outside the user's profile namespace should be rejected." which says the opposite.
@juliusvonkohout this PR did not address #9889; in this post I merely made note of it. As it is a separate issue, it is out of scope for this PR (which was already very long). I agree we need to prioritize it, but we should do so as a separate effort.
Description of your changes:
Frontend follow-up to: #10625
Resolves the object storage portion of: #9689
Changes to backend:
For each artifact, add an artifact property in mlmd that records where the artifact is located and how to reach it (i.e. which provider, where the credentials are, etc.). This is the same information that is stored as a context property (introduced in Recursion support issue #1065), but because each artifact is not tied to a specific context (import or cached artifacts, for example), we propagate these to their artifact properties (when newly created within a context).
We assume import artifacts will leverage the environment for their credentials (e.g. via IRSA). This means both the user executor pod and the KFP UI will need these creds in their env (this is true for any situation where we use fromEnv: true). A future change could also add support for matching import artifacts with prefixes defined in the kfp-launcher config.
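As a rough illustration of the fromEnv case only, the sketch below shows one way credentials could be supplied to the KFP UI pod through its environment. The deployment/container names and the secret name are assumptions about a typical install; with IRSA the pod would instead receive credentials through its service account and no env vars would be needed.

```yaml
# Hypothetical sketch: supplying object store credentials to the KFP UI pod via
# standard AWS SDK environment variables (the fromEnv: true case). Deployment,
# container, and secret names are assumptions; with IRSA this is unnecessary
# because the SDK provider chain resolves credentials on its own.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-pipeline-ui          # assumed UI deployment name
  namespace: kubeflow
spec:
  template:
    spec:
      containers:
        - name: ml-pipeline-ui
          env:
            - name: AWS_REGION
              value: us-east-1
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: object-store-creds   # hypothetical secret
                  key: accesskey
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: object-store-creds
                  key: secretkey
```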
Additionally, this PR adds UI support for the changes introduced in #10625. Users will now be able to add to their kfp-launcher configmaps the locations of their S3 buckets based on the provided pipeline root, either via the default pipeline root or one specified during pipeline execution. When using "fromEnv: true", the onus is on the user, as well as the admin that manages KFP, to ensure that the user pod and the KFP UI have these credentials available. The following updates to the UI went in here:
Changes to client code:
Changes to the frontend server:
If fromEnv=false, then we require the session info to include the location of the credentials (k8s secret) and utilize those. If a namespace is not provided, we reject the request.
Notes:
The credential secrets are required to be in the user profile namespace, though this seems to have already been the case before ([1], [2], [3]).
Because we don't do namespace verification server side, there continues to be a security concern where users can dupe the namespace client side and access object stores from another profile namespace. Because we are storing the provider's creds location in mlmd and querying it from the client side, the client can technically dupe this and attempt to read object stores if their secrets are stored elsewhere on the cluster. However, once we resolve [backend] Security exploit in mlpipeline-UI #9889, this should no longer be an issue, because any attempt to access resources outside the user's profile namespace should be rejected.
With some additional changes, we can allow users to set a different default object store and resolve [feature] Allow manifests to set external Object Store to deploy Kubeflow Pipelines #10651. The additional change will require KFP to read a default kfp-launcher config from the kubeflow namespace (or wherever KFP is deployed).
Once accepted, I will follow up with documentation on this.
Testing Instructions
To test this PR you will need to:
You can also use these images here:
quay.io/hukhan/kfp-frontend:issue-10787
quay.io/hukhan/kfp-launcher:issue-10787
quay.io/hukhan/kfp-driver:issue-10787
Frontend
Substitute this image in the frontend deployment: quay.io/hukhan/kfp-frontend:issue-10787
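For example, a kustomize image override along these lines could be used (the original image name gcr.io/ml-pipeline/frontend is an assumption based on the default manifests):

```yaml
# kustomization.yaml fragment (sketch): swap the frontend image for the PR build.
# The `name` field assumes the image used by the upstream manifests.
images:
  - name: gcr.io/ml-pipeline/frontend
    newName: quay.io/hukhan/kfp-frontend
    newTag: issue-10787
```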
Driver/launcher:
In the KFP API Server deployment, add the environment variables:
V2_LAUNCHER_IMAGE=quay.io/hukhan/kfp-launcher:issue-10787
V2_DRIVER_IMAGE=quay.io/hukhan/kfp-driver:issue-10787
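A sketch of what that could look like as a strategic merge patch on the API server deployment (the deployment and container names are assumptions about the default manifests; the env var names come from the instructions above):

```yaml
# Hypothetical patch fragment for the KFP API server deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-pipeline                  # assumed API server deployment name
  namespace: kubeflow
spec:
  template:
    spec:
      containers:
        - name: ml-pipeline-api-server   # assumed container name
          env:
            - name: V2_LAUNCHER_IMAGE
              value: quay.io/hukhan/kfp-launcher:issue-10787
            - name: V2_DRIVER_IMAGE
              value: quay.io/hukhan/kfp-driver:issue-10787
```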
Deploy KFP; I recommend using a full KFP deployment so you can test with a multi-user setup.
Create a kfp-launcher config in the user namespace that lists your S3/GCS/Minio credentials; for the formatting, see the instructions in this PR.
Any secret you mention in kfp-launcher also needs to be added to the user's profile namespace (a rough sketch of both is shown below).
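A minimal sketch of the shape this can take follows. It is illustrative only: the authoritative format for the providers block is described in #10625 and in this PR's instructions, and the key names under providers, as well as the bucket and secret names, are assumptions.

```yaml
# Illustrative only -- the authoritative providers schema is defined in #10625;
# key names under "providers" and the bucket/secret names are assumptions.
apiVersion: v1
kind: ConfigMap
metadata:
  name: kfp-launcher
  namespace: my-profile              # the user's profile namespace
data:
  defaultPipelineRoot: s3://my-bucket/my/prefix
  providers: |
    s3:
      default:
        endpoint: s3.amazonaws.com
        region: us-east-1
        credentials:
          fromEnv: false
          secretRef:
            secretName: my-s3-creds
            accessKeyKey: accesskey
            secretKeyKey: secretkey
---
# The referenced secret must live in the same profile namespace.
apiVersion: v1
kind: Secret
metadata:
  name: my-s3-creds
  namespace: my-profile
type: Opaque
stringData:
  accesskey: REPLACE_WITH_ACCESS_KEY
  secretkey: REPLACE_WITH_SECRET_KEY
```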
Upload a pipeline that outputs artifacts
Some pipelines to test with:
Run a pipeline and select a pipeline root that matches one from the kfp-launcher config (once again, the kfp-launcher config and the secret to utilize should be in the user's profile namespace).
Test their visuals in the UI; key areas to test:
Checklist: