Restrict privilege of Kubeflow services accounts such as tf-job-operator to namespace level #1213
Comments
We'll also need to restrict the operators to only claim resources in specific namespaces. Not sure whether that's possible today.
Do we want one operator that handles all TFJobs and creates pods in the specified namespace, or one operator per namespace?
@gaocegege We are in a situation where (1) cluster admins don't own the Kubeflow deployment and (2) only one cluster tenant needs to run TF jobs. So there is no need for an omnipotent operator that owns all TFJobs across namespaces. In the future it may be desirable for cluster admins to take ownership of the TF operator, to consolidate Kubeflow into cluster infrastructure. That would, however, require cluster admins to take on additional workload and training.
After kubeflow/training-operator#789 tf-operator can handle two cases
There is the remaining case where we want the TFJob operator to handle a subset of namespaces. /cc @johnugeorge
@ankushagarwal Now that kubeflow/training-operator#789 is submitted, what are the next steps? Should we add options to our ksonnet prototypes to allow scoping of service accounts?
This is done. I created a new image for the tf operator and updated the ksonnet prototypes for tf-job to scope the operator to a namespace.
Thank you Jeremy and Ankush!
I will add this to the pytorch operator after the initial structure is added.
I used grep to search for all the ClusterRoles. There are quite a few others defined. Here are the ones I think are most important (as opposed to optional components)
I don't think we will be able to get this fixed in 0.3; 0.3 is already oversubscribed, so I don't think there's any room for new items. Additionally, I think we need a way to make it easy for users to scope Kubeflow to a particular namespace when it's installed. We could then use this to create an E2E test for Kubeflow scoped to a namespace. We might also want to configure Kubeflow so that users work in a namespace that's different from the one where Kubeflow is installed. Complete list
@jlewi Thank you Jeremy for the follow-up. I didn't realize re-scoping the operators could incur so much work. We also tried tweaking the configs on our side to use RoleBinding instead of ClusterRoleBinding, but also arrived at a dead end. I think I may have raised a bad feature request that goes against Kubernetes' operator design pattern. Our project requirements have recently changed. We are now exploring the single-tenant option as requested by the client, so our need to narrow the privilege scope of Kubeflow operators has diminished. Since this issue is no longer blocking, I think we can reprioritize or close it if no other party is requesting this feature. Really sorry for the trouble caused.
Hello Guys!
We are trying to set up generic Kubernetes clusters on bare-metal machines. Our cluster serves multiple other teams that are separated by K8s namespaces. One of the teams wants to use Kubeflow for TF model training, but as we were installing Kubeflow on the cluster, we discovered that some service accounts, such as tf-job-operator, request cluster-level access to most resources. For fear of compromising cluster security, we stalled the installation.
Can we limit the privilege of Kubeflow SAs to the namespace level? We cluster admins can help the namespace owners set up Kubeflow once. Afterwards we will hand off to the namespace owners for their day-to-day operations. As long as Kubeflow doesn't affect other namespaces, we are good.
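To make the request concrete, here is a minimal sketch of what namespace-level scoping could look like: a Role and RoleBinding (rather than a ClusterRole and ClusterRoleBinding) for the tf-job-operator service account. The namespace `team-a`, the helper `namespaced_rbac`, and the rule list are all hypothetical and are not taken from the Kubeflow manifests; the permissions the operator really needs would have to be checked against its actual ClusterRole.

```python
import json


def namespaced_rbac(namespace, service_account, rules):
    """Build a Role and a RoleBinding that are both confined to one namespace.

    Hypothetical helper for illustration; not part of Kubeflow.
    """
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",  # namespaced, unlike ClusterRole
        "metadata": {"name": service_account, "namespace": namespace},
        "rules": rules,
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",  # namespaced, unlike ClusterRoleBinding
        "metadata": {"name": service_account, "namespace": namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": service_account,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": service_account,
                "namespace": namespace,
            }
        ],
    }
    return role, binding


# Assumed rule set, for illustration only.
EXAMPLE_RULES = [
    {
        "apiGroups": ["kubeflow.org"],
        "resources": ["tfjobs"],
        "verbs": ["get", "list", "watch", "update"],
    },
    {
        "apiGroups": [""],
        "resources": ["pods", "services", "events"],
        "verbs": ["get", "list", "watch", "create", "delete"],
    },
]

role, binding = namespaced_rbac("team-a", "tf-job-operator", EXAMPLE_RULES)

# Wrap both objects in a v1 List so they can be applied as a single JSON file.
manifest = {"apiVersion": "v1", "kind": "List", "items": [role, binding]}
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and passed to `kubectl apply -f`. Swapping the binding kind alone is probably not sufficient: the operator also has to be configured to watch only that namespace, and CustomResourceDefinitions are cluster-scoped objects, so a cluster admin would still need to install the TFJob CRD once.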