[Feature] Ray restricted podsecuritystandards for enterprise security and Kubeflow integration #750
This documentation looks very good. Two things from are missing so far.
We also checked that KubeRay properly has its own ServiceAccount: https://github.com/ray-project/kuberay/tree/master/ray-operator/config/rbac
cc @DmitriGekhtman @juliusvonkohout I updated the PR. Can you help me review this PR? Thank you!
I've exposed the (optional) autoscaler sidecar's security context configuration in this PR: In a separate PR, we should also tweak the Ray cluster Helm chart to expose Ray containers' securityContext fields.
It looks good to me.
Makes sense and looks good to me.
Looks good so far. I will try to test it tomorrow. One thing you should probably test is installing additional Python dependencies in the head and workers at runtime, just to be sure that the file system permissions are all right and that you can really roll this out to all users.
@juliusvonkohout could you clarify how the dependencies should be installed and what behavior you would expect?
I would just like to see pip install tested as described here: https://docs.ray.io/en/latest/ray-core/handling-dependencies.html
Got it, we can verify that a runtime env with a pip dependency works!
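A minimal sketch of what such a check could look like, assuming it runs somewhere a Ray cluster is reachable (for example inside the head Pod) and that the Pods can reach a package index. The package name `emoji` and the helper name `check_pip_runtime_env` are illustrative choices, not something taken from this PR:

```python
# Hypothetical check script, not part of the PR.
# "emoji" is only an example package; any small pip package would do.
RUNTIME_ENV = {"pip": ["emoji"]}


def check_pip_runtime_env(address="auto"):
    """Run one task that imports the pip-installed package and return its name."""
    import ray  # imported lazily so this module loads even without Ray installed

    # address="auto" connects to an already running cluster.
    ray.init(address=address, runtime_env=RUNTIME_ENV)
    try:
        @ray.remote
        def probe():
            import emoji  # only importable if the runtime env was installed

            return emoji.__name__

        return ray.get(probe.remote())
    finally:
        ray.shutdown()
```

Calling `check_pip_runtime_env()` on a cluster running under the restricted standard should fail with a permission error if the directories Ray uses for runtime environments are not writable by the container user, which is the situation the comment above asks to rule out.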
I just tried Step 4: Install the KubeRay operator
# Path: helm-chart/kuberay-operator
helm install -n pod-security kuberay-operator .
on an OpenShift cluster. We might need to remove the seccomp settings, or we will get an error.
So I removed
everywhere. Sadly, you are using Docker Hub, so I will have to wait a day: "Failed to pull image "kuberay/operator:nightly": rpc error: code = Unknown desc = reading manifest nightly in docker.io/kuberay/operator: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit" With a mirror at mtr.devops.telekom.de/juliusvonkohout/kuberay:nightly I at least got the KubeRay operator nightly image. I definitely do not recommend using Docker Hub. There are quay.io and other alternatives that do not have those problems. But that is not relevant for this PR.
Thanks for the suggestion @juliusvonkohout! Would you mind opening an issue discussing that?
Thank you @juliusvonkohout for your review!
Alright, then this will have to be fixed on the OpenShift side. They will also switch to PSS soon. Our main target is a normal upstream Kubernetes such as kind. I have updated my post. You can ignore the seccomp part then. On OpenShift, the image is executed with a random non-root UID. There are filesystem permission errors:
You need to chmod 777 some directories, so the following has to be done:
Here is the approach that does not fix the filesystem properly and uses the runAsUser workaround instead. I strongly suggest properly fixing the filesystem instead of enforcing a particular user. These commands will be helpful for you:
This also applies to whatever you do in the busybox init container. If possible, just use /tmp and you won't have any trouble with permissions.
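To see which directories are affected when the container runs under an arbitrary non-root UID, a small probe along these lines could be run inside the Pod. This is a sketch only: the helper name `find_unwritable` and the candidate directory list are assumptions, and the real list depends on what the image writes to.

```python
import os
import tempfile


def find_unwritable(paths):
    """Return the paths in which the current user cannot create a file."""
    failed = []
    for path in paths:
        try:
            # Creating (and automatically deleting) a real file is the most
            # direct test of effective write permission.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
        except OSError:
            # Covers missing directories as well as permission errors.
            failed.append(path)
    return failed


# Hypothetical candidate list: the temp directory and the user's home directory.
CANDIDATES = [tempfile.gettempdir(), os.path.expanduser("~")]
print("uid:", os.getuid(), "unwritable:", find_unwritable(CANDIDATES))
```

Any directory reported here would need its permissions fixed in the image, or the workload would need to be pointed at a writable location such as /tmp, as suggested above.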
@juliusvonkohout Do you have time to discuss the details on Zoom? If so, we can coordinate via the Kubeflow Slack workspace. Thank you!
And this will fix the seccomp issue for everyone in the long term: https://kubernetes.io/blog/2021/08/25/seccomp-default/ Nothing we have to worry about now.
Co-authored-by: Dmitri Gekhtman <[email protected]> Signed-off-by: Kai-Hsun Chen <[email protected]>
Resolved the conflict and opened an issue for UID in ray-project/ray#30959.
Okidoke, looks good to merge! Thanks.
This PR is based on #750 and #769. It adds a unit test for the document for the Pod security standard. What does this unit test do:
* Create namespace pod-security with restricted Pod security policy.
* Install the operator in the default namespace (another option is to install the operator in namespace pod-security).
* Apply the restricted Pod security standard to all Pods in the namespace pod-security.
* Test whether a RayCluster can be created in the namespace pod-security.
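The first step of that test can be illustrated with a short sketch that builds a Namespace manifest carrying the Pod Security Admission `enforce` label. The helper name `restricted_namespace` is hypothetical, and the actual test in the referenced PR may set things up differently:

```python
import json


def restricted_namespace(name):
    """Build a Namespace manifest that enforces the restricted standard."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            "labels": {
                # Pod Security Admission label: reject Pods that violate
                # the "restricted" Pod Security Standard.
                "pod-security.kubernetes.io/enforce": "restricted",
            },
        },
    }


manifest = restricted_namespace("pod-security")
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be piped to `kubectl apply -f -`. Installing the operator and creating the RayCluster are left out of this sketch.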
Why are these changes needed?
Kubernetes defines three different Pod Security Standards, including privileged, baseline, and restricted, to broadly cover the security spectrum. The privileged standard allows users to do known privilege escalations, and thus it is not safe enough for security-critical applications.
This PR describes how to configure the RayCluster YAML file to apply the restricted Pod security standard. The following references can help you understand this document better:
Related issue number
ray-project/ray#29665
Checks
Screenshot for Step 5.2
Followup issues
* securityContext in KubeRay operator (X)
* restricted security standard.
* securityContext properly on Pods created by Autoscaler. (Done: exposed related interfaces)
* securityContext properly on Pods specified by RayCluster YAML file. (X)
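For the securityContext items above, the following sketch shows a container-level securityContext that satisfies the main requirements of the restricted standard, together with a small checker. The function name `restricted_violations` is hypothetical, and it covers only a subset of the restricted rules (it ignores Pod-level settings, volume types, and host namespaces):

```python
def restricted_violations(security_context):
    """Return human-readable violations for a container securityContext.

    Checks only a subset of the restricted Pod Security Standard.
    """
    sc = security_context or {}
    problems = []
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation must be false")
    if sc.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot must be true")
    if "ALL" not in sc.get("capabilities", {}).get("drop", []):
        problems.append("capabilities.drop must include ALL")
    if sc.get("seccompProfile", {}).get("type") not in ("RuntimeDefault", "Localhost"):
        problems.append("seccompProfile.type must be RuntimeDefault or Localhost")
    return problems


# Example securityContext expressed as a Python dict (same keys as in YAML).
EXAMPLE = {
    "allowPrivilegeEscalation": False,
    "runAsNonRoot": True,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}

print(restricted_violations(EXAMPLE))  # prints []
print(restricted_violations({}))       # lists all four violations
```

A check like this could be run against the containers in a RayCluster manifest before applying it to a namespace that enforces the restricted standard.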