[Feature] Ray restricted podsecuritystandards for enterprise security and Kubeflow integration #750

Merged · 21 commits · Dec 8, 2022
121 changes: 121 additions & 0 deletions docs/guidance/pod-security.md
@@ -0,0 +1,121 @@
# Pod Security

Kubernetes defines three Pod Security Standards, `privileged`, `baseline`, and `restricted`, to broadly
cover the security spectrum. The `privileged` standard allows known privilege escalations, and thus it is not
safe enough for security-critical applications.

This document describes how to configure a RayCluster YAML file to apply the `restricted` Pod security standard. The following
references can help you understand this document better:

* [Kubernetes - Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/#restricted)
* [Kubernetes - Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/)
* [Kubernetes - Auditing](https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/)
* [KinD - Auditing](https://kind.sigs.k8s.io/docs/user/auditing/)

# Step 1: Create a KinD cluster
```bash
# Path: ray-operator/config/security
kind create cluster --config kind-config.yaml --image=kindest/node:v1.24.0
```
The `kind-config.yaml` enables audit logging with the audit policy defined in `audit-policy.yaml`. The `audit-policy.yaml`
file defines an auditing policy that listens to Pod events in the namespace `pod-security`. With this policy, we can check
whether our Pods violate the policies in the `restricted` standard.

The feature [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) was first
introduced in Kubernetes v1.22 (alpha) and became stable in Kubernetes v1.25. In addition, KubeRay currently supports
Kubernetes from v1.19 to v1.24. (At the time of writing, we have not tested KubeRay with Kubernetes v1.25.) Hence, we use **Kubernetes v1.24** in this step.
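
As a sanity check, you can confirm that the KinD cluster is up and runs the expected Kubernetes version (a minimal sketch; `kind-kind` is KinD's default context name, and newer kubectl releases have removed the deprecated `--short` flag):
```bash
kubectl cluster-info --context kind-kind
kubectl version --short  # The "Server Version" should report v1.24.x.
```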

# Step 2: Check the audit logs
```bash
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log
```
The log should be empty because the namespace `pod-security` does not exist.
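
To double-check, you can count the log lines directly; this is a quick sanity check, and an empty (or not yet existing) file simply means no matching events have been recorded:
```bash
docker exec kind-control-plane wc -l /var/log/kubernetes/kube-apiserver-audit.log
# 0 /var/log/kubernetes/kube-apiserver-audit.log
```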

# Step 3: Create the `pod-security` namespace
```bash
kubectl create ns pod-security
kubectl label --overwrite ns pod-security \
pod-security.kubernetes.io/warn=restricted \
pod-security.kubernetes.io/warn-version=latest \
pod-security.kubernetes.io/audit=restricted \
pod-security.kubernetes.io/audit-version=latest \
pod-security.kubernetes.io/enforce=restricted \
pod-security.kubernetes.io/enforce-version=latest
```
With the `pod-security.kubernetes.io` labels, the built-in Kubernetes Pod security admission controller applies the
`restricted` Pod security standard to all Pods in the namespace `pod-security`. The label
`pod-security.kubernetes.io/enforce=restricted` means that a Pod will be rejected if it violates the policies defined in the
`restricted` security standard. See [Pod Security Admission](https://kubernetes.io/docs/concepts/security/pod-security-admission/) for more details about the labels.
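
If you apply the labels to a namespace that already contains Pods, a server-side dry run reports which existing Pods would violate the standard without persisting the label change (this mirrors the Kubernetes documentation):
```bash
# Preview violations without actually applying the enforce label.
kubectl label --dry-run=server --overwrite ns pod-security \
  pod-security.kubernetes.io/enforce=restricted
```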

# Step 4: Install the KubeRay operator
Update the field `securityContext` in `helm-chart/kuberay-operator/values.yaml`:
```yaml
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
```
Then install the KubeRay operator chart:
```bash
# Path: helm-chart/kuberay-operator
helm install -n pod-security kuberay-operator .
```
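
To verify the installation, check that the operator Pod itself passes the `restricted` standard and reaches the `Running` state (the Pod name suffix will differ in your cluster):
```bash
kubectl get pods -n pod-security
# NAME                               READY   STATUS    RESTARTS   AGE
# kuberay-operator-8b6d55dbb-t8msf   1/1     Running   0          30s
```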

# Step 5: Create a RayCluster (Choose either Step 5.1 or Step 5.2)
* If you choose Step 5.1, no Ray Pod will be created in the namespace `pod-security`.
* If you choose Step 5.2, the Ray Pods are created successfully.

## Step 5.1: Create a RayCluster without proper `securityContext` configurations
```bash
# Path: ray-operator/config/samples
kubectl apply -n pod-security -f ray-cluster.complete.yaml

# Wait 20 seconds and check audit logs for the error messages.
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log

# Example error messages
# "pods \"raycluster-complete-head-fkbf5\" is forbidden: violates PodSecurity \"restricted:latest\": allowPrivilegeEscalation != false (container \"ray-head\" must set securityContext.allowPrivilegeEscalation=false) ...

kubectl get pod -n pod-security
# NAME READY STATUS RESTARTS AGE
# kuberay-operator-8b6d55dbb-t8msf 1/1 Running 0 62s

# Clean up the RayCluster
kubectl delete rayclusters.ray.io -n pod-security raycluster-complete
# raycluster.ray.io "raycluster-complete" deleted
```
No Ray Pod is created in the namespace `pod-security`; instead, the audit logs record error messages describing the policy violations.
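
Besides the audit log, the namespace events and the operator logs also record the rejections. The following is a minimal sketch; the log selector assumes the Helm chart's default `app.kubernetes.io/name` label:
```bash
# Recent namespace events include the PodSecurity rejection reasons.
kubectl get events -n pod-security --sort-by=.lastTimestamp

# The KubeRay operator also logs the failed Pod creations.
kubectl logs -n pod-security -l app.kubernetes.io/name=kuberay-operator --tail=20
```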

## Step 5.2: Create a RayCluster with proper `securityContext` configurations
```bash
# Path: ray-operator/config/security
kubectl apply -n pod-security -f ray-cluster.pod-security.yaml

# Wait for the RayCluster to converge, then check the audit logs for the messages.
docker exec kind-control-plane cat /var/log/kubernetes/kube-apiserver-audit.log

# Forward the dashboard port
kubectl port-forward --address 0.0.0.0 svc/raycluster-pod-security-head-svc -n pod-security 8265:8265

# Log in to the head Pod
kubectl exec -it -n pod-security ${YOUR_HEAD_POD} -- bash

# (Head Pod) Run a sample job in the Pod
python3 samples/xgboost_example.py

# Check the job status in the dashboard on your browser.
# http://127.0.0.1:8265/#/job => The job status should be "SUCCEEDED".

# (Head Pod) Make sure Python dependencies can be installed under `restricted` security standard
pip3 install jsonpatch
echo $? # Check the exit code of `pip3 install jsonpatch`. It should be 0.

# Clean up the RayCluster
kubectl delete -n pod-security -f ray-cluster.pod-security.yaml
# raycluster.ray.io "raycluster-pod-security" deleted
# configmap "xgboost-example" deleted
```
One head Pod and one worker Pod will be created as specified in `ray-cluster.pod-security.yaml`.
First, we log in to the head Pod, run an XGBoost example script, and check the job
status in the dashboard. Next, we use `pip` to install a Python dependency (`jsonpatch`); the exit code of the `pip` command should be 0.
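
To fill in `${YOUR_HEAD_POD}` above, you can look the head Pod up by the `ray.io/node-type` label that KubeRay attaches to Ray Pods (a sketch assuming the default labels):
```bash
YOUR_HEAD_POD=$(kubectl get pods -n pod-security \
  -l ray.io/node-type=head -o jsonpath='{.items[0].metadata.name}')
echo "$YOUR_HEAD_POD"
```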
4 changes: 4 additions & 0 deletions helm-chart/kuberay-operator/values.yaml
@@ -58,3 +58,7 @@ rbacEnable: true

batchScheduler:
enabled: false

# Set up `securityContext` to improve Pod security.
# See https://github.com/ray-project/kuberay/blob/master/docs/guidance/pod-security.md for further guidance.
securityContext: {}
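
For example, the `restricted`-compatible settings from Step 4 of the guide can be supplied at install time instead of editing the chart in place. This is a sketch; the values file name `security-values.yaml` is hypothetical:
```bash
cat <<'EOF' > security-values.yaml
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
  runAsNonRoot: true
  seccompProfile:
    type: RuntimeDefault
EOF

helm install -n pod-security kuberay-operator . -f security-values.yaml
```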
3 changes: 1 addition & 2 deletions ray-operator/config/samples/ray-cluster.complete.yaml
Expand Up @@ -11,8 +11,7 @@ metadata:
name: raycluster-complete
spec:
rayVersion: '2.1.0'
-  ######################headGroupSpec#################################
-  # Ray head pod template and specs
+  # Ray head pod configuration
headGroupSpec:
# Kubernetes Service Type, valid values are 'ClusterIP', 'NodePort' and 'LoadBalancer'
serviceType: ClusterIP
15 changes: 15 additions & 0 deletions ray-operator/config/security/audit-policy.yaml
@@ -0,0 +1,15 @@
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
- "RequestReceived"
rules:
  # Log Pod changes at Metadata level
- level: Metadata
resources:
- group: ""
# Resource "pods" doesn't match requests to any subresource of pods,
# which is consistent with the RBAC policy.
resources: ["pods"]
# This rule only applies to resources in the "pod-security" namespace.
namespaces: ["pod-security"]
29 changes: 29 additions & 0 deletions ray-operator/config/security/kind-config.yaml
@@ -0,0 +1,29 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
kubeadmConfigPatches:
- |
kind: ClusterConfiguration
apiServer:
# enable auditing flags on the API server
extraArgs:
audit-log-path: /var/log/kubernetes/kube-apiserver-audit.log
audit-policy-file: /etc/kubernetes/policies/audit-policy.yaml
# mount new files / directories on the control plane
extraVolumes:
- name: audit-policies
hostPath: /etc/kubernetes/policies
mountPath: /etc/kubernetes/policies
readOnly: true
pathType: "DirectoryOrCreate"
- name: "audit-logs"
hostPath: "/var/log/kubernetes"
mountPath: "/var/log/kubernetes"
readOnly: false
pathType: DirectoryOrCreate
# mount the local file on the control plane
extraMounts:
- hostPath: ./audit-policy.yaml
containerPath: /etc/kubernetes/policies/audit-policy.yaml
readOnly: true
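
Once the cluster is up, you can confirm that the extra mounts worked and the policy file is visible inside the control-plane node (a quick sanity check):
```bash
docker exec kind-control-plane ls /etc/kubernetes/policies
# audit-policy.yaml
```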
175 changes: 175 additions & 0 deletions ray-operator/config/security/ray-cluster.pod-security.yaml
@@ -0,0 +1,175 @@
# The resource requests and limits in this config are too small for production!
# For examples with more realistic resource configuration, see
# ray-cluster.complete.large.yaml and
# ray-cluster.autoscaler.large.yaml.
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
labels:
controller-tools.k8s.io: "1.0"
# A unique identifier for the head node and workers of this cluster.
name: raycluster-pod-security
spec:
rayVersion: '2.1.0'
# Ray head pod configuration
headGroupSpec:
# Kubernetes Service Type, valid values are 'ClusterIP', 'NodePort' and 'LoadBalancer'
serviceType: ClusterIP
# for the head group, replicas should always be 1.
# headGroupSpec.replicas is deprecated in KubeRay >= 0.3.0.
replicas: 1
    # The following params are used to complete the ray start command: ray start --head --block --dashboard-host='0.0.0.0' ...
rayStartParams:
dashboard-host: '0.0.0.0'
block: 'true'
    # Pod template
template:
spec:
containers:
- name: ray-head
image: rayproject/ray-ml:2.1.0
ports:
- containerPort: 6379
name: gcs
- containerPort: 8265
name: dashboard
- containerPort: 10001
name: client
lifecycle:
preStop:
exec:
command: ["/bin/sh","-c","ray stop"]
volumeMounts:
- mountPath: /tmp/ray
name: ray-logs
- mountPath: /home/ray/samples
name: ray-example-configmap
resources:
limits:
cpu: 1
memory: 2Gi
requests:
cpu: 1
memory: 2Gi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
volumes:
- name: ray-logs
emptyDir: {}
- name: ray-example-configmap
configMap:
name: ray-example
# An array of keys from the ConfigMap to create as files
items:
- key: xgboost_example.py
path: xgboost_example.py
workerGroupSpecs:
  # the number of Pod replicas in this worker group
- replicas: 1
minReplicas: 1
maxReplicas: 10
    # logical group name; here it is called large-group; it can also be a functional name
groupName: large-group
# if worker pods need to be added, we can simply increment the replicas
# if worker pods need to be removed, we decrement the replicas, and populate the podsToDelete list
# the operator will remove pods from the list until the number of replicas is satisfied
# when a pod is confirmed to be deleted, its name will be removed from the list below
#scaleStrategy:
# workersToDelete:
# - raycluster-complete-worker-large-group-bdtwh
# - raycluster-complete-worker-large-group-hv457
# - raycluster-complete-worker-large-group-k8tj7
# the following params are used to complete the ray start: ray start --block
rayStartParams:
block: 'true'
    # Pod template
template:
spec:
containers:
- name: ray-worker
image: rayproject/ray-ml:2.1.0
          # Environment variables to set in the container. Optional.
# Refer to https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/
lifecycle:
preStop:
exec:
command: ["/bin/sh","-c","ray stop"]
          # Use volumeMounts. Optional.
# Refer to https://kubernetes.io/docs/concepts/storage/volumes/
volumeMounts:
- mountPath: /tmp/ray
name: ray-logs
resources:
limits:
cpu: 4
memory: 2Gi
requests:
cpu: 1
memory: 2Gi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
initContainers:
# the env var $RAY_IP is set by the operator if missing, with the value of the head service name
- name: init-myservice
image: busybox:1.28
# Change the cluster postfix if you don't have a default setting
command: ['sh', '-c', "until nslookup $RAY_IP.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
securityContext:
runAsUser: 1000
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
# use volumes
# Refer to https://kubernetes.io/docs/concepts/storage/volumes/
volumes:
- name: ray-logs
emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
name: ray-example
data:
xgboost_example.py: |
import ray
from ray.train.xgboost import XGBoostTrainer
from ray.air.config import ScalingConfig

# Load data.
dataset = ray.data.read_csv("s3://anonymous@air-example-data/breast_cancer.csv")

# Split data into train and validation.
train_dataset, valid_dataset = dataset.train_test_split(test_size=0.3)

trainer = XGBoostTrainer(
scaling_config=ScalingConfig(
# Number of workers to use for data parallelism.
num_workers=1,
# Whether to use GPU acceleration.
use_gpu=False,
),
label_column="target",
num_boost_round=20,
params={
# XGBoost specific params
"objective": "binary:logistic",
# "tree_method": "gpu_hist", # uncomment this to use GPU for training
"eval_metric": ["logloss", "error"],
},
datasets={"train": train_dataset, "valid": valid_dataset},
)
result = trainer.fit()
print(result.metrics)
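
As an alternative to running the script with `kubectl exec`, the same example can be submitted through Ray's Job Submission CLI once the dashboard port is forwarded (a sketch; `ray job submit` is available in Ray 2.1, and the script path is the ConfigMap mount path inside the head Pod):
```bash
# Submit the example through the forwarded dashboard at http://127.0.0.1:8265.
ray job submit --address http://127.0.0.1:8265 \
  -- python /home/ray/samples/xgboost_example.py
```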