[Feature] Better validation for the RayCluster CRD, via validating webhooks #736
Closed
DmitriGekhtman added the enhancement (New feature or request), P1 (Issue that should be fixed within a few weeks), and observability labels on Nov 18, 2022
some insight from @Jeffwan

@kevin85421 has anyone picked up this issue? Open to contributions here?
davidxia added a commit to davidxia/kuberay that referenced this issue on Nov 23, 2023

and validate worker group names are unique.

Add new Makefile targets `install-with-webhooks`, `uninstall-with-webhooks`, `deploy-with-webhooks`, and `undeploy-with-webhooks` to be backwards compatible and opt-in. Much of the code, especially the YAML files, was generated by running the command below as [documented in kubebuilder][1].

```
kubebuilder create webhook \
  --group ray \
  --version v1 \
  --kind RayCluster \
  --defaulting \
  --programmatic-validation
```

## How to use locally

```shell
make manifests generate
make install-with-webhooks
IMG=kuberay/operator:test make docker-build
kind load docker-image kuberay/operator:test
IMG=kuberay/operator:test make deploy-with-webhooks
```

## Example RayCluster that has duplicate worker group names

```shell
cat dupe-worker-group-name.yaml
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: dupe-worker-group-name
spec:
  headGroupSpec:
    rayStartParams:
      dashboard-host: '0.0.0.0'
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.7.0
  workerGroupSpecs:
  - replicas: 1
    minReplicas: 1
    maxReplicas: 10
    groupName: group1
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.7.0
  - replicas: 1
    minReplicas: 1
    maxReplicas: 10
    groupName: group1
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.7.0
```

## Before

```
kubectl apply -f dupe-worker-group-name.yaml
raycluster.ray.io/raycluster-dupe-worker-name created
```

## After

```
kubectl --context kind-kind apply -f config/samples/ray-cluster-dupe-worker-name.yaml
The RayCluster "raycluster-dupe-worker-name" is invalid: spec.workerGroupSpecs[1]: Invalid value: v1.WorkerGroupSpec{GroupName:"group1", Replicas:(*int32)(0x40006e63cc), MinReplicas:(*int32)(0x40006e63c8), MaxReplicas:(*int32)(0x40006e63c0), RayStartParams:map[string]string{}, Template:v1.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"", Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, Spec:v1.PodSpec{Volumes:[]v1.Volume(nil), InitContainers:[]v1.Container(nil), Containers:[]v1.Container{v1.Container{Name:"ray-worker", Image:"rayproject/ray:2.7.0", Command:[]string(nil), Args:[]string(nil), WorkingDir:"", Ports:[]v1.ContainerPort(nil), EnvFrom:[]v1.EnvFromSource(nil), Env:[]v1.EnvVar(nil), Resources:v1.ResourceRequirements{Limits:v1.ResourceList(nil), Requests:v1.ResourceList(nil)}, VolumeMounts:[]v1.VolumeMount(nil), VolumeDevices:[]v1.VolumeDevice(nil), LivenessProbe:(*v1.Probe)(nil), ReadinessProbe:(*v1.Probe)(nil), StartupProbe:(*v1.Probe)(nil), Lifecycle:(*v1.Lifecycle)(nil), TerminationMessagePath:"", TerminationMessagePolicy:"", ImagePullPolicy:"", SecurityContext:(*v1.SecurityContext)(nil), Stdin:false, StdinOnce:false, TTY:false}}, EphemeralContainers:[]v1.EphemeralContainer(nil), RestartPolicy:"", TerminationGracePeriodSeconds:(*int64)(nil), ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"", NodeSelector:map[string]string(nil), ServiceAccountName:"", DeprecatedServiceAccount:"", AutomountServiceAccountToken:(*bool)(nil), NodeName:"", HostNetwork:false, HostPID:false, HostIPC:false, ShareProcessNamespace:(*bool)(nil), SecurityContext:(*v1.PodSecurityContext)(nil), ImagePullSecrets:[]v1.LocalObjectReference(nil), Hostname:"", Subdomain:"", Affinity:(*v1.Affinity)(nil), SchedulerName:"", Tolerations:[]v1.Toleration(nil), HostAliases:[]v1.HostAlias(nil), PriorityClassName:"", Priority:(*int32)(nil), DNSConfig:(*v1.PodDNSConfig)(nil), ReadinessGates:[]v1.PodReadinessGate(nil), RuntimeClassName:(*string)(nil), EnableServiceLinks:(*bool)(nil), PreemptionPolicy:(*v1.PreemptionPolicy)(nil), Overhead:v1.ResourceList(nil), TopologySpreadConstraints:[]v1.TopologySpreadConstraint(nil), SetHostnameAsFQDN:(*bool)(nil), OS:(*v1.PodOS)(nil)}}, ScaleStrategy:v1.ScaleStrategy{WorkersToDelete:[]string(nil)}}: worker group names must be unique
```

## Backwards compatibility

Just use the original Makefile targets `install`, `uninstall`, `deploy`, and `undeploy`.

```shell
IMG=kuberay/operator:test make docker-build
kind load docker-image kuberay/operator:test
IMG=kuberay/operator:test make deploy
kubectl apply -f dupe-worker-group-name.yaml
raycluster.ray.io/raycluster-dupe-worker-name created
```

closes ray-project#718
closes ray-project#736

[1]: https://book.kubebuilder.io/cronjob-tutorial/webhook-implementation
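To make the check above concrete, here is a minimal, hypothetical sketch of the kind of uniqueness validation a RayCluster validating webhook can run. It produces an "Invalid value ... worker group names must be unique" error of the same shape as the output shown, but the type and function names are illustrative stand-ins, not the exact code from the commit referenced above.

```go
package main

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// workerGroup is a simplified stand-in for rayv1.WorkerGroupSpec; only the
// field needed for the uniqueness check is included.
type workerGroup struct {
	GroupName string
}

// validateWorkerGroupNames builds the kind of field-level error an admission
// webhook would return when two worker groups share a name.
func validateWorkerGroupNames(clusterName string, groups []workerGroup) error {
	seen := map[string]bool{}
	var errs field.ErrorList
	for i, g := range groups {
		path := field.NewPath("spec").Child("workerGroupSpecs").Index(i)
		if seen[g.GroupName] {
			errs = append(errs, field.Invalid(path, g, "worker group names must be unique"))
		}
		seen[g.GroupName] = true
	}
	if len(errs) == 0 {
		return nil
	}
	return apierrors.NewInvalid(
		schema.GroupKind{Group: "ray.io", Kind: "RayCluster"},
		clusterName,
		errs,
	)
}

func main() {
	err := validateWorkerGroupNames("raycluster-dupe-worker-name",
		[]workerGroup{{GroupName: "group1"}, {GroupName: "group1"}})
	// The aggregated message ends with:
	//   spec.workerGroupSpecs[1]: Invalid value: ...: worker group names must be unique
	fmt.Println(err)
}
```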
Search before asking
Description
There are many aspects of KubeRay config validation that cannot be handled by CRD schema validation.
We have some limited validation functionality within the operator -- if the operator detects certain misconfigurations, it will refuse to reconcile and will instead write some sort of error state to the status field.
Code reference.
However, it would be a better user experience if the CR simply failed validation upon submission.
The standard way to do this would be to take advantage of KubeBuilder's support for validating webhooks.
This would make it easy to resolve UX issues like the one described here:
#718
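As a hedged illustration of that gap (not code from the KubeRay repo): per-field constraints can be declared as kubebuilder CRD markers and enforced by schema validation alone, while cross-field or cross-item rules, such as worker group name uniqueness, need webhook logic of the kind proposed here. The struct below is a simplified, hypothetical stand-in for the real WorkerGroupSpec.

```go
// Package v1 sketches, with trimmed-down illustrative types, which checks CRD
// schema validation can express and which it cannot.
package v1

// WorkerGroupSketch is a simplified stand-in for rayv1.WorkerGroupSpec.
type WorkerGroupSketch struct {
	// Per-field rules like these compile into the CRD's OpenAPI schema and are
	// enforced by the API server without any webhook.
	// +kubebuilder:validation:MinLength=1
	GroupName string `json:"groupName"`

	// +kubebuilder:validation:Minimum=0
	MinReplicas *int32 `json:"minReplicas,omitempty"`

	// +kubebuilder:validation:Minimum=0
	MaxReplicas *int32 `json:"maxReplicas,omitempty"`

	Replicas *int32 `json:"replicas,omitempty"`
}

// Rules that span fields or list items -- e.g. minReplicas <= replicas <=
// maxReplicas, or "groupName is unique across spec.workerGroupSpecs" -- have
// no per-field marker, which is why this issue proposes a validating webhook.
```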
Use case
Less painful configuration experience.
Related issues
#718
Are you willing to submit a PR?