
[Bug] KubeRay does not show clear error for duplicated groupName field #718

Closed
1 of 2 tasks
spolcyn opened this issue Nov 14, 2022 · 1 comment · Fixed by #1584
Labels
bug (Something isn't working) · observability · P1 (Issue that should be fixed within a few weeks)

Comments


spolcyn commented Nov 14, 2022

Search before asking

  • I searched the issues and found no similar issues.

KubeRay Component

ray-operator

What happened + What you expected to happen

  1. Saw: When a cluster YML specifies 2 worker pod types with the same groupName, the cluster goes into a loop of creating and immediately terminating all pods with that groupName
  2. Expected: The autoscaler or kuberay-operator fails with a clear error that "Multiple worker pod types have the same groupName"

Running KubeRay 0.3.0 and nightly, Ray v2.0.1
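
For illustration only, the kind of uniqueness check implied by the expected error could look roughly like this minimal, hypothetical Go sketch (not KubeRay's actual code; the function name and error text are made up):

```go
package main

import "fmt"

// checkUniqueGroupNames returns a clear error when two worker pod types share
// the same groupName, instead of letting the operator loop on creating and
// terminating their pods.
func checkUniqueGroupNames(groupNames []string) error {
	seen := make(map[string]bool)
	for _, name := range groupNames {
		if seen[name] {
			return fmt.Errorf("multiple worker pod types have the same groupName %q", name)
		}
		seen[name] = true
	}
	return nil
}

func main() {
	// Two worker groups both named "group1", as in the reproduction below.
	if err := checkUniqueGroupNames([]string{"group1", "group1"}); err != nil {
		fmt.Println(err) // multiple worker pod types have the same groupName "group1"
	}
}
```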

Reproduction script

Use any cluster YML with two worker pod types, both with the same groupName

Anything else

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
spolcyn added the bug label Nov 14, 2022
DmitriGekhtman added the P1 label Nov 15, 2022
@DmitriGekhtman
Collaborator

Thanks for posting this. There's a general theme here that we need to raise better errors and improve observability of the system.

davidxia mentioned this issue Oct 30, 2023
davidxia added a commit to davidxia/kuberay that referenced this issue Nov 23, 2023
and validate worker group names are unique.

Add new Makefile targets `install-with-webhooks`, `uninstall-with-webhooks`,
`deploy-with-webhooks`, and `undeploy-with-webhooks` to be backwards compatible
and opt-in.

Much of the code, especially the YAML files, was generated by running the
command below as [documented in kubebuilder][1].

```
kubebuilder create webhook \
  --group ray \
  --version v1 \
  --kind RayCluster \
  --defaulting \
  --programmatic-validation
```
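
For illustration, a uniqueness rule of this kind can be expressed with apimachinery field errors. The hypothetical sketch below (not the code from this commit, and assuming `k8s.io/apimachinery` is on the module path) returns a `spec.workerGroupSpecs[1]: Invalid value ... worker group names must be unique` error analogous to the output shown under "After":

```go
package main

import (
	"fmt"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/util/validation/field"
)

// validateWorkerGroupNames rejects a RayCluster spec whose workerGroupSpecs
// contain a repeated groupName, attributing the error to the offending index.
func validateWorkerGroupNames(clusterName string, groupNames []string) error {
	var errs field.ErrorList
	seen := make(map[string]bool)
	specPath := field.NewPath("spec").Child("workerGroupSpecs")
	for i, name := range groupNames {
		if seen[name] {
			errs = append(errs, field.Invalid(specPath.Index(i), name,
				"worker group names must be unique"))
		}
		seen[name] = true
	}
	if len(errs) == 0 {
		return nil
	}
	return apierrors.NewInvalid(
		schema.GroupKind{Group: "ray.io", Kind: "RayCluster"},
		clusterName, errs)
}

func main() {
	err := validateWorkerGroupNames("dupe-worker-group-name", []string{"group1", "group1"})
	fmt.Println(err)
}
```

A validating webhook would typically run a check like this on create and update, before the object is persisted.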

## How to use locally

```shell
make manifests generate
make install-with-webhooks
IMG=kuberay/operator:test make docker-build
kind load docker-image kuberay/operator:test
IMG=kuberay/operator:test make deploy-with-webhooks
```

## Example RayCluster that has duplicate worker group names

```shell
cat dupe-worker-group-name.yaml

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: dupe-worker-group-name
spec:
  headGroupSpec:
    rayStartParams:
      dashboard-host: '0.0.0.0'
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.7.0
  workerGroupSpecs:
  - replicas: 1
    minReplicas: 1
    maxReplicas: 10
    groupName: group1
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.7.0
  - replicas: 1
    minReplicas: 1
    maxReplicas: 10
    groupName: group1
    rayStartParams: {}
    template:
      spec:
        containers:
        - name: ray-worker
          image: rayproject/ray:2.7.0
```

## Before

```
kubectl apply -f dupe-worker-group-name.yaml
raycluster.ray.io/raycluster-dupe-worker-name created
```

## After

```
kubectl --context kind-kind apply -f config/samples/ray-cluster-dupe-worker-name.yaml
The RayCluster "raycluster-dupe-worker-name" is invalid:
spec.workerGroupSpecs[1]: Invalid value: v1.WorkerGroupSpec{GroupName:"group1",
Replicas:(*int32)(0x40006e63cc), MinReplicas:(*int32)(0x40006e63c8),
MaxReplicas:(*int32)(0x40006e63c0), RayStartParams:map[string]string{},
Template:v1.PodTemplateSpec{ObjectMeta:v1.ObjectMeta{Name:"", GenerateName:"",
Namespace:"", SelfLink:"", UID:"", ResourceVersion:"", Generation:0,
CreationTimestamp:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC),
DeletionTimestamp:<nil>, DeletionGracePeriodSeconds:(*int64)(nil),
Labels:map[string]string(nil), Annotations:map[string]string(nil),
OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil),
ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)},
Spec:v1.PodSpec{Volumes:[]v1.Volume(nil), InitContainers:[]v1.Container(nil),
Containers:[]v1.Container{v1.Container{Name:"ray-worker",
Image:"rayproject/ray:2.7.0", Command:[]string(nil), Args:[]string(nil),
WorkingDir:"", Ports:[]v1.ContainerPort(nil), EnvFrom:[]v1.EnvFromSource(nil),
Env:[]v1.EnvVar(nil),
Resources:v1.ResourceRequirements{Limits:v1.ResourceList(nil),
Requests:v1.ResourceList(nil)}, VolumeMounts:[]v1.VolumeMount(nil),
VolumeDevices:[]v1.VolumeDevice(nil), LivenessProbe:(*v1.Probe)(nil),
ReadinessProbe:(*v1.Probe)(nil), StartupProbe:(*v1.Probe)(nil),
Lifecycle:(*v1.Lifecycle)(nil), TerminationMessagePath:"",
TerminationMessagePolicy:"", ImagePullPolicy:"",
SecurityContext:(*v1.SecurityContext)(nil), Stdin:false, StdinOnce:false,
TTY:false}}, EphemeralContainers:[]v1.EphemeralContainer(nil),
RestartPolicy:"", TerminationGracePeriodSeconds:(*int64)(nil),
ActiveDeadlineSeconds:(*int64)(nil), DNSPolicy:"",
NodeSelector:map[string]string(nil), ServiceAccountName:"",
DeprecatedServiceAccount:"", AutomountServiceAccountToken:(*bool)(nil),
NodeName:"", HostNetwork:false, HostPID:false, HostIPC:false,
ShareProcessNamespace:(*bool)(nil),
SecurityContext:(*v1.PodSecurityContext)(nil),
ImagePullSecrets:[]v1.LocalObjectReference(nil), Hostname:"", Subdomain:"",
Affinity:(*v1.Affinity)(nil), SchedulerName:"",
Tolerations:[]v1.Toleration(nil), HostAliases:[]v1.HostAlias(nil),
PriorityClassName:"", Priority:(*int32)(nil),
DNSConfig:(*v1.PodDNSConfig)(nil), ReadinessGates:[]v1.PodReadinessGate(nil),
RuntimeClassName:(*string)(nil), EnableServiceLinks:(*bool)(nil),
PreemptionPolicy:(*v1.PreemptionPolicy)(nil), Overhead:v1.ResourceList(nil),
TopologySpreadConstraints:[]v1.TopologySpreadConstraint(nil),
SetHostnameAsFQDN:(*bool)(nil), OS:(*v1.PodOS)(nil)}},
ScaleStrategy:v1.ScaleStrategy{WorkersToDelete:[]string(nil)}}: worker group
names must be unique
```

## Backwards compatibility

Just use the original Makefile targets `install`, `uninstall`, `deploy`, and
`undeploy`.

```shell
IMG=kuberay/operator:test make docker-build
kind load docker-image kuberay/operator:test
IMG=kuberay/operator:test make deploy

kubectl apply -f dupe-worker-group-name.yaml
raycluster.ray.io/raycluster-dupe-worker-name created
```
closes ray-project#718
closes ray-project#736

[1]: https://book.kubebuilder.io/cronjob-tutorial/webhook-implementation