[Sample YAML] Bump ray version in pod security YAML to 2.4.0 #1160
Conversation
Signed-off-by: Archit Kulkarni <[email protected]>
We are using Ray 2.2.0 here due to a dependency issue related to protobuf; see issue #873 for details. The Ray community resolved it by manually updating the Docker images for Ray 2.2.0, but the Ray 2.3.0 Docker images were never patched in the same way. It looks like the issue was properly fixed in Ray 2.4.0.
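The version pin lives in the container image fields of the sample YAML. A minimal sketch of the bump, using illustrative field names typical of a RayCluster manifest (the actual pod-security sample has many more fields):

```yaml
# Sketch of the version bump in the pod-security sample YAML
# (illustrative; exact structure of the real manifest may differ).
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: raycluster-pod-security
spec:
  headGroupSpec:
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.4.0   # was rayproject/ray:2.2.0
```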
…ject#1160)

The existing sample YAML was pinned to Ray 2.2.0. Running the test locally failed with:

    2023-06-12:14:21:56,768 INFO [utils.py:163] Execute command: kubectl logs -n=pod-security -l ray.io/node-type=head --tail=-1
    Error from server (BadRequest): container "ray-head" in pod "raycluster-pod-security-head-hg67c" is waiting to start: ContainerCreating
    ERROR
    ======================================================================
    ERROR: test_ray_cluster_with_security_context (__main__.PodSecurityTestCase)
    Create a RayCluster with securityContext config under restricted mode.
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "tests/test_security.py", line 98, in test_ray_cluster_with_security_context
        ray_cluster_add_event.trigger()
      File "/Users/archit/kuberay/tests/framework/prototype.py", line 165, in trigger
        self.wait()
      File "/Users/archit/kuberay/tests/framework/prototype.py", line 277, in wait
        show_cluster_info(self.namespace)
      File "/Users/archit/kuberay/tests/framework/prototype.py", line 90, in show_cluster_info
        shell_subprocess_run(f'kubectl logs -n={cr_namespace} -l ray.io/node-type=head --tail=-1')
      File "/Users/archit/kuberay/tests/framework/utils.py", line 164, in shell_subprocess_run
        return subprocess.run(command, shell = True, check = check).returncode
      File "/Users/archit/anaconda3/envs/ray-py38/lib/python3.8/subprocess.py", line 516, in run
        raise CalledProcessError(retcode, process.args,
    subprocess.CalledProcessError: Command 'kubectl logs -n=pod-security -l ray.io/node-type=head --tail=-1' returned non-zero exit status 1.
    ----------------------------------------------------------------------
    Ran 2 tests in 1189.059s
    FAILED (errors=1)

It's possible this is just a race condition in the test, but it should be updated to Ray 2.4.0 regardless. I tested it locally with Ray 2.4.0 and it passes:

    ----------------------------------------------------------------------
    Ran 2 tests in 900.921s
    OK

This PR will also be cherry-picked to the 0.5.2 release branch.

Signed-off-by: Archit Kulkarni <[email protected]>
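The failure surfaces through the test framework's subprocess wrapper, which runs `check=True` so that a non-zero kubectl exit status becomes a `subprocess.CalledProcessError`. A simplified sketch of that behavior (hypothetical; not the actual `utils.py` implementation):

```python
import subprocess

def shell_subprocess_run(command, check=True):
    """Run a shell command; with check=True, a non-zero exit status
    raises subprocess.CalledProcessError, as seen in the traceback."""
    return subprocess.run(command, shell=True, check=check).returncode

# A failing command raises; a succeeding one returns its exit code.
try:
    shell_subprocess_run("exit 1")
except subprocess.CalledProcessError as e:
    print(f"returned non-zero exit status {e.returncode}.")
print(shell_subprocess_run("true"))  # → 0
```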
…1161) (cherry-pick of the above commit to the release branch; commit message identical to the above.)

Signed-off-by: Archit Kulkarni <[email protected]>
Why are these changes needed?
The existing sample YAML was pinned to Ray 2.2.0. Running the test locally failed with the ContainerCreating error shown in the commit message above.
It's possible this is just a race condition in the test, but it should be updated to Ray 2.4.0 regardless. I tested it locally with Ray 2.4.0 and it passes.
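If it is a race condition, i.e. logs are fetched while the head pod is still in ContainerCreating, one mitigation is to poll for the pod phase with a timeout before collecting logs. A hypothetical sketch of such a helper (not part of the actual test framework):

```python
import subprocess
import time

def wait_for_head_pod_running(namespace, timeout_s=300, interval_s=5):
    """Poll kubectl until the head pod reports phase Running, or time out.
    Hypothetical helper; the real framework's wait logic differs."""
    deadline = time.time() + timeout_s
    cmd = (f"kubectl get pods -n={namespace} -l ray.io/node-type=head "
           "-o jsonpath='{.items[0].status.phase}'")
    while time.time() < deadline:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if result.returncode == 0 and result.stdout.strip().strip("'") == "Running":
            return True
        time.sleep(interval_s)
    return False
```

Fetching logs only after this returns True would avoid the BadRequest error from a still-creating container.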
This PR will also be cherry-picked to the 0.5.2 release branch.
Related issue number
Checks