Support AWS Graviton instances #1586

Closed
WoosukKwon opened this issue Jan 13, 2023 · 12 comments
Labels
enhancement New feature or request Stale

Comments

@WoosukKwon
Collaborator

Now that Ray has started to provide PyPI wheels for ARM64 CPUs (ray-project/ray#31566), we can also add official support for AWS Graviton instances. In the future, we will also be able to support ARM machines in other clouds, such as GCP T2A and Azure Dpsv5.

@WoosukKwon WoosukKwon self-assigned this Jan 13, 2023
@franklsf95
Contributor

Woohoo!!

@WoosukKwon
Collaborator Author

Just a note: this is currently blocked by #1618 (because the ARM PyPI wheels are only available for Ray v2.2) and #1616 (because those are the only AMIs that support ARM instances).

@Michaelvll Michaelvll added the enhancement New feature or request label Jan 28, 2023
@romilbhardwaj
Collaborator

Bumping this - raised again by a user in #1885, and also useful for k8s dev work on Apple Silicon.

@franklsf95
Contributor

I should mention that Graviton and Apple Silicon are not the same architecture. Ray has an M1 build but not for Graviton.
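One practical way to see the difference on a given machine is Python's standard `platform` module: Linux on Graviton reports the machine as `aarch64`, while macOS on Apple Silicon reports `arm64`, and wheels are published under separate platform tags for each. The helper below is a minimal sketch; the function name and the returned labels are made up for illustration and are not part of Ray or SkyPilot.

```python
import platform


def describe_arm_platform(system: str, machine: str) -> str:
    """Classify an (OS, CPU) pair in the way that matters for picking wheels.

    The labels returned here are hypothetical and only used for illustration.
    """
    machine = machine.lower()
    if machine not in ("aarch64", "arm64"):
        return "not-arm64"
    if system == "Linux":
        return "linux-arm64"  # e.g. an AWS Graviton instance
    if system == "Darwin":
        return "macos-arm64"  # e.g. an Apple Silicon laptop
    return "other-arm64"


if __name__ == "__main__":
    # Report the platform of the machine this script runs on.
    print(describe_arm_platform(platform.system(), platform.machine()))
```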

@romilbhardwaj
Collaborator

With #1734 in, we should be able to support Graviton now. Ray 2.4.0 works out of the box (pip install ray) on a Graviton instance:

ubuntu@ip-172-31-56-38:~$ python3
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ray
>>> ray.init()
2023-05-26 19:07:15,178	INFO worker.py:1625 -- Started a local Ray instance.
RayContext(dashboard_url='', python_version='3.10.6', ray_version='2.4.0', ray_commit='4479f66d4db967d3c9dd0af2572061276ba926ba', address_info={'node_ip_address': '172.31.56.38', 'raylet_ip_address': '172.31.56.38', 'redis_address': None, 'object_store_address': '/tmp/ray/session_2023-05-26_19-07-12_299295_2702/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2023-05-26_19-07-12_299295_2702/sockets/raylet', 'webui_url': '', 'session_dir': '/tmp/ray/session_2023-05-26_19-07-12_299295_2702', 'metrics_export_port': 61955, 'gcs_address': '172.31.56.38:54315', 'address': '172.31.56.38:54315', 'dashboard_agent_listen_port': 52365, 'node_id': '5b0e1d985f8a06e0eaa70ece6813c40156b40554c5ad164aec80acb7'})
>>> ray.available_resources()
{'CPU': 1.0, 'object_store_memory': 1109234073.0, 'node:172.31.56.38': 1.0, 'memory': 2218468148.0}

@romilbhardwaj
Collaborator

romilbhardwaj commented May 26, 2023

It seems SkyPilot is trying to use an x86 AMI, which causes the launch to fail:

(base) ➜  ~ sky launch -c arm -t m7g.xlarge
...
create_instances: Attempt failed with An error occurred (InvalidParameterValue) when calling the RunInstances operation: The architecture 'arm64' of the specified instance type does not match the architecture 'x86_64' of the specified AMI. Specify an instance type and an AMI that have matching architectures, and try again. You can use 'describe-instance-types' or 'describe-images' to discover the architecture of the instance type or AMI., retrying.
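The error above points at `describe-instance-types` and `describe-images`. As a rough sketch of that check in Python, the pure function below compares the two response entries, and the second function shows how they could be fetched with boto3 (a third-party package that needs AWS credentials). The function names are hypothetical and this is not how SkyPilot itself selects AMIs.

```python
def architectures_match(instance_type_info: dict, image_info: dict) -> bool:
    """Return True if the AMI's architecture is supported by the instance type.

    Both arguments are single entries shaped like the EC2 API responses:
    one item of "InstanceTypes" and one item of "Images".
    """
    supported = instance_type_info.get("ProcessorInfo", {}).get(
        "SupportedArchitectures", []
    )
    return image_info.get("Architecture") in supported


def check_with_boto3(instance_type: str, image_id: str, region: str) -> bool:
    """Fetch both descriptions from EC2 and compare them (needs credentials)."""
    import boto3  # imported lazily so the pure check above has no dependencies

    ec2 = boto3.client("ec2", region_name=region)
    itype = ec2.describe_instance_types(InstanceTypes=[instance_type])[
        "InstanceTypes"
    ][0]
    image = ec2.describe_images(ImageIds=[image_id])["Images"][0]
    return architectures_match(itype, image)


if __name__ == "__main__":
    # Hand-written sample data mirroring the mismatch reported in the error.
    graviton = {
        "InstanceType": "m7g.xlarge",
        "ProcessorInfo": {"SupportedArchitectures": ["arm64"]},
    }
    x86_image = {"Architecture": "x86_64"}
    print(architectures_match(graviton, x86_image))
```

Running the check before calling `RunInstances` would surface the mismatch earlier than the retry loop does.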

Explicitly specifying an AMI fails while building psutil, probably because the AMI I tried (Ubuntu 22) doesn't come with gcc ([full log](https://gist.github.com/romilbhardwaj/2560441fad1eaca47f48f6c430b86073)). Unfortunately, there's no nice DL AMI for ARM instances on AWS yet:

$ sky launch -c arm -t m7g.xlarge --image-id ami-0a0c8eebcdd6dcbd0 --region us-east-1 --cloud aws

...
      psutil could not be installed from sources because gcc is not installed. Try running:
        sudo apt-get install gcc python3-dev
      error: command 'aarch64-linux-gnu-gcc' failed: No such file or directory
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for psutil
Successfully built pendulum
Failed to build psutil
ERROR: Could not build wheels for psutil, which is required to install pyproject.toml-based projects
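A source build like this one needs a C compiler on the image. The sketch below, using only the standard library, looks for one on `PATH` before attempting such an install. The function name and the candidate list are assumptions for illustration, and it does not check for the Python headers (`python3-dev`) that the log also mentions.

```python
import shutil
from typing import Callable, Optional, Sequence


def find_c_compiler(
    names: Sequence[str] = ("aarch64-linux-gnu-gcc", "gcc", "cc"),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first compiler found on PATH, or None.

    `which` is injectable so the lookup can be exercised without a real PATH.
    """
    for name in names:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    compiler = find_c_compiler()
    if compiler is None:
        print("No C compiler found; source builds such as psutil will fail.")
    else:
        print("Found C compiler at", compiler)
```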

@Michaelvll
Collaborator

Will the Graviton DLAMI described in the AWS docs work? https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-graviton.html

@github-actions

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.


This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Jan 24, 2024

github-actions bot commented Feb 4, 2024

This issue was closed because it has been stalled for 10 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Feb 4, 2024
@Michaelvll Michaelvll removed the Stale label Feb 4, 2024
@Michaelvll Michaelvll reopened this Feb 4, 2024
@github-actions github-actions bot added the Stale label Jun 4, 2024
@skypilot-org skypilot-org deleted a comment from github-actions bot Jun 4, 2024
@github-actions github-actions bot removed the Stale label Jun 5, 2024

github-actions bot commented Oct 3, 2024

This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Oct 3, 2024

This issue was closed because it has been stalled for 10 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 13, 2024

4 participants