Update GCP autoscaler to support v4-8 TPU nodes (#30065)
TPUv4s are now Generally Available, so there's no reason to exclude them.

When configuring TPU nodes for the GCP autoscaler, valid accelerator types are whitelisted. TPUv4-8 nodes are not in the whitelist,
even though they behave exactly the same as, e.g., TPUv3-8 nodes.

Signed-off-by: GitHub <[email protected]>
zygi authored Nov 9, 2022
1 parent 7563211 commit 45ffe6e
Showing 3 changed files with 4 additions and 4 deletions.
@@ -1207,7 +1207,7 @@ Full configuration
TPU Configuration
~~~~~~~~~~~~~~~~~

-It is possible to use `TPU VMs <https://cloud.google.com/tpu/docs/users-guide-tpu-vm>`_ on GCP. Currently, `TPU pods <https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#pods>`_ (TPUs other than v2-8 and v3-8) are not supported.
+It is possible to use `TPU VMs <https://cloud.google.com/tpu/docs/users-guide-tpu-vm>`_ on GCP. Currently, `TPU pods <https://cloud.google.com/tpu/docs/system-architecture-tpu-vm#pods>`_ (TPUs other than v2-8, v3-8 and v4-8) are not supported.

Before using a config with TPUs, ensure that the `TPU API is enabled for your GCP project <https://cloud.google.com/tpu/docs/users-guide-tpu-vm#enable_the_cloud_tpu_api>`_.

4 changes: 2 additions & 2 deletions python/ray/autoscaler/_private/gcp/config.py
@@ -69,9 +69,9 @@ def get_node_type(node: dict) -> GCPNodeType:

     if "machineType" not in node and "acceleratorType" in node:
         # remove after TPU pod support is added!
-        if node["acceleratorType"] not in ("v2-8", "v3-8"):
+        if node["acceleratorType"] not in ("v2-8", "v3-8", "v4-8"):
             raise ValueError(
-                "For now, only v2-8' and 'v3-8' accelerator types are "
+                "For now, only 'v2-8', 'v3-8' and 'v4-8' accelerator types are "
                 "supported. Support for TPU pods will be added in the future."
             )
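
The whitelist check in this hunk can be sketched in isolation roughly as follows. This is a minimal illustration, not the Ray implementation: `SUPPORTED_TPU_TYPES` and `check_accelerator_type` are hypothetical names introduced here, and the real `get_node_type` goes on to return a `GCPNodeType`, which is omitted.

```python
# Hypothetical constant; the actual code inlines this tuple in the condition.
SUPPORTED_TPU_TYPES = ("v2-8", "v3-8", "v4-8")


def check_accelerator_type(node: dict) -> None:
    """Raise ValueError if a TPU node config asks for an unsupported type.

    Mirrors the condition in the diff: a node config with an
    "acceleratorType" key but no "machineType" key is treated as a TPU node.
    """
    if "machineType" not in node and "acceleratorType" in node:
        if node["acceleratorType"] not in SUPPORTED_TPU_TYPES:
            raise ValueError(
                "For now, only 'v2-8', 'v3-8' and 'v4-8' accelerator types are "
                "supported. Support for TPU pods will be added in the future."
            )


# A single-host v4-8 node now passes the check.
check_accelerator_type({"acceleratorType": "v4-8"})

# A pod slice such as v4-32 is still rejected.
try:
    check_accelerator_type({"acceleratorType": "v4-32"})
except ValueError as err:
    print(f"rejected: {err}")
```

Keeping the allowed types in one tuple means that admitting another single-host type is a one-line change, which is essentially what this commit does.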

2 changes: 1 addition & 1 deletion python/ray/autoscaler/gcp/tpu.yaml
@@ -28,7 +28,7 @@ available_node_types:
         min_workers: 7
         resources: {"TPU": 1} # use TPU custom resource in your code
         node_config:
-            # Only v2-8 and v3-8 accelerator types are currently supported.
+            # Only v2-8, v3-8 and v4-8 accelerator types are currently supported.
             # Support for TPU pods will be added in the future.
             acceleratorType: v2-8
             runtimeVersion: v2-alpha