[Doc] update workspace templates (ray-project#34289)
Signed-off-by: Sofian Hnaide <[email protected]>
sofianhnaide authored and vitsai committed Apr 17, 2023
1 parent 1410e21 commit f9f96fb
Showing 9 changed files with 86 additions and 44 deletions.
28 changes: 28 additions & 0 deletions doc/source/templates/01_batch_inference/README.md
@@ -0,0 +1,28 @@
# Scaling Batch Inference with Ray Data

This template is a quickstart to using [Ray
Data](https://docs.ray.io/en/latest/data/dataset.html) for batch
inference. Ray Data is one of many libraries under the [Ray AI
Runtime](https://docs.ray.io/en/latest/ray-air/getting-started.html).
See [this blog
post](https://www.anyscale.com/blog/model-batch-inference-in-ray-actors-actorpool-and-datasets)
for more information on why and how you should perform batch inference
with Ray!

This template walks through GPU batch prediction on an image dataset
using a PyTorch model, but the framework and data format are there just
to help you build your own application!

At a high level, this template will (see the sketch after the list):

1. [Load your dataset using Ray
Data.](https://docs.ray.io/en/latest/data/creating-datasets.html)
2. [Preprocess your dataset before feeding it to your
model.](https://docs.ray.io/en/latest/data/transforming-datasets.html)
3. [Initialize your model and perform inference on a shard of your
dataset with a remote
actor.](https://docs.ray.io/en/latest/data/transforming-datasets.html#callable-class-udfs)
4. [Save your prediction
results.](https://docs.ray.io/en/latest/data/api/input_output.html)
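
The numbered steps above map onto a short Ray Data sketch like the one below. This is a minimal, hedged example rather than the template's exact notebook: the input path, the ResNet-50 model, and the actor-pool size are illustrative placeholders to swap for your own data and model.

```python
import numpy as np
import ray
import torch
from torchvision import models, transforms

# 1. Load an image dataset with Ray Data (the path is a placeholder).
ds = ray.data.read_images("s3://your-bucket/images/")

# 2. Preprocess each batch before it reaches the model.
preprocessor = transforms.Compose(
    [transforms.ToTensor(), transforms.Resize((224, 224), antialias=True)]
)

def preprocess(batch: dict) -> dict:
    batch["image"] = np.stack([preprocessor(img).numpy() for img in batch["image"]])
    return batch

# 3. A callable class runs in a pool of Ray actors, so the model is
#    initialized once per actor and reused across batches.
class Predictor:
    def __init__(self):
        self.model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

    def __call__(self, batch: dict) -> dict:
        with torch.no_grad():
            logits = self.model(torch.as_tensor(batch["image"]))
        return {"prediction": logits.argmax(dim=1).numpy()}

predictions = ds.map_batches(preprocess, batch_format="numpy").map_batches(
    Predictor,
    compute=ray.data.ActorPoolStrategy(min_size=1, max_size=4),
    num_gpus=1,  # drop this to run the sketch on CPU
    batch_format="numpy",
)

# 4. Save the prediction results (the output path is also a placeholder).
predictions.write_parquet("/tmp/predictions")
```

Because `Predictor` is a callable class running in an actor pool (step 3), the model weights are loaded once per actor and reused across batches instead of being re-initialized for every batch.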

Start coding by clicking on the Jupyter or VSCode icon above.
26 changes: 26 additions & 0 deletions doc/source/templates/02_many_model_training/README.md
@@ -0,0 +1,26 @@
# Scaling Many Model Training with Ray Tune

This template is a quickstart to using [Ray
Tune](https://docs.ray.io/en/latest/tune/index.html) for many model
training. Ray Tune is one of many libraries under the [Ray AI
Runtime](https://docs.ray.io/en/latest/ray-air/getting-started.html).
See [this blog
post](https://www.anyscale.com/blog/training-one-million-machine-learning-models-in-record-time-with-ray)
for more information on the benefits of performing many model training
with Ray!

This template walks through time-series forecasting using
`statsforecast`, but the framework and data format can be swapped out
easily; they are there just

At a high level, this template will (see the sketch after the list):

1. [Define the training function for a single partition of
data.](https://docs.ray.io/en/latest/tune/tutorials/tune-run.html)
2. [Define a Tune search space to run training over many partitions of
data.](https://docs.ray.io/en/latest/tune/tutorials/tune-search-spaces.html)
3. [Extract the best model per dataset partition from the Tune
experiment
output.](https://docs.ray.io/en/latest/tune/examples/tune_analyze_results.html)
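
As a rough illustration of these three steps, here is a minimal, self-contained sketch. It swaps `statsforecast` for a toy moving-average "model" on synthetic per-partition data, so only the Ray Tune mechanics carry over; the partition count, the `window` search space, and the `load_partition` helper are placeholders. It uses the `ray.air.session.report` API from the Ray releases this commit targets; the reporting entry point has since moved in newer Ray versions.

```python
import numpy as np
import pandas as pd
from ray import tune
from ray.air import session

# Placeholder data loader: one synthetic time series per partition id.
def load_partition(partition_id: int) -> pd.DataFrame:
    rng = np.random.default_rng(partition_id)
    return pd.DataFrame({"y": rng.normal(size=100).cumsum()})

# 1. Training function for a single partition of data.
def train_one_partition(config: dict) -> None:
    df = load_partition(config["partition_id"])
    # Stand-in "model": a trailing moving average (the template uses statsforecast).
    forecast = df["y"].rolling(config["window"]).mean().shift(1)
    mse = float(((df["y"] - forecast) ** 2).dropna().mean())
    session.report({"mse": mse})

# 2. A search space that fans training out over every partition,
#    plus one randomly sampled model hyperparameter per trial.
param_space = {
    "partition_id": tune.grid_search(list(range(8))),
    "window": tune.choice([3, 5, 10]),
}

tuner = tune.Tuner(train_one_partition, param_space=param_space)
results = tuner.fit()

# 3. Extract the best result per dataset partition from the experiment output.
df = results.get_dataframe()
best_per_partition = df.sort_values("mse").groupby("config/partition_id").first()
print(best_per_partition[["mse", "config/window"]])
```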

Start coding by clicking on the Jupyter or VSCode icon above.
7 changes: 7 additions & 0 deletions doc/source/templates/03_serving_stable_diffusion/README.md
@@ -0,0 +1,7 @@
# Serving a Stable Diffusion Model with Ray Serve

This template is a quickstart to using [Ray Serve](https://docs.ray.io/en/latest/serve/index.html) for model serving. Ray Serve is one of many libraries under the [Ray AI Runtime](https://docs.ray.io/en/latest/ray-air/getting-started.html).

This template loads a pretrained stable diffusion model from HuggingFace and serves it to a local endpoint as a Ray Serve deployment.
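
A minimal sketch of that same flow is shown below, assuming the `diffusers`, `torch`, and `ray[serve]` packages are available and a GPU is attached; the Hugging Face model ID and the `/imagine` route are illustrative choices, not fixed by the template.

```python
import io

import torch
from diffusers import StableDiffusionPipeline
from fastapi import FastAPI, Response
from ray import serve

app = FastAPI()

@serve.deployment(ray_actor_options={"num_gpus": 1})
@serve.ingress(app)
class StableDiffusion:
    def __init__(self):
        # Load a pretrained Stable Diffusion pipeline from Hugging Face
        # (the model ID is illustrative).
        self.pipe = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
        ).to("cuda")

    @app.get("/imagine")
    def generate(self, prompt: str) -> Response:
        # Generate one image and return it as a PNG payload.
        image = self.pipe(prompt).images[0]
        buffer = io.BytesIO()
        image.save(buffer, format="PNG")
        return Response(content=buffer.getvalue(), media_type="image/png")

# Deploy to a local endpoint on the running Ray cluster.
serve.run(StableDiffusion.bind())
```

Once `serve.run` returns, a request such as `GET http://localhost:8000/imagine?prompt=a%20scenic%20mountain` responds with a generated PNG.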

Start coding by clicking on the Jupyter or VSCode icon above.
7 changes: 6 additions & 1 deletion doc/source/templates/configs/anyscale_cluster_env.yaml
@@ -1 +1,6 @@
-docker_image: anyscale/ray-ml:latest-py39
+# you can define a custom byod definition e.g.
+# byod:
+#   docker_image: anyscale/ray-ml:latest-py39
+#   ray_version: nightly
+# or define a build_id for existing images, e.g. for "anyscaleray-ml231-py39-gpu"
+build_id: "anyscaleray-mlnightly-py39-gpu"
3 changes: 0 additions & 3 deletions doc/source/templates/configs/compute/cpu/aws_large.yaml
@@ -1,6 +1,3 @@
-cloud_id: {{ env["ANYSCALE_CLOUD_ID"] }}
-region: us-west-2
-
 # 8 m5.2xlarge nodes --> 64 CPUs
 head_node_type:
   name: head_node_type
3 changes: 0 additions & 3 deletions doc/source/templates/configs/compute/cpu/gcp_large.yaml
@@ -1,6 +1,3 @@
-cloud_id: {{ env["ANYSCALE_CLOUD_ID"] }}
-region: us-west1
-
 # 8 n2-standard-8 nodes --> 64 CPUs
 head_node_type:
   name: head_node_type
3 changes: 0 additions & 3 deletions doc/source/templates/configs/compute/gpu/aws_large.yaml
@@ -1,6 +1,3 @@
-cloud_id: {{ env["ANYSCALE_CLOUD_ID"] }}
-region: us-west-2
-
 # 4 g4dn.4xlarge nodes --> 64 CPUs, 4 GPUs
 head_node_type:
   name: head_node_type
3 changes: 0 additions & 3 deletions doc/source/templates/configs/compute/gpu/gcp_large.yaml
@@ -1,6 +1,3 @@
-cloud_id: {{ env["ANYSCALE_CLOUD_ID"] }}
-region: us-west1
-
 # 4 n1-standard-16-nvidia-tesla-t4-1 nodes --> 64 CPUs, 4 GPUs
 head_node_type:
   name: head_node_type
50 changes: 19 additions & 31 deletions doc/source/templates/templates.yaml
@@ -1,37 +1,25 @@
 # See README.md for more details.
-
-- name: Scaling Batch Inference with Ray Data
+# Update anyscale/backend/workspace-template.yaml
+batch-inference-ray-data:
+  title: Batch Inference
+  description: Scaling Batch Inference with Ray Data
   path: doc/source/templates/01_batch_inference
   cluster_env: doc/source/templates/configs/anyscale_cluster_env.yaml
-  small:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/gpu/gcp_small.yaml
-      aws: doc/source/templates/configs/compute/gpu/aws_small.yaml
-  large:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/gpu/gcp_large.yaml
-      aws: doc/source/templates/configs/compute/gpu/aws_large.yaml
-
-- name: Scaling Many Model Training with Ray Tune
+  compute_config:
+    GCP: doc/source/templates/configs/compute/gpu/gcp_large.yaml
+    AWS: doc/source/templates/configs/compute/gpu/aws_large.yaml
+many-model-training-ray-tune:
+  title: Many Model Training
+  description: Scaling Many Model Training with Ray Tune
   path: doc/source/templates/02_many_model_training
   cluster_env: doc/source/templates/configs/anyscale_cluster_env.yaml
-  small:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/cpu/gcp_small.yaml
-      aws: doc/source/templates/configs/compute/cpu/aws_small.yaml
-  large:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/cpu/gcp_large.yaml
-      aws: doc/source/templates/configs/compute/cpu/aws_large.yaml
-
-- name: Serving a Stable Diffusion Model with Ray Serve
+  compute_config:
+    GCP: doc/source/templates/configs/compute/cpu/gcp_large.yaml
+    AWS: doc/source/templates/configs/compute/cpu/aws_large.yaml
+serve-stable-diffusion-model-ray-serve:
+  title: Serving Stable Diffusion
+  description: Serving a Stable Diffusion Model with Ray Serve
   path: doc/source/templates/03_serving_stable_diffusion
   cluster_env: doc/source/templates/configs/anyscale_cluster_env.yaml
-  small:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/cpu/gcp_small.yaml
-      aws: doc/source/templates/configs/compute/cpu/aws_small.yaml
-  large:
-    compute_config:
-      gcp: doc/source/templates/configs/compute/cpu/gcp_large.yaml
-      aws: doc/source/templates/configs/compute/cpu/aws_large.yaml
+  compute_config:
+    GCP: doc/source/templates/configs/compute/cpu/gcp_large.yaml
+    AWS: doc/source/templates/configs/compute/cpu/aws_large.yaml
