Add a new End-to-End tutorial in Serve that walks users through deploying a model #20765
Conversation
This is great! It's the right level of detail and explanation. Content-wise:
- trim down the modeling-specific code using Hugging Face pipelines
- add a FastAPI example for typed HTTP request handling
.. code-block:: python

    from transformers import AutoTokenizer, AutoModelWithLMHead

    def summarize(text, max_length=150):
        tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")
        model = AutoModelWithLMHead.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")

        input_ids = tokenizer.encode(text, return_tensors="pt", add_special_tokens=True)
        generated_ids = model.generate(input_ids=input_ids, num_beams=2, max_length=max_length,
                                       repetition_penalty=2.5, length_penalty=1.0, early_stopping=True)
        preds = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True)
                 for g in generated_ids]

        return preds[0]
can we just use https://huggingface.co/transformers/main_classes/pipelines.html#transformers.SummarizationPipeline to make this one line?
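For illustration, a rough sketch of that one-liner suggestion, assuming the `transformers` summarization pipeline and the same model checkpoint as the snippet above (not the PR's final code):

```python
# Sketch: the summarization pipeline wraps the tokenizer and model
# behind a single callable, collapsing the modeling code above.
from transformers import pipeline

summarizer = pipeline("summarization", model="mrm8488/t5-base-finetuned-summarize-news")

def summarize(text, max_length=150):
    # The pipeline returns a list of dicts like [{"summary_text": "..."}].
    return summarizer(text, max_length=max_length)[0]["summary_text"]
```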
The ``ray`` and ``ray serve`` libraries give us access to Ray Serve's deployments, so we can serve our model over HTTP. The ``requests`` library lets us send HTTP requests to the deployed model:

.. code-block:: python
use literalinclude to make the script testable https://github.com/ray-project/ray/blob/master/doc/source/serve/pipeline.rst#basic-api
``ray.init()`` will start a single-node Ray cluster on your local machine, which will allow you to use all your CPU cores to serve requests in parallel. To start a multi-node cluster, see :doc:`../cluster/index`.
Use :ref:`serve-deploy-tutorial` instead; this will link to https://github.com/ray-project/ray/blob/master/doc/source/serve/deployment.rst and provide a stable link.
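For reference, a minimal sketch of the step quoted above (starting a local cluster with `ray.init()`, starting Serve, and querying over HTTP); route and parameter names are illustrative, not the tutorial's exact script:

```python
# Sketch: start a local single-node Ray cluster, start Serve on top of it,
# and query a deployment over HTTP with requests.
import ray
import requests
from ray import serve

ray.init()      # single-node Ray cluster on this machine
serve.start()   # start the Serve instance

# ... define and deploy a summarization deployment here ...

# Once deployed, query it over HTTP (route name is illustrative):
# response = requests.get("http://127.0.0.1:8000/summarize", params={"txt": "some long article"})
# print(response.text)
```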
@triciasfu @edoakes ping for review!
When the Python script exits, Ray Serve will shut down. If you would rather keep Ray Serve running in the background you can use ``serve.start(detached=True)`` (see :doc:`deployment` for details).
We should probably use `serve.start(detached=True)` as the default in the code, and then add a snippet about how users can also use `serve.start()` [without detached], since that's the primary use case.
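Roughly, as a sketch of that default (assuming the `serve.start` API of this Ray version):

```python
# Sketch: start Serve detached so it outlives the deployment script.
import ray
from ray import serve

ray.init(address="auto")    # connect to a cluster started with `ray start --head`
serve.start(detached=True)  # Serve keeps running after this script exits

# Non-detached alternative, which shuts down when the script exits:
# serve.start()
```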
When the Python script exits, Ray Serve will shut down. If you would rather keep Ray Serve running in the background you can use ``serve.start(detached=True)`` (see :doc:`deployment` for details).
Same comment as above - default to detached=True
We can achieve this by converting our ``summarize`` function into a class:

.. code-block:: python
We should also include FastAPI in this tutorial!
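A hedged sketch of what that could look like with Serve's FastAPI integration; the route, parameter name, and model checkpoint are carried over from the snippets above rather than taken from the PR:

```python
# Sketch: FastAPI ingress for typed HTTP request handling with Ray Serve.
import ray
from fastapi import FastAPI
from ray import serve
from transformers import pipeline

app = FastAPI()

ray.init(address="auto")
serve.start(detached=True)


@serve.deployment(route_prefix="/")
@serve.ingress(app)
class Summarizer:
    def __init__(self):
        # Load the model once per replica and keep it in memory.
        self.summarizer = pipeline("summarization", model="mrm8488/t5-base-finetuned-summarize-news")

    @app.get("/summarize")
    def summarize(self, txt: str) -> str:
        # FastAPI parses and validates the `txt` query parameter.
        return self.summarizer(txt, max_length=150)[0]["summary_text"]


Summarizer.deploy()
```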
``tokenizer`` and ``model`` only once and keep their values in memory instead of reloading them upon each HTTP query.

We can achieve this by converting our ``summarize`` function into a class:
Same comment as above here:
We probably want to add 1-2 sentences on how to run this (e.g., to deploy, first start Ray by running `ray start --head`, then run your Python script: `python example.py`).
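For reference, a rough illustration of the class conversion described in the quoted lines above, including a note on how to run it (assumes the deployment API of this Ray version; not necessarily the PR's exact code):

```python
# Sketch: the summarize logic as a class-based deployment, so the tokenizer
# and model are loaded once per replica instead of on every request.
# To run (illustrative): `ray start --head`, then `python example.py`.
from ray import serve
from transformers import AutoModelWithLMHead, AutoTokenizer


@serve.deployment
class Summarizer:
    def __init__(self):
        self.tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")
        self.model = AutoModelWithLMHead.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")

    def summarize(self, text, max_length=150):
        input_ids = self.tokenizer.encode(text, return_tensors="pt", add_special_tokens=True)
        generated_ids = self.model.generate(
            input_ids=input_ids, num_beams=2, max_length=max_length,
            repetition_penalty=2.5, length_penalty=1.0, early_stopping=True,
        )
        return self.tokenizer.decode(
            generated_ids[0], skip_special_tokens=True, clean_up_tokenization_spaces=True
        )

    async def __call__(self, request):
        # Serve passes a Starlette request; read the text from a query parameter.
        return self.summarize(request.query_params["txt"])


# Deploy (requires ray.init() and serve.start() to have been called first).
Summarizer.deploy()
```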
Co-authored-by: Simon Mo <[email protected]>
====================================
End-to-End Model Deployment Tutorial
Should this be called "End-to-End Tutorial" instead of "End-to-End Model Deployment Tutorial" to match the current E2E tutorial title?
LGTM. Awesome work
doc/source/serve/index.rst (outdated)

Ray Serve Quickstart
====================

- Ray Serve supports Python versions 3.6 through 3.8. To install Ray Serve, run the following command:
+ Ray Serve supports Python versions 3.6 through 3.9.
maybe just link to Ray's support matrix: https://docs.ray.io/en/master/installation.html
Why are these changes needed?

Currently, the docs have an end-to-end tutorial walking users through deploying a `Counter` function on Serve. This PR adds an end-to-end tutorial walking users through deploying an entire Hugging Face model using Serve, providing a better understanding of how to deploy an actual model via Serve.

Related issue number

Closes #19250.
Checks

- I've run `scripts/format.sh` to lint the changes in this PR.