[Serve] Migrate from Flask.Request to Starlette Request (#12852)
architkulkarni authored Dec 21, 2020
1 parent 5b48480 commit 8b4b4bf
Showing 26 changed files with 140 additions and 174 deletions.
37 changes: 15 additions & 22 deletions doc/source/serve/faq.rst
@@ -117,33 +117,33 @@ policies <serve-split-traffic>`, finding the next available replica, and
batching requests together.

When the request arrives in the model, you can access the data similarly to how
-you would with HTTP request. Here are some examples how ServeRequest mirrors Flask.Request:
+you would with an HTTP request. Here are some examples of how ServeRequest mirrors Starlette.Request:

.. list-table::
:header-rows: 1

* - HTTP
- ServeHandle
- | Request
-| (Flask.Request and ServeRequest)
+| (Starlette.Request and ServeRequest)
* - ``requests.get(..., headers={...})``
- ``handle.options(http_headers={...})``
- ``request.headers``
* - ``requests.post(...)``
- ``handle.options(http_method="POST")``
-- ``requests.method``
-* - ``request.get(..., json={...})``
+- ``request.method``
+* - ``requests.get(..., json={...})``
- ``handle.remote({...})``
-- ``request.json``
-* - ``request.get(..., form={...})``
+- ``await request.json()``
+* - ``requests.get(..., form={...})``
- ``handle.remote({...})``
-- ``request.form``
-* - ``request.get(..., params={"a":"b"})``
+- ``await request.form()``
+* - ``requests.get(..., params={"a":"b"})``
- ``handle.remote(a="b")``
-- ``request.args``
-* - ``request.get(..., data="long string")``
+- ``request.query_params``
+* - ``requests.get(..., data="long string")``
- ``handle.remote("long string")``
-- ``request.data``
+- ``await request.body()``
* - ``N/A``
- ``handle.remote(python_object)``
- ``request.data``
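The rows above can be exercised without a running Serve cluster. The sketch below is illustrative only: ``FakeStarletteRequest`` is a hypothetical stand-in for ``starlette.requests.Request`` (not a real Serve or Starlette class). It mimics the key difference the table shows: ``method``, ``headers``, and ``query_params`` are plain attributes, while the body accessors are coroutines that must be awaited.

```python
import asyncio
import json

class FakeStarletteRequest:
    # Hypothetical stand-in for starlette.requests.Request (illustrative
    # only). Request metadata is available as plain attributes; the body
    # accessors are coroutines and must be awaited.
    def __init__(self, method, headers, query_params, raw_body):
        self.method = method                # mirrors request.method
        self.headers = headers              # mirrors request.headers
        self.query_params = query_params    # mirrors request.query_params
        self._raw_body = raw_body

    async def body(self):                   # mirrors await request.body()
        return self._raw_body

    async def json(self):                   # mirrors await request.json()
        return json.loads(self._raw_body)

async def handler(request):
    # A backend-style handler reading each field named in the table.
    return {
        "method": request.method,
        "name": request.query_params.get("name", "serve!"),
        "payload": await request.json(),
    }

request = FakeStarletteRequest(
    method="POST",
    headers={"content-type": "application/json"},
    query_params={"name": "ray"},
    raw_body=b'{"number": 42}',
)
result = asyncio.run(handler(request))
print(result)  # {'method': 'POST', 'name': 'ray', 'payload': {'number': 42}}
```

The same handler works for ServeRequest-style objects as long as they expose the awaitable accessors, which is what lets one backend serve both web and Python callers.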
@@ -157,9 +157,9 @@ you would with HTTP request. Here are some examples how ServeRequest mirrors Fla

.. code-block:: python
-import flask
+import starlette.requests
-if isinstance(request, flask.Request):
+if isinstance(request, starlette.requests.Request):
print("Request coming from web!")
elif isinstance(request, ServeRequest):
print("Request coming from Python!")
@@ -170,10 +170,10 @@ you would with HTTP request. Here are some examples how ServeRequest mirrors Fla

.. code-block:: python
-handle.remote(flask_request)
+handle.remote(starlette_request)
In this case, Serve will `not` wrap it in ServeRequest. You can directly
-process the request as a ``flask.Request``.
+process the request as a ``starlette.requests.Request``.

How fast is Ray Serve?
----------------------
@@ -187,13 +187,6 @@ You can checkout our `microbenchmark instruction <https://github.com/ray-project
to benchmark on your hardware.


-Does Ray Serve use Flask?
--------------------------
-Flask is only used as a web request object for servable to consume the data.
-We actually use the fastest Python web server: `Uvicorn <https://www.uvicorn.org/>`_ as our web server,
-alongside with the power of Python asyncio.
-**Flask is ONLY the request object that we are using, Uvicorn (not flask) provides the webserver.**

Can I use asyncio along with Ray Serve?
---------------------------------------
Yes! You can make your servable methods ``async def`` and Serve will run them
3 changes: 3 additions & 0 deletions doc/source/serve/index.rst
@@ -33,6 +33,9 @@ Since Serve is built on Ray, it also allows you to scale to many machines, in yo
If you want to try out Serve, join our `community slack <https://forms.gle/9TSdDYUgxYs8SA9e8>`_
and discuss in the #serve channel.

+.. note::
+  Starting with Ray version 1.3.0, Ray Serve backends must take in a Starlette Request object instead of a Flask Request object.
+  See the `migration guide <https://docs.google.com/document/d/1CG4y5WTTc4G_MRQGyjnb_eZ7GK3G9dUX6TNLKLnKRAc/edit?usp=sharing>`_ for details.
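In practice the migration is mostly a change of handler signature and body access. A minimal before/after sketch; the ``StubRequest`` class below is a made-up placeholder (not a Serve or Starlette API) so the snippet runs without Starlette installed:

```python
import asyncio
import json

# Pre-1.3.0 style (Flask): synchronous handler, attribute-style body access.
#     def __call__(self, flask_request):
#         payload = flask_request.json
#
# 1.3.0+ style (Starlette): the handler may be async, and the body is read
# by awaiting a coroutine.
class Backend:
    async def __call__(self, starlette_request):
        payload = await starlette_request.json()
        return {"echo": payload}

class StubRequest:
    # Made-up placeholder mimicking Starlette's awaitable .json() accessor.
    def __init__(self, raw_body):
        self._raw_body = raw_body

    async def json(self):
        return json.loads(self._raw_body)

result = asyncio.run(Backend()(StubRequest(b'{"a": 1}')))
print(result)  # {'echo': {'a': 1}}
```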

Installation
============
8 changes: 3 additions & 5 deletions doc/source/serve/key-concepts.rst
@@ -19,10 +19,8 @@ Backends
Backends define the implementation of your business logic or models that will handle requests when queries come in to :ref:`serve-endpoint`.
In order to support seamless scalability backends can have many replicas, which are individual processes running in the Ray cluster to handle requests.
To define a backend, first you must define the "handler" or the business logic you'd like to respond with.
-The handler should take as input a `Flask Request object <https://flask.palletsprojects.com/en/1.1.x/api/?highlight=request#flask.Request>`_.
-The handler should return any JSON-serializable object as output. For a more customizable response type, the handler may return a
+The handler should take as input a `Starlette Request object <https://www.starlette.io/requests/>`_ and return any JSON-serializable object as output. For a more customizable response type, the handler may return a
`Starlette Response object <https://www.starlette.io/responses/>`_.
-In the future, Ray Serve will support `Starlette Request objects <https://www.starlette.io/requests/>`_ as input as well.

A backend is defined using :mod:`client.create_backend <ray.serve.api.Client.create_backend>`, and the implementation can be defined as either a function or a class.
Use a function when your response is stateless and a class when you might need to maintain some state (like a model).
@@ -32,15 +30,15 @@ A backend consists of a number of *replicas*, which are individual copies of the

.. code-block:: python
-def handle_request(flask_request):
+def handle_request(starlette_request):
return "hello world"
class RequestHandler:
# Take the message to return as an argument to the constructor.
def __init__(self, msg):
self.msg = msg
-def __call__(self, flask_request):
+def __call__(self, starlette_request):
return self.msg
client.create_backend("simple_backend", handle_request)
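Called directly with a placeholder request (outside of Serve), the two handler shapes above behave as the paragraph describes: the function is stateless, while each class instance carries the state given to its constructor. A small sketch:

```python
def handle_request(starlette_request):
    return "hello world"

class RequestHandler:
    # Take the message to return as an argument to the constructor.
    def __init__(self, msg):
        self.msg = msg

    def __call__(self, starlette_request):
        return self.msg

# Direct invocation, standing in for what a backend replica does per request.
print(handle_request(None))           # hello world
handler = RequestHandler("hello from a replica")
print(handler(None))                  # hello from a replica
```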
2 changes: 1 addition & 1 deletion doc/source/serve/package-ref.rst
@@ -23,7 +23,7 @@ Handle API
:members: remote, options

When calling from Python, the backend implementation will receive ``ServeRequest``
-objects instead of Flask requests.
+objects instead of Starlette requests.

.. autoclass:: ray.serve.utils.ServeRequest
:members:
10 changes: 5 additions & 5 deletions doc/source/serve/tutorials/batch.rst
@@ -30,13 +30,13 @@ You can use the ``@serve.accept_batch`` decorator to annotate a function or a cl
This annotation is needed because batched backends have different APIs compared
to single request backends. In a batched backend, the inputs are a list of values.

-For single query backend, the input type is a single Flask request or
+For a single query backend, the input type is a single Starlette request or
:mod:`ServeRequest <ray.serve.utils.ServeRequest>`:

.. code-block:: python
def single_request(
-request: Union[Flask.Request, ServeRequest],
+request: Union[starlette.requests.Request, ServeRequest],
):
pass
@@ -47,7 +47,7 @@ types:
@serve.accept_batch
def batched_request(
-request: List[Union[Flask.Request, ServeRequest]],
+request: List[Union[starlette.requests.Request, ServeRequest]],
):
pass
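The difference between the two signatures can be sketched with a hypothetical stub request (illustrative only; ``@serve.accept_batch`` itself is omitted, since it only marks the backend as batch-capable). A batched backend receives a list of requests and should return one result per request, in order:

```python
class StubRequest:
    # Hypothetical stand-in for a Starlette request / ServeRequest.
    def __init__(self, number):
        self.query_params = {"number": str(number)}

# Single-query backend: one request in, one result out.
def single_request(request):
    return int(request.query_params["number"]) + 1

# Batched backend: a list of requests in, a list of results out,
# one per request and in the same order.
def batched_request(requests):
    return [int(r.query_params["number"]) + 1 for r in requests]

print(single_request(StubRequest(1)))                       # 2
print(batched_request([StubRequest(n) for n in range(3)]))  # [1, 2, 3]
```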
@@ -84,8 +84,8 @@ Ray Serve was able to evaluate them in batches.

What if you want to evaluate a whole batch in Python? Ray Serve allows you to send
queries via the Python API. A batch of queries can either come from the web server
-or the Python API. Requests coming from the Python API will have the similar API
-as Flask.Request. See more on the API :ref:`here<serve-handle-explainer>`.
+or the Python API. Requests coming from the Python API will have a similar API
+to Starlette Request. See more on the API :ref:`here<serve-handle-explainer>`.

.. literalinclude:: ../../../../python/ray/serve/examples/doc/tutorial_batch.py
:start-after: __doc_define_servable_v1_begin__
2 changes: 1 addition & 1 deletion python/ray/serve/examples/doc/quickstart_class.py
@@ -10,7 +10,7 @@ class Counter:
def __init__(self):
self.count = 0

-def __call__(self, flask_request):
+def __call__(self, starlette_request):
self.count += 1
return {"current_counter": self.count}

4 changes: 2 additions & 2 deletions python/ray/serve/examples/doc/quickstart_function.py
@@ -6,8 +6,8 @@
client = serve.start()


-def echo(flask_request):
-return "hello " + flask_request.args.get("name", "serve!")
+def echo(starlette_request):
+return "hello " + starlette_request.query_params.get("name", "serve!")


client.create_backend("hello", echo)
10 changes: 5 additions & 5 deletions python/ray/serve/examples/doc/snippet_model_composition.py
@@ -16,13 +16,13 @@


def model_one(request):
-print("Model 1 called with data ", request.args.get("data"))
+print("Model 1 called with data ", request.query_params.get("data"))
return random()


def model_two(request):
-print("Model 2 called with data ", request.args.get("data"))
-return request.args.get("data")
+print("Model 2 called with data ", request.query_params.get("data"))
+return request.query_params.get("data")


class ComposedModel:
@@ -32,8 +32,8 @@ def __init__(self):
self.model_two = client.get_handle("model_two")

# This method can be called concurrently!
-async def __call__(self, flask_request):
-data = flask_request.data
+async def __call__(self, starlette_request):
+data = await starlette_request.body()

score = await self.model_one.remote(data=data)
if score > 0.5:
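The composition pattern in the snippet above — await a score from one model, then conditionally call a second — can be sketched without Ray using plain coroutines. The model functions and the score below are made up for illustration; real Serve handles are invoked via ``.remote()``:

```python
import asyncio

# Made-up async stand-ins for awaiting Serve handle calls.
async def model_one(data):
    return 0.7  # pretend confidence score

async def model_two(data):
    return data

async def composed(data):
    # Await the first model, then route to the second based on its score.
    score = await model_one(data)
    if score > 0.5:
        result = await model_two(data)
        return {"model_used": 2, "result": result}
    return {"model_used": 1, "score": score}

print(asyncio.run(composed("payload")))  # {'model_used': 2, 'result': 'payload'}
```

Because each await yields control, a single replica can run many such compositions concurrently, which is why the example marks ``__call__`` as async.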
8 changes: 5 additions & 3 deletions python/ray/serve/examples/doc/tutorial_batch.py
@@ -14,8 +14,10 @@

# __doc_define_servable_v0_begin__
@serve.accept_batch
-def batch_adder_v0(flask_requests: List):
-numbers = [int(request.args["number"]) for request in flask_requests]
+def batch_adder_v0(starlette_requests: List):
+numbers = [
+int(request.query_params["number"]) for request in starlette_requests
+]

input_array = np.array(numbers)
print("Our input array has shape:", input_array.shape)
@@ -58,7 +60,7 @@ def send_query(number):
# __doc_define_servable_v1_begin__
@serve.accept_batch
def batch_adder_v1(requests: List):
-numbers = [int(request.args["number"]) for request in requests]
+numbers = [int(request.query_params["number"]) for request in requests]
input_array = np.array(numbers)
print("Our input array has shape:", input_array.shape)
# Sleep for 200ms, this could be performing CPU intensive computation
12 changes: 6 additions & 6 deletions python/ray/serve/examples/doc/tutorial_deploy.py
@@ -48,9 +48,9 @@ def __init__(self):
with open("/tmp/iris_labels.json") as f:
self.label_list = json.load(f)

-def __call__(self, flask_request):
-payload = flask_request.json
-print("Worker: received flask request with data", payload)
+async def __call__(self, starlette_request):
+payload = await starlette_request.json()
+print("Worker: received starlette request with data", payload)

input_vector = [
payload["sepal length"],
@@ -143,9 +143,9 @@ def __init__(self):
with open("/tmp/iris_labels_2.json") as f:
self.label_list = json.load(f)

-def __call__(self, flask_request):
-payload = flask_request.json
-print("Worker: received flask request with data", payload)
+async def __call__(self, starlette_request):
+payload = await starlette_request.json()
+print("Worker: received starlette request with data", payload)

input_vector = [
payload["sepal length"],
4 changes: 2 additions & 2 deletions python/ray/serve/examples/doc/tutorial_pytorch.py
@@ -27,8 +27,8 @@ def __init__(self):
mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

-def __call__(self, flask_request):
-image_payload_bytes = flask_request.data
+async def __call__(self, starlette_request):
+image_payload_bytes = await starlette_request.body()
pil_image = Image.open(BytesIO(image_payload_bytes))
print("[1/3] Parsed image data: {}".format(pil_image))

6 changes: 3 additions & 3 deletions python/ray/serve/examples/doc/tutorial_sklearn.py
@@ -54,9 +54,9 @@ def __init__(self):
with open(LABEL_PATH) as f:
self.label_list = json.load(f)

-def __call__(self, flask_request):
-payload = flask_request.json
-print("Worker: received flask request with data", payload)
+async def __call__(self, starlette_request):
+payload = await starlette_request.json()
+print("Worker: received starlette request with data", payload)

input_vector = [
payload["sepal length"],
4 changes: 2 additions & 2 deletions python/ray/serve/examples/doc/tutorial_tensorflow.py
@@ -51,10 +51,10 @@ def __init__(self, model_path):
self.model_path = model_path
self.model = tf.keras.models.load_model(model_path)

-def __call__(self, flask_request):
+async def __call__(self, starlette_request):
# Step 1: transform HTTP request -> tensorflow input
# Here we define the request schema to be a json array.
-input_array = np.array(flask_request.json["array"])
+input_array = np.array((await starlette_request.json())["array"])
reshaped_array = input_array.reshape((1, 28, 28))

# Step 2: tensorflow input -> tensorflow output
4 changes: 2 additions & 2 deletions python/ray/serve/examples/echo.py
@@ -9,8 +9,8 @@
from ray import serve


-def echo(flask_request):
-return ["hello " + flask_request.args.get("name", "serve!")]
+def echo(starlette_request):
+return ["hello " + starlette_request.query_params.get("name", "serve!")]


client = serve.start()
7 changes: 4 additions & 3 deletions python/ray/serve/examples/echo_actor.py
@@ -1,6 +1,6 @@
"""
Example actor that adds an increment to a number. This number can
-come from either web (parsing Flask request) or python call.
+come from either web (parsing Starlette request) or python call.
This actor can be called from HTTP as well as from Python.
"""
@@ -30,9 +30,10 @@ class MagicCounter:
def __init__(self, increment):
self.increment = increment

-def __call__(self, flask_request, base_number=None):
+def __call__(self, starlette_request, base_number=None):
if serve.context.web:
-base_number = int(flask_request.args.get("base_number", "0"))
+base_number = int(
+starlette_request.query_params.get("base_number", "0"))
return base_number + self.increment


9 changes: 5 additions & 4 deletions python/ray/serve/examples/echo_actor_batch.py
@@ -1,6 +1,6 @@
"""
Example actor that adds an increment to a number. This number can
-come from either web (parsing Flask request) or python call.
+come from either web (parsing Starlette request) or python call.
The queries incoming to this actor are batched.
This actor can be called from HTTP as well as from Python.
"""
@@ -31,12 +31,13 @@ def __init__(self, increment):
self.increment = increment

@serve.accept_batch
-def __call__(self, flask_request_list, base_number=None):
+def __call__(self, starlette_request_list, base_number=None):
# batch_size = serve.context.batch_size
if serve.context.web:
result = []
-for flask_request in flask_request_list:
-base_number = int(flask_request.args.get("base_number", "0"))
+for starlette_request in starlette_request_list:
+base_number = int(
+starlette_request.query_params.get("base_number", "0"))
result.append(base_number)
return list(map(lambda x: x + self.increment, result))
else:
2 changes: 1 addition & 1 deletion python/ray/serve/examples/echo_batching.py
@@ -11,7 +11,7 @@ def __init__(self, increment):
self.increment = increment

@serve.accept_batch
-def __call__(self, flask_request, base_number=None):
+def __call__(self, starlette_request, base_number=None):
# __call__ fn should preserve the batch size
# base_number is a python list

6 changes: 3 additions & 3 deletions python/ray/serve/examples/echo_full.py
@@ -12,8 +12,8 @@

# a backend can be a function or class.
# it can be made to be invoked from web as well as python.
-def echo_v1(flask_request):
-response = flask_request.args.get("response", "web")
+def echo_v1(starlette_request):
+response = starlette_request.query_params.get("response", "web")
return response


@@ -32,7 +32,7 @@ def echo_v1(flask_request):


# We can also add a new backend and split the traffic.
-def echo_v2(flask_request):
+def echo_v2(starlette_request):
# magic, only from web.
return "something new"
