Streaming working with nnsight streaming #60

JadenFiotto-Kaufman · 2024-09-27T05:50:55Z

No description provided.

MichaelRipa · 2024-09-29T12:25:29Z

telemetry/metrics/gauge.py


    def __new__(cls, service: str):
        """Singleton pattern to ensure only one instance of the gauge per service."""
        if service not in cls._instances:
            instance = super(NDIFGauge, cls).__new__(cls)
            instance.service = service
            instance._gauge = instance._initialize_gauge()
-            if service != 'ray':  # Only initialize the network gauge if the service is not 'ray'
+            if (


Why this notation?

MichaelRipa · 2024-09-29T12:31:31Z

ray/deployments/base.py

 import torch
 from minio import Minio
 from pydantic import BaseModel, ConfigDict
 from torch.amp import autocast
-from torch.cuda import (max_memory_allocated, memory_allocated,
-                        reset_peak_memory_stats)
+from torch.cuda import (


It seems like you have specific formatting preferences (using parentheses, listing arguments vertically, using double quotes) that differ from how I normally do things. Maybe we should adopt a custom linter to standardize this and keep the code style consistent?

MichaelRipa · 2024-09-29T12:32:20Z

ray/deployments/base.py

@@ -45,6 +56,8 @@ def __init__(
            secure=False,
        )

+        self.sio = socketio.SimpleClient(reconnection_attempts=10)


Environment variable?

MichaelRipa · 2024-09-29T12:37:14Z

ray/deployments/base.py

+
+            self.sio.connect(
+                f"{self.api_url}?job_id={request.id}",
+                socketio_path="/ws/socket.io",


Environment variables? (socketio path and wait timeout)
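A minimal sketch of what this suggestion could look like, assuming the three hardcoded values from the diff (`reconnection_attempts=10`, `socketio_path="/ws/socket.io"`, `wait_timeout=10`) become defaults that the environment can override. The variable names `NDIF_SIO_RECONNECTION_ATTEMPTS`, `NDIF_SIO_PATH` and `NDIF_SIO_WAIT_TIMEOUT` and the helper `socketio_settings` are hypothetical; nothing in the PR defines them.

```python
import os
from typing import Mapping, Optional


def socketio_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect socket.io client settings, falling back to the values in the diff.

    The environment variable names below are assumptions for illustration only.
    """
    env = os.environ if env is None else env
    return {
        "reconnection_attempts": int(env.get("NDIF_SIO_RECONNECTION_ATTEMPTS", "10")),
        "socketio_path": env.get("NDIF_SIO_PATH", "/ws/socket.io"),
        "wait_timeout": int(env.get("NDIF_SIO_WAIT_TIMEOUT", "10")),
    }


if __name__ == "__main__":
    print(socketio_settings())
```

The resulting values could then be passed to the existing `socketio.SimpleClient(reconnection_attempts=...)` and `self.sio.connect(..., socketio_path=..., wait_timeout=...)` calls in place of the literals.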

MichaelRipa · 2024-09-29T12:47:36Z

services/api/src/app.py


-    await _blocking_response(response)
+@sm.on("stream_upload")
+async def stream_upload(session_id: str, value: Dict):


It doesn't seem like session_id is used here (or in many of the other sm.on handlers). Is this just because the session manager always passes session_id as the first argument?

MichaelRipa · 2024-09-29T12:49:10Z

telemetry/metrics/gauge.py

@@ -31,31 +42,50 @@ class NumericJobStatus(Enum):
        COMPLETED = 4
        LOG = 5
        ERROR = 6
+        STREAM = 7


We might want to add a status for "AUTHENTICATED" between "RECEIVED" and "APPROVED"

MichaelRipa · 2024-09-29T12:51:52Z

ray/deployments/request.py

+            socketio_path="/ws/socket.io",
+            transports=["websocket"],
+            wait_timeout=10,
+        )

    async def __call__(self, request: BackendRequestModel):


async def __call__(self, request: BackendRequestModel) -> None:

MichaelRipa · 2024-09-29T12:56:03Z

ray/deployments/distributed_model.py

@@ -205,17 +205,21 @@ def pre(self, request: BackendRequestModel):

        torch.distributed.barrier()

-    def post(self, request: BackendRequestModel, result: Any):
+    def post(self, *args, **kwargs):


The way you space things in this file is inconsistent with how you space things in the other files (e.g. base.py).

MichaelRipa · 2024-09-29T13:10:38Z

services/api/src/app.py

+    params = environ.get("QUERY_STRING")
+    params = dict(x.split("=") for x in params.split("&"))
+
+    if "job_id" in params:


I think params.keys() makes more sense from a readability standpoint. Also, what happens if job_id is not in params? Should it log an error or debug?
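For reference, a hedged sketch of one way to handle the missing-`job_id` case raised above. Note that the `dict(x.split("=") ...)` line in the diff raises if `QUERY_STRING` is absent or if any pair does not contain exactly one `=`. The sketch swaps in the standard library's `urllib.parse.parse_qs` and logs at debug level when `job_id` is missing; the helper name `job_id_from_environ` and the choice of log level are assumptions, not what the PR does.

```python
import logging
from typing import Mapping, Optional
from urllib.parse import parse_qs

logger = logging.getLogger(__name__)


def job_id_from_environ(environ: Mapping[str, str]) -> Optional[str]:
    """Return the job_id query parameter, or None (with a debug log) if absent."""
    # A missing QUERY_STRING is treated the same as an empty one.
    query = environ.get("QUERY_STRING") or ""
    # parse_qs tolerates pairs without "=" and values that contain "=".
    values = parse_qs(query).get("job_id")
    if not values:
        logger.debug("Connection without job_id in query string: %r", query)
        return None
    return values[0]


if __name__ == "__main__":
    print(job_id_from_environ({"QUERY_STRING": "job_id=abc&x=1"}))
```

Whether a missing `job_id` deserves a debug or an error log depends on whether connections without a job are expected; that is a design decision for the PR author.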

MichaelRipa · 2024-09-29T13:14:09Z

services/api/src/app.py

+@sm.on("stream_upload")
+async def stream_upload(session_id: str, value: Dict):
+
+    value_model = StreamValueModel(**value)


You create a pydantic BaseModel instance from the input value JSON just to create a model key? Isn't there a clearer and more efficient way of doing this?

MichaelRipa · 2024-09-29T13:24:57Z

ray/deployments/protocols.py

This whole file needs docstrings IMO, it is very abstract. Also, I feel like this should be in schema.

MichaelRipa · 2024-09-29T13:27:34Z

ray/deployments/distributed_model.py

Cannot comment directly because it is from a previous PR, but this shouldn't be hardcoded:

    @serve.deployment(
        ray_actor_options={"num_gpus": 1, "num_cpus": 2},
        health_check_timeout_s=1200,
    )

Additionally, I noticed the following in ModelDeployment.__init__():

extra_kwargs={"meta_buffers": False, "patch_llama_scan": False},

Is this 405b-specific, in that this logic will not allow you to deploy a non-Llama distributed model? If so, is there a way to have this passed in only for 405b? Otherwise, should this be indicated more clearly?
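One possible shape for the first point, sketched under assumptions: the decorator arguments are gathered by a small helper so the defaults from the existing code (`num_gpus=1`, `num_cpus=2`, `health_check_timeout_s=1200`) can be overridden per deployment. The helper `deployment_options` and the `NDIF_*` variable names are hypothetical and do not appear in the repository.

```python
import os
from typing import Mapping, Optional


def deployment_options(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build keyword arguments for the deployment, defaulting to the current literals.

    The environment variable names are illustrative assumptions.
    """
    env = os.environ if env is None else env
    return {
        "ray_actor_options": {
            "num_gpus": float(env.get("NDIF_NUM_GPUS", "1")),
            "num_cpus": float(env.get("NDIF_NUM_CPUS", "2")),
        },
        "health_check_timeout_s": int(env.get("NDIF_HEALTH_CHECK_TIMEOUT_S", "1200")),
    }


if __name__ == "__main__":
    print(deployment_options())
```

The returned dictionary could be unpacked into the existing decorator, for example `@serve.deployment(**deployment_options())`. The same approach might apply to `extra_kwargs`, but whether those flags are 405b-specific is a question only the author can answer.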

Streaming working with nnsight streaming

c59a88f

MichaelRipa reviewed Sep 29, 2024

Use redis instead of rabbitmq. Logic for handling websocket disconnects in model base.

afb6492

AdamBelfki3 self-requested a review September 30, 2024 14:30

MichaelRipa mentioned this pull request Sep 30, 2024

Docstrings #61

Open

JadenFiotto-Kaufman added 3 commits October 7, 2024 13:44

Set cuda env to be all gpus in base model deployment if no gpus specified

932d62a

API environment requires redis not pika (RMQ ) now

2f66083

Merge remote-tracking branch 'origin/dev' into streaming-protocol

c4605ab
