This repository has been archived by the owner on May 28, 2024. It is now read-only.
I was trying to run Llama 2 on a machine with a V100 GPU.
I ran `aviary run --model ~/models/continuous_batching/meta-llama--Llama-2-7b-chat-hf.yaml` inside the Docker container and got the following log output:
```
(HTTPProxyActor pid=2448) INFO 2023-08-22 23:38:44,774 http_proxy 172.17.0.2 http_proxy.py:904 - Proxy actor f5a0692e60801e1b0ef45a8301000000 starting on node 57297f3255438333c74bdc7b75d3fd3aa4b1c48e7bdcf6d07db72a41.
[INFO 2023-08-22 23:38:44,824] api.py: 320 Started detached Serve instance in namespace "serve".
(HTTPProxyActor pid=2448) INFO: Started server process [2448]
[INFO 2023-08-22 23:38:44,951] api.py: 300 Connecting to existing Serve app in namespace "serve". New http options will not be applied.
(ServeController pid=2420) INFO 2023-08-22 23:38:44,942 controller 2420 deployment_state.py:1319 - Deploying new version of deployment meta-llama--Llama-2-7b-chat-hf_meta-llama--Llama-2-7b-chat-hf.
(ServeController pid=2420) INFO 2023-08-22 23:38:45,046 controller 2420 deployment_state.py:1586 - Adding 1 replica to deployment meta-llama--Llama-2-7b-chat-hf_meta-llama--Llama-2-7b-chat-hf.
(ServeController pid=2420) INFO 2023-08-22 23:38:45,083 controller 2420 deployment_state.py:1319 - Deploying new version of deployment router_Router.
(ServeController pid=2420) INFO 2023-08-22 23:38:45,187 controller 2420 deployment_state.py:1586 - Adding 2 replicas to deployment router_Router.
(ServeReplica:router_Router pid=2480) There was a problem when trying to write in your cache folder (/home/jupyter/cache/data/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
(autoscaler +15s) Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.
(autoscaler +15s) Error: No available node types can fulfill resource request {'accelerator_type_a10': 0.01, 'CPU': 1.0}. Add suitable node types to this cluster to resolve this issue.
(ServeController pid=2420) WARNING 2023-08-22 23:39:15,112 controller 2420 deployment_state.py:1889 - Deployment "meta-llama--Llama-2-7b-chat-hf_meta-llama--Llama-2-7b-chat-hf" has 1 replicas that have taken more than 30s to be scheduled. This may be caused by waiting for the cluster to auto-scale, or waiting for a runtime environment to install. Resources required for each replica: {"accelerator_type_a10": 0.01, "CPU": 1}, resources available: {"CPU": 14.0}.
(ServeReplica:router_Router pid=2479) There was a problem when trying to write in your cache folder (/home/jupyter/cache/data/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
(autoscaler +50s) Error: No available node types can fulfill resource request {'accelerator_type_a10': 0.01, 'CPU': 1.0}. Add suitable node types to this cluster to resolve this issue.
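For what it's worth, the scheduler error suggests the deployment is requesting a custom Ray resource named `accelerator_type_a10`, which no node in the cluster advertises, rather than the V100 itself being rejected. Assuming the model YAML follows the usual aviary pattern of pinning replicas to an accelerator type via `scaling_config.resources_per_worker` (an assumption; the exact field names in that file may differ), removing or changing that entry might let the replica schedule on the V100 node:

```yaml
# Hypothetical sketch of the relevant part of the model YAML.
# The original config presumably pins replicas to A10 nodes with a
# custom resource; dropping that constraint (or renaming it to one
# the node actually advertises) lets Ray place the replica here.
scaling_config:
  num_workers: 1
  num_gpus_per_worker: 1
  resources_per_worker: {}   # was: {accelerator_type_a10: 0.01}
```

Alternatively, since Ray custom resources are arbitrary labels, starting the head node with something like `ray start --head --resources='{"accelerator_type_a10": 1}'` should satisfy the request without editing the model config, at the cost of mislabeling the node.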
Is aviary incompatible with V100 GPUs?