Can TensorRT-LLM run the prefill stage and the decode stage on separate GPUs or separate nodes with user-defined configuration, and if so, how?
#2235
Open
GGBond8488 opened this issue
Sep 18, 2024
· 3 comments
Examples of this approach: vllm-project/vllm#2809 and https://github.com/LLMServe/DistServe have already done it.
Reference: https://arxiv.org/pdf/2311.18677