Can TensorRT-LLM support separating the prefill stage and decode stage onto different GPUs or different nodes with a custom configuration, and if so, how? #2235

GGBond8488 · 2024-09-18T03:33:09Z

litaotju · 2024-09-30T13:30:27Z

DistServe is on our roadmap, and will be supported in the following versions

GGBond8488 · 2024-10-07T23:27:38Z

> DistServe is on our roadmap, and will be supported in the following versions

Thank you for your reply! I look forward to the future implementation of this feature, which will be a great improvement!

github-actions · 2024-11-07T02:01:29Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 15 days.

lfr-0531 assigned Shixiaowei02 Sep 20, 2024

lfr-0531 added the question (Further information is requested) and triaged (Issue has been triaged by maintainers) labels Sep 20, 2024

github-actions bot added the stale label Nov 7, 2024
