
v0.2.0: Multi-LLM Serving, Infinity Backend

@hiyouga released this 03 Feb 10:21

New features

  • Support deploying multiple language models behind a single unified server
  • Use the Infinity backend for the embedding service (a client-side sketch follows this list)
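
As a rough illustration only: assuming the unified server exposes an OpenAI-compatible API, a client could address the different deployed language models and the Infinity-backed embedding model through the `model` field of each request. The base URL, API key, and model names below are placeholders, not names from this release.

```python
# Sketch only: assumes an OpenAI-compatible endpoint on the unified server.
# Base URL, API key, and model names are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-placeholder")

# Route a chat request to one of the deployed language models by name.
chat = client.chat.completions.create(
    model="model-a",  # hypothetical name of a deployed LLM
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)

# Route an embedding request to the Infinity-backed embedding model.
emb = client.embeddings.create(
    model="embedding-model",  # hypothetical name of the embedding model
    input="Hello!",
)
print(len(emb.data[0].embedding))
```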

What's Changed

  • Bump the vLLM version to v0.3.0