🚀 Feature

We should enable serving a model through a spec, without having to implement it manually in decode_request and encode_response. A spec (there could be more than one) would:

- expose a route
- implement specific ways of decoding requests and encoding responses
- require the API to expose certain kinds of information (e.g. tokens used)

in a way that is pluggable at the LitServer level (spec=OpenAISpec) and independent from the API implementation itself. A sketch of what such a spec interface could look like follows.
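A minimal sketch of the pluggable interface this implies. The LitSpec name and its hooks are hypothetical, not an existing LitServe API; the point is that the route and codec logic travel together, separate from the model code:

```python
from abc import ABC, abstractmethod
from typing import Any


class LitSpec(ABC):
    """Hypothetical spec interface: bundles a route with its codec logic."""

    @property
    @abstractmethod
    def route(self) -> str:
        """Path this spec exposes, e.g. "/v1/chat/completions"."""

    @abstractmethod
    def decode_request(self, request: dict) -> Any:
        """Convert a protocol-specific request body into model inputs."""

    @abstractmethod
    def encode_response(self, output: Any) -> dict:
        """Convert model output into a protocol-specific response body."""
```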
Motivation
We want to make it seamless for users to expose a model using one or more standard specs.
Pitch
I define a LitAPI subclass, call LitServer(api, spec=OpenAISpec, ...), and get a v1/chat/completions endpoint that behaves like an OpenAI-compatible endpoint.
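As a sketch of that flow (the spec= argument and OpenAISpec are what this issue proposes, not shipped APIs; only the LitAPI subclassing and LitServer construction exist today):

```python
import litserve as ls


class ChatAPI(ls.LitAPI):
    def setup(self, device):
        # stand-in model; a real API would load weights here
        self.model = lambda prompt: f"echo: {prompt}"

    def predict(self, prompt):
        # no decode_request/encode_response needed: the spec supplies them
        return self.model(prompt)


if __name__ == "__main__":
    # spec=OpenAISpec is the proposed argument; it would register
    # v1/chat/completions and handle OpenAI-style decoding/encoding
    server = ls.LitServer(ChatAPI(), spec=OpenAISpec)  # OpenAISpec: proposed, not yet defined
    server.run(port=8000)
```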
Alternatives
We subclass LitServer and LitAPI, but this wouldn't compose cleanly down the road with other pieces we want to factor out (e.g. KV cache management).