Streaming for serving with chat's generate function #1426
Conversation
This now also works:

```python
import requests
import litserve

print("LitServe:", litserve.__version__)

url = "http://127.0.0.1:8000/predict"
resp = requests.post(url, json={"prompt": "Hello world"}, stream=True)

for line in resp.iter_lines():
    if line:
        print(line.decode("utf-8"))
```

The only remaining issue is that the stop token terminates everything. Otherwise, it works fine.
@aniketmaurya Are there any best practices or examples for piecing the streamed outputs together into a single string? It looks obvious at first glance, but it's harder than I thought: I've tried many approaches, each taking quite a few lines of code, and I sometimes get invalid-JSON errors.
Thanks @aniketmaurya, I can now do:

```python
import requests
import litserve
import json

print("LitServe:", litserve.__version__)

url = "http://127.0.0.1:8000/predict"
resp = requests.post(url, json={"prompt": "Hello world"}, stream=True)

for line in resp.iter_lines():
    if line:
        print(json.loads(line)["output"], end="")
```

and it works perfectly!
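For reference, the same per-line decoding can also accumulate the chunks into a single string instead of printing them. A minimal sketch, assuming the server emits one JSON object per line with an `"output"` key (the simulated byte lines below stand in for `resp.iter_lines()`):

```python
import json

# Simulated streamed lines; a real run would iterate resp.iter_lines()
# from a requests.post(..., stream=True) response instead.
lines = [b'{"output": "Hello"}', b'{"output": " world"}', b'{"output": "!"}']

chunks = []
for line in lines:
    if line:  # skip keep-alive blank lines
        chunks.append(json.loads(line)["output"])

full_text = "".join(chunks)
print(full_text)  # Hello world!
```

Decoding each line separately with `json.loads` avoids the invalid-JSON errors you get when several streamed chunks are concatenated before parsing.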
The tests are failing because there hasn't been a LitServe 0.1.1 release yet. It's fine (no rush); I just think we should wait until there's a new release.
@lantiga do we want to make a release anytime soon? LitServe has had a lot of bug fixes, in addition to the OpenAI spec, since the last release.
hi @rasbt, can we add a … EDIT: Actually, I was able to import it even without that. I got an import error when I did an editable install of LitGPT.
Thanks for the suggestion, it absolutely makes sense to add it, @aniketmaurya . Just did. |
This is an alternative to #1424 using the `generate` function from chat.