
Streamed responses incompatible with multiple choices (n>1) #26719

Open
sabrenner opened this issue Sep 20, 2024 · 2 comments
Labels: 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature), investigate

Comments

@sabrenner

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo-0125", n=3)  # n=3 requests three completions per prompt

parser = StrOutputParser()
chain = model | parser

for chunk in chain.stream(input="tell me a joke about chickens"):
    print(chunk)

Chunks are sometimes printed in a reasonable order, but at other times the ordering appears nondeterministic.
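One way to see this is to stream from the model directly, without the parser, and inspect the raw chunks. A minimal sketch (same model setup as above):

for chunk in model.stream("tell me a joke about chickens"):
    # Each AIMessageChunk carries only text content and metadata;
    # nothing identifies which of the n=3 choices a chunk came from.
    print(repr(chunk.content), chunk.response_metadata)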

Error Message and Stack Trace (if applicable)

No response

Description

I'm trying to determine whether the LangChain streaming API works when multiple choices are specified on the model. I expected each returned chunk to include some kind of index, but I was not able to see or use one. Since chunks seem to be yielded in a non-deterministic order, I'm not sure how to consume a streamed response from a chat model or LLM (OpenAI, for example) where we specify n>1 as a general config on the model or LLM. Specifically:

  1. In general, is indexing for chunks supported, or planned to be supported? Or is there another workaround for this? (One possible workaround is sketched after this list.)
  2. It does seem that chunks are yielded non-deterministically. Is that accurate?
  3. (langchain_openai specific) I took a look at the source code and noticed that only the first choice is taken from each chunk. Is this intentional? If so, is there a reason the other choices are discarded?
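As a point of comparison, streaming with the openai SDK directly does expose a per-choice index, which makes it possible to group deltas back into complete choices. A minimal sketch of that workaround, assuming OPENAI_API_KEY is set in the environment:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
stream = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[{"role": "user", "content": "tell me a joke about chickens"}],
    n=3,
    stream=True,
)

# Each streamed chunk tags its choices with an index (0..n-1), so the
# deltas can be grouped back into per-choice completions.
texts: dict[int, list[str]] = {}
for chunk in stream:
    for choice in chunk.choices:
        texts.setdefault(choice.index, []).append(choice.delta.content or "")

for index, parts in sorted(texts.items()):
    print(f"choice {index}: {''.join(parts)}")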

I also tried checking whether the result emitted with the on_chat_model_end event had the correct generations. However, it doesn't look like it; the content of all the choices is interleaved into a single message:

{'event': 'on_chat_model_end', 'data': {'output': AIMessageChunk(content="SureSureSure,,, here's here's here's a a a classic chicken classic one joke chicken for for joke you you for:\n\n:\n\n youWhyWhy:\n\n did didWhy the the did chicken chicken the chicken join a band?\n\n go to the seance?\n\n go to the seance?\n\nToToBecause it talk talk had to to the the the drum other othersticks side side!!! 🥁 🐔✨🐔", additional_kwargs={}, response_metadata={'finish_reason': 'stopstopstop', 'model_name': 'gpt-4o-2024-05-13gpt-4o-2024-05-13gpt-4o-2024-05-13', 'system_fingerprint': 'fp_e375328146fp_e375328146fp_e375328146'}, id='run-5b6e89fc-be08-4b8e-984f-a9a71b974e7a'), 'input': {'messages': [[HumanMessage(content='tell me a joke about chickens', additional_kwargs={}, response_metadata={})]]}}, 'run_id': '5b6e89fc-be08-4b8e-984f-a9a71b974e7a', 'name': 'ChatOpenAI', 'tags': ['seq:step:1'], 'metadata': {'ls_provider': 'openai', 'ls_model_name': 'gpt-4o', 'ls_model_type': 'chat', 'ls_temperature': 0.7, 'ls_max_tokens': 50}, 'parent_ids': ['56a188ee-d914-46e6-ba9d-568bfb53c167']}

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.6.0
Python Version: 3.10.13 [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.3.2
langchain: 0.3.0
langsmith: 0.1.125
langchain_input_error: Installed. No version info available.
langchain_openai: 0.2.0
langchain_stream: Installed. No version info available.
langchain_text_splitters: 0.3.0
langchain_tools: Installed. No version info available.
langgraph: Installed. No version info available.

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.10.5
async-timeout: 4.0.3
httpx: 0.27.0
jsonpatch: 1.33
numpy: 1.26.4
openai: 1.47.0
orjson: 3.10.7
packaging: 23.2
pydantic: 2.9.2
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 2.0.32
tenacity: 8.5.0
tiktoken: 0.7.0
typing-extensions: 4.12.2

@langcarl langcarl bot added the investigate label Sep 20, 2024
@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Sep 20, 2024
@keenborder786
Contributor

  1. I don't understand what you mean by indexing. Since generation happens non-deterministically on the LLM provider's side, I doubt we can do indexing.
  2. Yes.
  3. Yes, because the first choice is the best one.

@sabrenner
Author

Hi @keenborder786, thanks for your answers to these questions. For indexing, specifically for OpenAI, I'm referring to this spec for their streamed responses API, which specifies a choice index for each choice in a given chunk. I'm unsure whether other partner libraries have this in their API, but would it be possible to surface this index in langchain-openai?
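For reference, the abridged shape of one streamed chunk from that spec, written out as a Python dict (values here are illustrative):

# Abridged chat.completion.chunk shape per the OpenAI streaming spec.
chunk = {
    "id": "chatcmpl-...",
    "object": "chat.completion.chunk",
    "choices": [
        # "index" identifies which of the n choices this delta extends.
        {"index": 2, "delta": {"content": " the"}, "finish_reason": None},
    ],
}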
