
Streamed responses incompatible with multiple choices (n>1) #26719

Open
sabrenner opened this issue Sep 20, 2024 · 2 comments
Labels: 🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature), investigate

Comments

@sabrenner

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-3.5-turbo-0125", n=3)  # n=3 requests three completions per prompt

parser = StrOutputParser()
chain = model | parser

for chunk in chain.stream(input="tell me a joke about chickens"):
    print(chunk)

Chunks are sometimes printed in a reasonable order, but at other times the ordering appears nondeterministic.
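One way to see this is to stream from the model directly, without the parser, and inspect the raw chunks. A minimal sketch (same model setup as above):

for chunk in model.stream("tell me a joke about chickens"):
    # Each AIMessageChunk carries only text content and metadata;
    # nothing identifies which of the n=3 choices a chunk came from.
    print(repr(chunk.content), chunk.response_metadata)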

Error Message and Stack Trace (if applicable)

No response

Description

I'm trying to determine whether the LangChain streaming API works when multiple choices are specified on the model. I expected each returned chunk to include some kind of index, but I was not able to see or use one. Since chunks seem to be yielded in a non-deterministic order, I'm not sure how to consume a streamed response from a chat model or LLM (OpenAI, for example) where we specify n>1 as a general config on the model or LLM. Specifically:

  1. In general, is indexing for chunks supported, or planned to be supported? Or is there another workaround for this? (One possible workaround is sketched after this list.)
  2. It does seem that chunks are yielded non-deterministically. Is that accurate?
  3. (langchain_openai specific) I took a look at the source code and noticed that only the first choice is taken from each chunk. Is this intentional? If so, is there a reason the other choices are discarded?
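As a point of comparison, streaming with the openai SDK directly does expose a per-choice index, which makes it possible to group deltas back into complete choices. A minimal sketch of that workaround, assuming OPENAI_API_KEY is set in the environment:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
stream = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[{"role": "user", "content": "tell me a joke about chickens"}],
    n=3,
    stream=True,
)

# Each streamed chunk tags its choices with an index (0..n-1), so the
# deltas can be grouped back into per-choice completions.
texts: dict[int, list[str]] = {}
for chunk in stream:
    for choice in chunk.choices:
        texts.setdefault(choice.index, []).append(choice.delta.content or "")

for index, parts in sorted(texts.items()):
    print(f"choice {index}: {''.join(parts)}")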

I also tried checking whether the result emitted with the on_chat_model_end event had the correct generations. However, it doesn't look like it; the content of all the choices is interleaved into a single message:

{'event': 'on_chat_model_end', 'data': {'output': AIMessageChunk(content="SureSureSure,,, here's here's here's a a a classic chicken classic one joke chicken for for joke you you for:\n\n:\n\n youWhyWhy:\n\n did didWhy the the did chicken chicken the chicken join a band?\n\n go to the seance?\n\n go to the seance?\n\nToToBecause it talk talk had to to the the the drum other othersticks side side!!! 🥁 🐔✨🐔", additional_kwargs={}, response_metadata={'finish_reason': 'stopstopstop', 'model_name': 'gpt-4o-2024-05-13gpt-4o-2024-05-13gpt-4o-2024-05-13', 'system_fingerprint': 'fp_e375328146fp_e375328146fp_e375328146'}, id='run-5b6e89fc-be08-4b8e-984f-a9a71b974e7a'), 'input': {'messages': [[HumanMessage(content='tell me a joke about chickens', additional_kwargs={}, response_metadata={})]]}}, 'run_id': '5b6e89fc-be08-4b8e-984f-a9a71b974e7a', 'name': 'ChatOpenAI', 'tags': ['seq:step:1'], 'metadata': {'ls_provider': 'openai', 'ls_model_name': 'gpt-4o', 'ls_model_type': 'chat', 'ls_temperature': 0.7, 'ls_max_tokens': 50}, 'parent_ids': ['56a188ee-d914-46e6-ba9d-568bfb53c167']}

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 23.6.0
Python Version: 3.10.13 [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.3.2
langchain: 0.3.0
langsmith: 0.1.125
langchain_input_error: Installed. No version info available.
langchain_openai: 0.2.0
langchain_stream: Installed. No version info available.
langchain_text_splitters: 0.3.0
langchain_tools: Installed. No version info available.
langgraph: Installed. No version info available.

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.10.5
async-timeout: 4.0.3
httpx: 0.27.0
jsonpatch: 1.33
numpy: 1.26.4
openai: 1.47.0
orjson: 3.10.7
packaging: 23.2
pydantic: 2.9.2
PyYAML: 6.0.2
requests: 2.32.3
SQLAlchemy: 2.0.32
tenacity: 8.5.0
tiktoken: 0.7.0
typing-extensions: 4.12.2

@langcarl langcarl bot added the investigate label Sep 20, 2024
@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Sep 20, 2024
@keenborder786
Contributor

  1. I don't understand what you mean by indexing. Since generation happens non-deterministically on the LLM provider's side, I doubt we can do indexing.
  2. Yes.
  3. Yes, because the first choice is the best one.

@sabrenner
Author

Hi @keenborder786, thanks for your answers to these questions. For indexing, specifically for OpenAI, I'm referring to this spec for their streamed responses API, which specifies a choice index for each choice in a given chunk. I'm unsure whether other partner libraries have this in their API, but would it be possible to surface this index in langchain-openai?
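For reference, the abridged shape of one streamed chunk from that spec, written out as a Python dict (values here are illustrative):

# Abridged chat.completion.chunk shape per the OpenAI streaming spec.
chunk = {
    "id": "chatcmpl-...",
    "object": "chat.completion.chunk",
    "choices": [
        # "index" identifies which of the n choices this delta extends.
        {"index": 2, "delta": {"content": " the"}, "finish_reason": None},
    ],
}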
