feat(langchain): add support for streamed calls #10672
Conversation
Datadog Report: Branch report: ✅ 0 Failed, 592 Passed, 604 Skipped, 18m 53.33s total duration (17m 40.25s time saved)

Benchmarks: Benchmark execution time: 2024-09-26 14:50:24. Comparing candidate commit a96fff2 in PR branch: found 0 performance improvements and 0 performance regressions. Performance is the same for 371 metrics; 53 unstable metrics.
Great work! I added a few comments about styling (avoiding nesting when possible, for both readability and, ironically, line count) and some clarification questions, but this is great. Awesome job on setting up a test fixture instead of being bogged down by vcrpy!
Looks good, mostly clarification questions! For easier tracking, can we add a limitations section to this PR description (e.g. n>1 response tagging, astream_events())? Thanks!
Two last questions; I'll approve to unblock you, but please address them (I'll defer to your judgement). Thanks!
## What does this PR do?

Builds off of #10672, extending support for `llm.stream`, `chat_model.stream`, and `chain.stream` to LLM Observability. This PR tags those spans appropriately for LLM Observability and marks them to submit to LLM Observability.

## Checklist
- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
## What does this PR do?

Adds support for `.(a)stream(...)` calls on LangChain LCEL chains, chat models, and completion (LLM) models. It accomplishes this by:
- starting a span whenever a `stream` function is called, and finishing it once the stream is fully consumed
- concatenating the streamed chunks to record the final output on the span

There is one caveat to this: it's possible the last step in a chain's `stream` call is a `JSONOutputParser`. In this case, the stream is already concatenated for us, and we use that result instead.
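To make the caveat concrete, here is a minimal sketch (the model name and prompt are illustrative, and it assumes `langchain-openai` is installed with an API key configured): LangChain's `JsonOutputParser` streams progressively more complete parsed objects rather than additive text deltas, so the final chunk already holds the whole output.

```python
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Return a JSON object describing {topic}.")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | JsonOutputParser()

final = None
for chunk in chain.stream({"topic": "streaming"}):
    # Each chunk is the parsed-so-far dict, not an additive text delta.
    final = chunk

# The last chunk is already the complete parsed output, so an integration
# can use it directly instead of joining text chunks itself.
print(final)
```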
A few additional notes:
- The async stream methods, such as `astream`, aren't actually async functions; they just return async generators. This is reflected in the shared patching functions and test snapshots.
- The sync and async cases are handled by a shared helper, `shared_stream`, which returns a compatible iterable for sync and async stream managers. It utilizes `on_span_started` and `on_span_finished` functions to coordinate the tags to add in the different cases (chain, chat, or llm); see the sketch after this list.
- Because the `stream` methods do not invoke the underlying `generate` methods we trace, there is no easy path for code re-use. Instead, I tried to mimic the tags we add in the relevant `chain.invoke`, `model.generate`, and `llm.invoke` patched functions.
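The following is a hypothetical sketch of that pattern, not the actual ddtrace implementation (the `Span` class and tag names here are placeholders): one helper services both sync and async streams, and the `on_span_started`/`on_span_finished` callbacks supply the case-specific tagging.

```python
class Span:
    """Stand-in for a tracer span; illustrative only."""

    def __init__(self, name):
        self.name = name
        self.tags = {}

    def set_tag(self, key, value):
        self.tags[key] = value

    def finish(self):
        print(f"finished {self.name!r} with tags {self.tags}")


def shared_stream(make_stream, span_name, on_span_started, on_span_finished, is_async):
    """Wrap a sync or async stream, firing callbacks at start and completion."""
    span = Span(span_name)
    on_span_started(span)

    if is_async:
        # Mirrors the note above: `astream` is not a coroutine; it simply
        # returns an async generator, so the wrapper returns one too.
        async def _agen():
            chunks = []
            try:
                async for chunk in make_stream():
                    chunks.append(chunk)
                    yield chunk
            finally:
                on_span_finished(span, chunks)
                span.finish()

        return _agen()

    def _gen():
        chunks = []
        try:
            for chunk in make_stream():
                chunks.append(chunk)
                yield chunk
        finally:
            on_span_finished(span, chunks)
            span.finish()

    return _gen()


# Case-specific callbacks (here, chat-model-flavored) decide which tags to set.
def started(span):
    span.set_tag("langchain.request.type", "chat_model")


def finished(span, chunks):
    span.set_tag("langchain.response.content", "".join(chunks))


for piece in shared_stream(lambda: iter(["Hel", "lo"]), "chat.stream", started, finished, is_async=False):
    pass
```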
Note: the version of `vcrpy` we pinned for reduced flakiness did not like streamed calls made through the LangChain library. As such, I introduced a new fixture that returns a stub for the HTTP transport the `OpenAI` client uses. The client with that transport specified can then be used by the `langchain_openai` instance. This approach uses text files containing just the response data, which is why the "cassettes" added for these tests aren't in the usual `yaml` format; a sketch of the idea follows.
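Here is a minimal sketch of that fixture idea (the fixture name, file path, and model are illustrative, and it assumes `httpx.MockTransport` plus `ChatOpenAI`'s `http_client` parameter; it is not the PR's exact fixture): the stubbed transport replays a recorded response body, so no real HTTP call is made.

```python
import httpx
import pytest
from langchain_openai import ChatOpenAI


@pytest.fixture
def streamed_chat_model():
    # The "cassette" is a plain text file holding the raw streamed (SSE)
    # response body, rather than a vcrpy yaml file.
    with open("tests/contrib/langchain/cassettes/chat_streamed.txt", "rb") as f:
        body = f.read()

    def handler(request: httpx.Request) -> httpx.Response:
        # Replay the recorded response body for every outgoing request.
        return httpx.Response(
            200, content=body, headers={"Content-Type": "text/event-stream"}
        )

    http_client = httpx.Client(transport=httpx.MockTransport(handler))
    return ChatOpenAI(model="gpt-3.5-turbo", api_key="testing", http_client=http_client)
```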
## Limitations

There are a couple of limitations that need to be noted:
- Tagging of `n > 1` responses is not handled here (see the review comment above).
- We do not officially support `astream_events` in this PR. Although it uses `astream` under the hood, and it is tested that `astream_events` does not conflict with the support added here, official tracing support will be added in a follow-up PR.

## Checklist
- [x] PR author has checked that all the criteria below are met
  - The PR description includes an overview of the change
  - The PR description articulates the motivation for the change
  - The change includes tests OR the PR description describes a testing strategy
  - The PR description notes risks associated with the change, if any
  - Newly-added code is easy to change
  - The change follows the [library release note guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
  - The change includes or references documentation updates if necessary
  - Backport labels are set (if [applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
  - Title is accurate
  - All changes are related to the pull request's stated goal
  - Avoids breaking [API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces) changes
  - Testing strategy adequately addresses listed risks
  - Newly-added code is easy to change
  - Release note makes sense to a user of the library
  - If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  - Backport labels are set in a manner that is consistent with the [release branch maintenance policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)