-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add output evaluation for guardrails #332
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
|
@lvliang-intel @letonghan This PR should be ready for review. Is there a benchmark for latency with guardrails? I would love to measure the potential effect. |
Signed-off-by: Tyler Wilbers <[email protected]>
8e6a65d
to
4871435
Compare
Signed-off-by: Tyler Wilbers <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
* ChatQnA chinese version Signed-off-by: Yue, Wenjiao <[email protected]> * format chinese response * update chinese format response * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Yue, Wenjiao <[email protected]> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* add single query input/output guardrails Signed-off-by: Tyler Wilbers <[email protected]> * removed comment Signed-off-by: Tyler Wilbers <[email protected]> --------- Signed-off-by: Tyler Wilbers <[email protected]>
* add single query input/output guardrails Signed-off-by: Tyler Wilbers <[email protected]> * removed comment Signed-off-by: Tyler Wilbers <[email protected]> --------- Signed-off-by: Tyler Wilbers <[email protected]> Signed-off-by: BaoHuiling <[email protected]>
* add single query input/output guardrails Signed-off-by: Tyler Wilbers <[email protected]> * removed comment Signed-off-by: Tyler Wilbers <[email protected]> --------- Signed-off-by: Tyler Wilbers <[email protected]> Signed-off-by: siddhivelankar23 <[email protected]>
* add single query input/output guardrails Signed-off-by: Tyler Wilbers <[email protected]> * removed comment Signed-off-by: Tyler Wilbers <[email protected]> --------- Signed-off-by: Tyler Wilbers <[email protected]>
Description
This PR attempts to reduce latency by allowing safeguard microservice to be at end of DAG and process safety of both input and output with single query. This means you can structure the DAG as follows to reduce latency of RAG flow:
Currently our implementation only supports single input safeguarding. But the chat Messages API (
v1/chat/completion
) will allow for templating for both "user" inputs and "assistant" output. with a single query. This PR adds the ability to send a list of both "user" and "assistant" messages for safeguarding. Since LLM outputs both "prompt and "text" withGeneratedDoc
this means you can feed output directly into guardrails microservice.However it maintains backwards compatibility with following DAG:
It also can be queried with just text:
Issues
n/a
.Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
Changed to
hugingface_hub<=0.24.0.
becauseHugingFaceEndpoint
does it assigns bothendpoint_url
orrepo_id
toInferenceClient.model
attribute. But this is a bug sincehugingface_hub>0.24.0.
introduces abase_url
kwarg forInferenceClient
to be used.Tests
Ran the following
Output: