
Add output evaluation for guardrails #332

Merged
merged 4 commits into opea-project:main on Aug 13, 2024

Conversation

tybrs
Contributor

@tybrs tybrs commented Jul 21, 2024

Description

This PR attempts to reduce latency by allowing the guardrails microservice to sit at the end of the DAG and evaluate the safety of both input and output with a single query. This means you can structure the DAG as follows to reduce the latency of the RAG flow:

        self.megaservice.add(embedding).add(retriever).add(rerank).add(llm).add(guardrails)
        self.megaservice.flow_to(embedding, retriever)
        self.megaservice.flow_to(retriever, rerank)
        self.megaservice.flow_to(rerank, llm)
        self.megaservice.flow_to(llm, guardrails)
        self.gateway = ChatQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)

Currently our implementation only supports safeguarding a single input. But the chat Messages API (v1/chat/completions) allows templating both "user" inputs and "assistant" outputs in a single query. This PR adds the ability to send a list of both "user" and "assistant" messages for safeguarding. Since the LLM outputs both "prompt" and "text" in a GeneratedDoc, you can feed its output directly into the guardrails microservice:

curl http://localhost:9090/v1/guardrails \
  -X POST \
  -d '{
    "prompt" : "How do you buy a tiger in the US",
    "text" : "Yes! Buy a tiger in the US.",
    "parameters":{"max_new_tokens":32}
  }' \
  -H 'Content-Type: application/json'
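Conceptually, the mapping from those two fields onto chat-style messages could look like the sketch below. The function name and exact field handling are illustrative, not the PR's actual code:

```python
def to_messages(doc: dict) -> list:
    """Map a guardrails request body onto chat-style messages.

    A body carrying both "prompt" and "text" (the shape of the LLM's
    GeneratedDoc) yields a user/assistant pair so the safety model can
    judge input and output in one query; a body with only "text" keeps
    the original input-only behavior.
    """
    if "prompt" in doc:
        return [
            {"role": "user", "content": doc["prompt"]},
            {"role": "assistant", "content": doc["text"]},
        ]
    return [{"role": "user", "content": doc["text"]}]
```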

However, it maintains backwards compatibility with the following DAG:

        self.megaservice.add(guardrail_in).add(embedding).add(retriever).add(rerank).add(llm).add(guardrail_out)
        self.megaservice.flow_to(guardrail_in, embedding)
        self.megaservice.flow_to(embedding, retriever)
        self.megaservice.flow_to(retriever, rerank)
        self.megaservice.flow_to(rerank, llm)
        self.megaservice.flow_to(llm, guardrail_out)
        self.gateway = ChatQnAGateway(megaservice=self.megaservice, host="0.0.0.0", port=self.port)

The guardrails microservice can also be queried with text alone:

curl http://localhost:9090/v1/guardrails \
  -X POST \
  -d '{
    "text" : "How do you buy a tiger in the US",
    "parameters":{"max_new_tokens":32}
  }' \
  -H 'Content-Type: application/json'

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

Pinned huggingface_hub<=0.24.0 because HuggingFaceEndpoint assigns either endpoint_url or repo_id to the InferenceClient.model attribute. This becomes a bug with huggingface_hub>0.24.0, which introduces a base_url kwarg on InferenceClient that should be used instead.
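The pin can be captured in the component's requirements file (path and surrounding entries illustrative):

```
huggingface_hub<=0.24.0
```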

Tests

Ran the following:

docker run -d --name="guardrails-tgi-server" -p 9090:9090 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e SAFETY_GUARD_ENDPOINT=${SAFETY_GUARD_ENDPOINT} -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} opea/guardrails-tgi:latest
curl http://localhost:9090/v1/guardrails   -X POST   -d '{
    "messages":[{"role": "user", "text" : "How do you buy a tiger in the US?"}],
    "parameters":{"max_new_tokens":32}
  }'   -H 'Content-Type: application/json'

Output:

{"downstream_black_list":[".*"],"id":"6e70dcfa9db13087fc390d84c7869e7a","text":"Violated policies: Violent Crimes, please check your input."}
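The "downstream_black_list" field appears to tell the caller which downstream nodes to skip when the input is flagged. A minimal sketch of how a megaservice might interpret it (helper and node names here are hypothetical, not the project's actual API):

```python
import re

def allowed_downstream(nodes: list, black_list: list) -> list:
    """Keep only downstream nodes whose names match no blacklist pattern.

    The guardrails response above returns [".*"], which matches every
    node name, so all downstream processing is skipped and the policy
    violation message is returned directly.
    """
    return [
        node for node in nodes
        if not any(re.fullmatch(pattern, node) for pattern in black_list)
    ]
```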


codecov bot commented Jul 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
comps/cores/proto/docarray.py 100.00% <100.00%> (ø)

@ashahba ashahba self-requested a review July 23, 2024 17:05
@tybrs
Contributor Author

tybrs commented Aug 6, 2024

@lvliang-intel @letonghan This PR should be ready for review. Is there a benchmark for latency with guardrails? I would love to measure the potential effect.

Collaborator

@ashahba ashahba left a comment


LGTM!

comps/guardrails/langchain/guardrails_tgi_gaudi.py (review comment, outdated, resolved)
Collaborator

@ashahba ashahba left a comment


LGTM!

lkk12014402 pushed a commit that referenced this pull request Aug 8, 2024
* ChatQnA chinese version

Signed-off-by: Yue, Wenjiao <[email protected]>

* format chinese response

* update chinese format response

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Yue, Wenjiao <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@ashahba ashahba merged commit 62ca5bc into opea-project:main Aug 13, 2024
7 checks passed
BaoHuiling pushed a commit to siddhivelankar23/GenAIComps that referenced this pull request Aug 15, 2024
* add single query input/output guardrails

Signed-off-by: Tyler Wilbers <[email protected]>

* removed comment

Signed-off-by: Tyler Wilbers <[email protected]>

---------

Signed-off-by: Tyler Wilbers <[email protected]>
BaoHuiling pushed a commit to siddhivelankar23/GenAIComps that referenced this pull request Aug 15, 2024
tileintel pushed a commit to siddhivelankar23/GenAIComps that referenced this pull request Aug 22, 2024
sharanshirodkar7 pushed a commit to predictionguard/pg-GenAIComps that referenced this pull request Sep 3, 2024
3 participants