
Bug Fix: chat completions API calls need model_id #114

Merged
merged 16 commits into opea-project:main on Jun 21, 2024

Conversation

tybrs
Contributor

@tybrs tybrs commented May 30, 2024

Description

Users must pass the `model` parameter explicitly to use the `/v1/chat/completions` API:

```bash
curl http://localhost:8088/v1/chat/completions \
    -X POST \
    -d '{"model": "meta-llama/Meta-Llama-Guard-2-8B", "messages": [{"role": "user", "content": "Say this is a test!"}]}' \
    -H 'Content-Type: application/json'
```
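The same request can be issued from Python; a minimal sketch using only the standard library (the endpoint and model name mirror the curl example, and `build_chat_request` is an illustrative helper, not part of the codebase):

```python
import json
import urllib.request

def build_chat_request(model, text):
    """Build the JSON payload for a /v1/chat/completions call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    }

payload = build_chat_request("meta-llama/Meta-Llama-Guard-2-8B", "Say this is a test!")
req = urllib.request.Request(
    "http://localhost:8088/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send the request to a running TGI service.
```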

Using `ChatHuggingFace` in LangChain, this looks like the following:

```python
llm_guard_chat = ChatHuggingFace(llm=llm_guard, model_id=safety_guard_model)
llm_guard_chat.invoke([{"role": "user", "content": input.text}]).content
```

If you do not pass a `model_id`, `ChatHuggingFace` falls back to its `_resolve_model_id` method to determine one, but this does not work for locally deployed TGI services: it relies on the `huggingface_hub` `list_inference_endpoints` function, which only lists the inference endpoints registered to your Hugging Face account. This issue is tracked in langchain-ai/langchain#17779. The current implementation errors as follows:

```
llm_engine_hf = ChatHuggingFace(llm=llm_guard)
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_community/chat_models/huggingface.py", line 54, in __init__
    self._resolve_model_id()
  File "/usr/local/lib/python3.11/site-packages/langchain_community/chat_models/huggingface.py", line 158, in _resolve_model_id
    raise ValueError(
ValueError: Failed to resolve model_id Could not find model id for inference server provided: http://xx.xx.xx.xxx/
Make sure that your Hugging Face token has access to the endpoint.
```

This PR adds a workaround by introducing a `DEFAULT_MODEL` variable and by attempting to fetch the model id from the TGI server's `/info` endpoint. I also moved the `ChatHuggingFace` instantiation to server startup, which should reduce the latency of each API call by 1-2 seconds.
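The fallback logic can be sketched as follows (function and variable names here are illustrative, not necessarily the ones used in the PR): query the TGI server's `/info` endpoint for its `model_id`, and fall back to a default instead of raising when the server cannot be reached.

```python
import json
import urllib.request

DEFAULT_MODEL = "meta-llama/Meta-Llama-Guard-2-8B"  # placeholder default

def resolve_model_id(endpoint, default=DEFAULT_MODEL, timeout=2):
    """Try to read model_id from a TGI server's /info endpoint.

    Falls back to `default` when the server is unreachable or the
    response carries no model_id, instead of raising a ValueError
    like ChatHuggingFace._resolve_model_id does.
    """
    try:
        with urllib.request.urlopen(endpoint.rstrip("/") + "/info", timeout=timeout) as resp:
            info = json.load(resp)
        return info.get("model_id", default)
    except (OSError, ValueError):
        # URLError (unreachable host) is a subclass of OSError;
        # ValueError covers malformed JSON responses.
        return default
```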

Issues

n/a.

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)

Dependencies

langchain-community

Tests

  1. I ran two local TGI services for both LlamaGuard-7b and Meta-Llama-Guard-2-8B:

```bash
docker run -p 8087:80 -v $PWD/data:/data -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy ghcr.io/huggingface/text-generation-inference --model-id meta-llama/LlamaGuard-7b
docker run -p 8087:80 -v $PWD/data:/data -e HTTPS_PROXY=$https_proxy -e HTTP_PROXY=$https_proxy ghcr.io/huggingface/text-generation-inference --model-id meta-llama/Meta-Llama-Guard-2-8B
```

  2. Then I built and ran guardrails-tgi-server:

```bash
docker build -t opea/gen-ai-comps:guardrails-tgi-server --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/guardrails/langchain/docker/Dockerfile .
docker run -p 9090:9090 -e https_proxy=$https_proxy -e http_proxy=$http_proxy -e no_proxy=$no_proxy -e HUGGINGFACEHUB_API_TOKEN=$HUGGINGFACEHUB_API_TOKEN -e SAFETY_GUARD_ENDPOINT="http://localhost:8088" --network="host" opea/gen-ai-comps:guardrails-tgi-server
```

  3. Tested with the following curl:

```bash
$ curl http://localhost:9090/v1/guardrails -X POST -d '{"text": "i am going to kill you"}' -H 'Content-Type: application/json'
{"id":"e6a511430e2d5158d2923a4099502945","text":"Violated policies: Violent Crimes, please check your input."}
```

Tyler Wilbers and others added 2 commits May 29, 2024 17:38
@tybrs
Contributor Author

tybrs commented Jun 7, 2024

@lvliang-intel Thanks for the review. Should we assign another person for review?

@tybrs
Contributor Author

tybrs commented Jun 19, 2024

@dcmiddle and @lvliang-intel Comments addressed and conflict fixed. Thanks!

@tybrs tybrs requested a review from dcmiddle June 20, 2024 23:54
@lvliang-intel lvliang-intel requested review from chensuyue and zehao-intel and removed request for dcmiddle June 21, 2024 01:20
@lvliang-intel lvliang-intel merged commit 88a147d into opea-project:main Jun 21, 2024
7 checks passed
sharanshirodkar7 pushed a commit to sharanshirodkar7/GenAIComps that referenced this pull request Jul 9, 2024
* added default model

Signed-off-by: Tyler Wilbers <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* added instructions for enviornment variable

Signed-off-by: Tyler Wilbers <[email protected]>

* added bash to codeblock

Signed-off-by: Tyler Wilbers <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fixed typo

Signed-off-by: Tyler Wilbers <[email protected]>

---------

Signed-off-by: Tyler Wilbers <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: sharanshirodkar7 <[email protected]>
yogeshmpandey pushed a commit to yogeshmpandey/GenAIComps that referenced this pull request Jul 10, 2024
dwhitena pushed a commit to predictionguard/GenAIComps that referenced this pull request Jul 24, 2024
lkk12014402 pushed a commit that referenced this pull request Aug 8, 2024