
Add support for LiteLLM #56

Closed
priamai opened this issue Nov 24, 2023 · 10 comments
Labels
enhancement New feature or request

Comments

@priamai

priamai commented Nov 24, 2023

Hi there,
I am getting familiar with the source code, and I want the ability to point the embedding and generation settings at an OpenAI proxy server: https://docs.litellm.ai/docs/simple_proxy

We would just need two environment settings: the first is the API key, which you already have, and the second is the URL the OpenAI class should point to.

openai.api_key = "anything"             # this can be anything, we set the key on the proxy
openai.api_base = "http://0.0.0.0:8000" # set api base to the proxy from step 1

Those are exactly the same settings you would use with the library directly; of course, normally the api_base points to Azure:

import openai

# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'

# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}

Let me know.
Cheers!

@thomashacker
Collaborator

Great idea! We'll add this

@thomashacker thomashacker added the enhancement New feature or request label Nov 28, 2023
@ishaan-jaff

I'm the maintainer of LiteLLM, let me know if you run into any issues

@thomashacker
Collaborator

@ishaan-jaff Thanks a lot! Just to clarify, we simply need two new environment variables to be able to use LiteLLM, correct? 😄

@priamai
Author

priamai commented Nov 30, 2023

We would also need another UI dropdown, because with LiteLLM you can dynamically choose which backend to call; even though the interface is OpenAI-compatible, you can load, for example, a Llama 2 model.
I think having that in the UI would be better than an environment variable, which would be annoying.

@priamai
Author

priamai commented Nov 30, 2023

Example below:

import openai # openai v1.0.0+
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")  # set proxy as base_url
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)

But the model can be anything defined by the user in their model list:

import os

model_list = [{ # list of model deployments
    "model_name": "gpt-3.5-turbo", # model alias 
    "litellm_params": { # params for litellm completion/embedding call 
        "model": "azure/chatgpt-v-2", # actual model name
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo", 
    "litellm_params": { # params for litellm completion/embedding call 
        "model": "azure/chatgpt-functioncalling", 
        "api_key": os.getenv("AZURE_API_KEY"),
        "api_version": os.getenv("AZURE_API_VERSION"),
        "api_base": os.getenv("AZURE_API_BASE")
    }
}, {
    "model_name": "gpt-3.5-turbo", 
    "litellm_params": { # params for litellm completion/embedding call 
        "model": "gpt-3.5-turbo", 
        "api_key": os.getenv("OPENAI_API_KEY"),
    }
}]

@thomashacker
Collaborator

Great, thanks for the info! We'll look into it

@priamai
Author

priamai commented Dec 4, 2023

Great, thanks for the info! We'll look into it

I suggest the following approach:
a) the user enables the LiteLLM proxy via an environment variable
b) your backend calls the LiteLLM API (GET /models) to list the available models
c) the frontend UI has a selection box for the available models (see the sketch after the endpoint list below)

API:

Server Endpoints
POST /chat/completions - chat completions endpoint to call 100+ LLMs
POST /completions - completions endpoint
POST /embeddings - embedding endpoint for Azure, OpenAI, Huggingface endpoints
GET /models - available models on server
POST /key/generate - generate a key to access the proxy
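
As a minimal sketch of steps (a) and (b), assuming the proxy runs at http://0.0.0.0:8000 and using a hypothetical LITELLM_PROXY_URL environment variable as the toggle, the backend could list the proxy's models through the OpenAI-compatible client (client.models.list() issues GET /models against the configured base_url):

import os

import openai

# Hypothetical variable name: step (a), the user enables the proxy by setting it.
proxy_url = os.getenv("LITELLM_PROXY_URL")  # e.g. "http://0.0.0.0:8000"

if proxy_url:
    client = openai.OpenAI(api_key="anything", base_url=proxy_url)
    # Step (b): GET /models on the proxy returns the configured model aliases.
    available_models = [model.id for model in client.models.list()]
    # Step (c): feed available_models into the frontend's selection box.
    print(available_models)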

@ishaan-jaff

@ishaan-jaff Thanks a lot! Just to clarify, we simply need two new environment variables to be able to use LiteLLM, correct?

One variable*
You just need to set the api_base to the LiteLLM proxy.
doc: https://docs.litellm.ai/docs/proxy/quick_start#using-with-openai-compatible-projects

import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:8000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)

@thomashacker
Collaborator

Thanks everyone! We added an environment variable for the proxy BASE_URL; if set, the OpenAI Generator will use the proxy. We want to make changes to the UI and improve how users interact with environment variables in the future, but unfortunately not right now. 🚀
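
For reference, a minimal sketch of the described behaviour, assuming a hypothetical OPENAI_BASE_URL environment variable (the actual variable name used by the OpenAI Generator may differ):

import os

import openai

base_url = os.getenv("OPENAI_BASE_URL")  # hypothetical name; may be unset

client = openai.OpenAI(
    api_key=os.getenv("OPENAI_API_KEY", "anything"),
    # If the proxy base URL is set, requests go to the proxy; otherwise the
    # client falls back to the default https://api.openai.com/v1 endpoint.
    base_url=base_url or None,
)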

@tan-yong-sheng

Thanks everyone! We added an environment variable for the proxy BASE_URL; if set, the OpenAI Generator will use the proxy. We want to make changes to the UI and improve how users interact with environment variables in the future, but unfortunately not right now. 🚀

Hi @thomashacker, thanks for this wonderful feature. I wanted to check with you whether the embedding models in LiteLLM are supported as well. Thanks a lot!
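
For what it's worth, the proxy's POST /embeddings endpoint (listed above) is exposed through the same OpenAI-compatible client, so a minimal sketch would look like the following; the model alias is an assumption and must match one configured on the LiteLLM proxy:

import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

response = client.embeddings.create(
    model="text-embedding-ada-002",  # hypothetical alias configured on the proxy
    input="this is a test sentence",
)

print(len(response.data[0].embedding))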
