
Connecting to VLLM OpenAI API Compatible Server #123

Closed
JoshuaFurman opened this issue Mar 8, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@JoshuaFurman

In my lab environment I am serving Mixtral with vLLM, using their OpenAI API-compatible server, and I'm hosting a Weaviate instance as well.

I just spun up Verba, pointing it to both my Weaviate instance and my vLLM instance via the .env file. The connection to Weaviate seems fine: I can see my schema and object count in the status tab, but any queries I make seem to break. I'm unsure whether this is a limitation, i.e. whether Verba can only handle GPT-3.5 or GPT-4 served from OpenAI.

Has anyone been able to configure a setup like this?

Thanks!
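For reference, the setup described above hinges on a handful of .env values pointing Verba at local endpoints. A sketch of the shape they take, with illustrative variable names and addresses (check Verba's own documentation for the exact keys it reads):

```
# Illustrative names/values only -- not necessarily the exact keys Verba reads
WEAVIATE_URL=http://localhost:8080         # local Weaviate instance
OPENAI_API_KEY=not-needed-for-local-vllm   # vLLM typically does not validate the key
OPENAI_BASE_URL=http://localhost:8000/v1   # vLLM OpenAI-compatible endpoint
```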

@JoshuaFurman
Author

It also looks as though you are boxed in to using either the ADA embeddings from OpenAI, MiniLM, or Cohere...

@thomashacker thomashacker added the investigating Bugs that are still being investigated whether they are valid label Apr 11, 2024
@thomashacker
Collaborator

Yes, right now there is no support for Mixtral models. But good point, we'll look into it for the next update.

@thomashacker thomashacker added enhancement New feature or request and removed investigating Bugs that are still being investigated whether they are valid labels Apr 11, 2024
@samos123
Contributor

samos123 commented May 8, 2024

I got this working end to end, but I had to make some changes to be able to use my custom embedding model server. I submitted a PR with the changes needed to use an OpenAI-compatible API server for both embeddings and the LLM: #148

I plan to publish an end to end tutorial that runs on K8s to install Verba, Weaviate, an LLM and an embedding model server all within the same K8s cluster. Stay tuned!
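To make the "OpenAI-compatible" part concrete: compatibility just means the LLM and embedding endpoints accept the standard OpenAI request shapes, so any client that can build those JSON bodies works. A minimal stdlib-only sketch of the two request shapes (the base URL and model names are assumptions for a local lab setup, not values from this thread):

```python
import json

# Base URL of a locally hosted OpenAI-compatible server such as vLLM
# (host, port, and model names below are illustrative assumptions).
BASE_URL = "http://localhost:8000/v1"

def chat_request(prompt: str,
                 model: str = "mistralai/Mixtral-8x7B-Instruct-v0.1") -> dict:
    """Build the JSON body for POST {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def embedding_request(text: str,
                      model: str = "BAAI/bge-small-en-v1.5") -> dict:
    """Build the JSON body for POST {BASE_URL}/embeddings."""
    return {"model": model, "input": text}

# Example: serialize a chat request body as it would go over the wire.
body = chat_request("What is in my Weaviate schema?")
print(json.dumps(body))
```

A client configured with this base URL would POST the first body to `{BASE_URL}/chat/completions` and the second to `{BASE_URL}/embeddings`; the point of the PR above is letting Verba send both to a server you choose.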

@samos123
Contributor

samos123 commented May 9, 2024

I finished writing my guide to an end-to-end private Verba RAG setup using Weaviate, Lingo, vLLM + Mistral 7b v2, and Sentence Transformers: https://www.substratus.ai/blog/lingo-weaviate-private-rag

Looking forward to hearing feedback. The guide should also help you figure out how to use vanilla vLLM with Verba.
