Add initial support for Zephyr 7b Beta #41
- add nx dep
- make Message struct editable through changeset
  - helps with UI
- initial streaming support works
Zephyr 7B Beta does NOT support function calling. It doesn't understand how to do it and has not been trained for it. There are alternative models that fine-tune Zephyr for function calling, but those have licensing problems: they were trained using OpenAI, which violates its terms of use.
- raise an error when functions are added but the model does not support them
* main: (22 commits)
  prep for v0.1.6
  Fix Req retry delay
  updated for v0.1.5 release
  updated for v0.1.5 release
  update to 0.1.5
  upgrade Req to v0.4.8 - contains a retry fix
  fix: remove unecessary api_key from json payload
  updated changelog
  prep for new v.0.1.4 release
  document overriding the api endpoint
  allow overriding OpenAI compatible API endpoint
  Update Req to 0.4.7
  Update Req to version 0.46
  expanded comment
  Pass api_key to request if present in chat
  Allow passing api_key to ChatOpenAI
  added Utils.ChainResult module - helper functions for working with an LLMChain's result value
  preparation for v0.1.3 release
  Lessen retry delay to 300ms
  Add retry strategy to OpenAI Chat API requests
  ...
Have you thought about any ways to support I haven't tried this yet, but just wanted to throw the idea out there.
@acalejos Yes! As you probably know by now, I interviewed Thomas Millar about InstructorEx in the episode that came out today. The challenge is that Instructor doesn't work with Bumblebee yet; it relies on llama.cpp's ability to restrict the output grammar, forcing the output into a compliant JSON structure. I'm very interested in the work going on there and in this direction. It's very cool.
* main: (27 commits)
  fixed documentation warning
  updated changelog
  prep for v0.1.7 release
  retry connection when underlying mint connection closed - does a limited retry count of 3
  be more permissive with ecto dep
  updated deps
  updated ex_doc
  fix: rebase and integrate merge conflicts
  feat: add unit tests, fix errors
  feat: streaming support for Google AI
  feat: Google AI support without streaming.
  updated to use req streaming api - detects Mint :closed error and does a retry which worked in local tests
  added test for expected response from streamed response body
  Update ecto 3.10.3 -> 3.11.1
  Cleanup non-api test warning output
  link UI display text for a function to the function itself
  cleanup
  add new RoutingChain with PromptRoute - important for more complex assistants - first pass operation classifies which direction the user's prompt should go - return the desired chain for performing the user's request
  ChatOpenAI update for fake API responses - support returning fake error responses
  add TextToTitleChain - simple helper chain for summarizing a user's prompt into a title
  ...
* main:
  handle receiving JSON data broken up over multiple messages
  updated changelog
  update for v0.1.8 release
  code formatting
  Add mistral chat
  updated changelog
  updated changelog
  doc updates
  breaking change for routing_chain - RoutingChain now takes a default_route instead of default_chain - takes the default route's name into account in the generated LLM prompt - returns the selected route instead of route.chain
  Add max_tokens option for OpenAI calls.
  Add clause to match call_response spec
  Update lib/chat_models/chat_ollama_ai.ex
  Add support for Ollama open source models
- includes tests
- some docs included for serving settings
This is for running the model directly on hardware using Nx and Bumblebee.
The Zephyr 7B beta LLM doesn't have all the capabilities of ChatGPT, nor the safeguards.
What works:
What doesn't work:
Closes #26