feature request: add llama.cpp support for local inference like privategpt #111
Comments
Great idea, I'll look into it 👍
https://github.com/Josh-XT/AGiXT This is another project that has multi-model support. Not sure if you could lift something from here.
@RobRoyce I also wouldn't mind helping with those integrations if you have some pointers on where to look in the code.
@harrellbm I would definitely appreciate the collaboration 👍 One of the biggest obstacles is not being able to (easily) run Python, given the cross-platform nature of the app. I have a strong suspicion that Docker will be required to expand the model selection and feature set while still remaining local-only. Currently there is a single point of entry for OpenAI in the code.

On the use of Docker: I have already started a repo that includes Apache Tika for text extraction of arbitrary local files. I will work on updating it to include a simple Python server with some LangChain functionality ASAP. That should give us a base image to work with for adding other models.
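Since the OpenAI call is a single point of entry, much of the work may come down to making the base URL configurable so it can point at a local, Docker-hosted server instead of api.openai.com. A minimal sketch, assuming the `openai` Node SDK is (or could be) used; the `LOCAL_LLM_URL` and `LOCAL_LLM_MODEL` variable names are made up for illustration:

```typescript
// Sketch: route the app's single OpenAI entry point to any
// OpenAI-compatible server (e.g. one running in a local container).
import OpenAI from "openai";

const client = new OpenAI({
  // Hypothetical env var; falls back to the real OpenAI API.
  baseURL: process.env.LOCAL_LLM_URL ?? "https://api.openai.com/v1",
  // Local servers typically ignore the key, but the SDK requires one.
  apiKey: process.env.OPENAI_API_KEY ?? "sk-no-key-required",
});

export async function chat(prompt: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: process.env.LOCAL_LLM_MODEL ?? "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
  });
  return res.choices[0]?.message?.content ?? "";
}
```

With something like this in place, swapping backends becomes a configuration change rather than a code change, which should make the Docker route much easier to iterate on.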
@RobRoyce Sounds like a plan! I don't have a ton of time to code anymore with work and family, but I definitely want to help where I can! Just to get my head in the right place: the basic backend seems to be an Electron app using Express servers, if I am correct? I know there is a lot more to it than that, but I just want to get the basic architecture right in my head to start with. I have worked for a long time on a project called Superalgos. I can see there being some cool integrations between the projects, but I also really want to use Knowledge myself!
So I've been trying to work on this problem and think I have found a solution, but I am not nearly skilled enough to pull it off. There is a new tool called Llamafile (https://github.com/Mozilla-Ocho/llamafile) that serves llama.cpp LLMs across multiple platforms in an already-wrapped server package. This got me thinking that it may be possible to ship this alongside Knowledge Canvas and use it to run all the chat completions locally. Someone even threw together an Electron-based starter (https://github.com/swkidd/react-electron-llamafile-starter/tree/main). I've been trying to integrate it myself but can't seem to pull it off, as I am not really familiar with this framework and programming is not my strongest skill. Would anyone here be willing to give it a go and see if they have better success?
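For anyone picking this up, here is a rough sketch of what the integration might look like: spawn the llamafile binary from Electron's main process, then talk to its OpenAI-compatible HTTP endpoint. The bundled binary path, the port, and the `--server`/`--nobrowser` flags are assumptions based on the llamafile README, not a tested setup:

```typescript
// Sketch of driving a llamafile server from Electron's main process.
import { spawn, ChildProcess } from "child_process";
import path from "path";

let server: ChildProcess | null = null;

export function startLlamafile(): void {
  // Hypothetical location of a llamafile bundled with the app.
  const bin = path.join(process.resourcesPath, "model.llamafile");
  // Flags assumed from the llamafile README: run headless on port 8080.
  server = spawn(bin, ["--server", "--nobrowser", "--port", "8080"], {
    stdio: "ignore",
  });
}

export function stopLlamafile(): void {
  server?.kill();
  server = null;
}

// llamafile exposes an OpenAI-compatible endpoint, so chat completions
// can be requested with a plain fetch once the server is up.
export async function localChat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local", // the local server serves whatever model it was built with
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = (await res.json()) as any;
  return data.choices?.[0]?.message?.content ?? "";
}
```

Alternatively, the SDK-based approach sketched earlier in this thread could simply be pointed at `http://localhost:8080/v1` once the llamafile process is running, so both ideas converge on the same OpenAI-compatible interface.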
This would be great for people like me who will never use closed-source tools for anything.