
feature request: add llama.cpp support for local inference like privategpt #111

Open
x-legion opened this issue May 19, 2023 · 6 comments

@x-legion

For people like me who will never use a closed-source tool for anything.

@RobRoyce
Member

Great idea, I'll look into it 👍

@harrellbm

https://github.com/Josh-XT/AGiXT

This is another project that has multi-model support. Not sure if you could lift something from it.

@harrellbm

@RobRoyce Also wouldn't mind looking into helping with those integrations if you have some pointers on where to look in the code.

@RobRoyce
Member

@harrellbm I would definitely appreciate the collaboration 👍

One of the biggest obstacles is not being able to (easily) run Python given the cross-platform nature of the app. I have a strong suspicion that Docker will be required to expand the model selection and feature set while still remaining local-only.

Currently, the single point of entry for OpenAI is this line in the ChatController. The easiest way to test different models would be to replace this line with a different API call.
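
For anyone who wants to experiment, here is a rough sketch of what that swap could look like, assuming a local OpenAI-compatible server on localhost:8080 (the URL, port, model name, and payload shape here are illustrative assumptions, not the actual ChatController code):

```ts
// Hypothetical drop-in replacement for the OpenAI call in the ChatController.
// Assumes Node 18+ (global fetch) and a local OpenAI-compatible server
// (e.g. a llama.cpp-based one) listening on localhost:8080.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

async function localChatCompletion(messages: ChatMessage[]): Promise<string> {
  const response = await fetch('http://localhost:8080/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model: 'local-model', messages, temperature: 0.7 }),
  });
  if (!response.ok) {
    throw new Error(`Local model server returned ${response.status}`);
  }
  // OpenAI-compatible servers return the reply at choices[0].message.content
  const data = (await response.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0].message.content;
}
```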

On the use of Docker: I have already started a repo that includes Apache Tika for text extraction of arbitrary local files. I will work on updating it to include a simple Python server with some LangChain functionality ASAP. That should give us a base image to work with for adding other models.
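
As a rough sketch of how the app could talk to that container: Tika's server mode exposes a `PUT /tika` endpoint that returns extracted plain text. Assuming the default Tika port 9998 and Node 18+ for global fetch:

```ts
import { readFile } from 'node:fs/promises';

// Minimal sketch: send a local file to a Tika server container and get
// plain text back. Assumes Tika is running on its default port 9998.
async function extractText(filePath: string): Promise<string> {
  const fileBytes = await readFile(filePath);
  const response = await fetch('http://localhost:9998/tika', {
    method: 'PUT',
    headers: { Accept: 'text/plain' },
    body: fileBytes,
  });
  if (!response.ok) {
    throw new Error(`Tika returned ${response.status}`);
  }
  return response.text();
}
```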

@harrellbm

@RobRoyce Sounds like a plan! I don't have a ton of time to code anymore with work and family, but I definitely want to help where I can!

Just to get my head in the right place: the basic backend seems to be an Electron app using Express servers, if I am correct? I know there is a lot more to it than that, but I just want to get the basic idea of the architecture right in my head to start with. I have worked for a long time on a project called Superalgos. I can see there being some cool integrations between the projects, but I also really want to use Knowledge myself!

@severian42

So I've been trying to work on this problem and think I have found a solution, but I'm not nearly skilled enough to pull it off. There is a new project called Llamafile (https://github.com/Mozilla-Ocho/llamafile) that serves llama.cpp LLMs across multiple platforms in an already-wrapped server package. This got me thinking that it may be possible to ship it alongside Knowledge Canvas and use it to run all the chat completions locally. Someone even threw together an Electron-based starter (https://github.com/swkidd/react-electron-llamafile-starter/tree/main).
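
To sketch the shape of that integration (the binary path and server flags below are assumptions for illustration; llamafile exposes an OpenAI-compatible API on localhost once its built-in server is up, which the fetch sketch earlier in this thread could then talk to):

```ts
import { spawn, ChildProcess } from 'node:child_process';

// Minimal sketch: launch a llamafile binary shipped next to the Electron app
// so its built-in server can handle chat completions locally. The flags and
// port are assumptions for illustration, not taken from the starter repo.
function startLlamafile(binaryPath: string, port = 8080): ChildProcess {
  const proc = spawn(binaryPath, ['--server', '--nobrowser', '--port', String(port)], {
    stdio: 'ignore',
  });
  proc.on('error', (err) => console.error('Failed to start llamafile:', err));
  return proc; // remember to call proc.kill() when the app quits
}
```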

I've been trying to integrate it myself but can't seem to pull it off, as I'm not really familiar with this framework and programming is not my strongest skill. Would anyone here be willing to give it a go and see if they have better success?
