feature request: add llama.cpp support for local inference like privategpt #111
Comments
Great idea, I'll look into it 👍
https://github.com/Josh-XT/AGiXT This is another project that has multi-model support. Not sure if you could lift something from here.
@RobRoyce I also wouldn't mind helping with those integrations if you have some pointers on where to look in the code.
@harrellbm I would definitely appreciate the collaboration 👍 One of the biggest obstacles is not being able to (easily) run Python, given the cross-platform nature of the app. I have a strong suspicion that Docker will be required to expand the model selection and feature set while still remaining local-only. Currently there is a single point of entry for OpenAI in the code.

On the use of Docker: I have already started a repo that includes Apache Tika for text extraction of arbitrary local files. I will work on updating it to include a simple Python server with some LangChain functionality ASAP. That should give us a base image to work with for adding other models.
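Since the OpenAI call is a single point of entry, much of the work may come down to making the base URL configurable so it can point at a local, Docker-hosted server instead of api.openai.com. A minimal sketch, assuming the `openai` Node SDK is (or could be) used; the `LOCAL_LLM_URL` and `LOCAL_LLM_MODEL` variable names are made up for illustration:

```typescript
// Sketch: route the app's single OpenAI entry point to any
// OpenAI-compatible server (e.g. one running in a local container).
import OpenAI from "openai";

const client = new OpenAI({
  // Hypothetical env var; falls back to the real OpenAI API.
  baseURL: process.env.LOCAL_LLM_URL ?? "https://api.openai.com/v1",
  // Local servers typically ignore the key, but the SDK requires one.
  apiKey: process.env.OPENAI_API_KEY ?? "sk-no-key-required",
});

export async function chat(prompt: string): Promise<string> {
  const res = await client.chat.completions.create({
    model: process.env.LOCAL_LLM_MODEL ?? "gpt-3.5-turbo",
    messages: [{ role: "user", content: prompt }],
  });
  return res.choices[0]?.message?.content ?? "";
}
```

With something like this in place, swapping backends becomes a configuration change rather than a code change, which should make the Docker route much easier to iterate on.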
@RobRoyce Sounds like a plan! I don't have a ton of time to code anymore with work and family, but I definitely want to help where I can! Just to get my head in the right place: the basic backend seems to be an Electron app using Express servers, if I am correct? I know there is a lot more to it than that, but I just want to get the basic architecture right in my head to start with. I have worked for a long time on a project called Superalgos. I can see there being some cool integrations between the projects, but I also really want to use Knowledge myself!
So I've been trying to work on this problem and think I have found a solution, but I am not nearly skilled enough to pull it off. There is a new tool called Llamafile (https://github.com/Mozilla-Ocho/llamafile) that serves llama.cpp LLMs across multiple platforms in an already-wrapped server package. This got me thinking that it may be possible to ship this alongside Knowledge Canvas and use it to run all the chat completions locally. Someone even threw together an Electron-based starter (https://github.com/swkidd/react-electron-llamafile-starter/tree/main). I've been trying to integrate it myself but can't seem to pull it off, as I am not really familiar with this framework and programming is not my strongest skill. Would anyone here be willing to give it a go and see if they have better success?
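For anyone picking this up, here is a rough sketch of what the integration might look like: spawn the llamafile binary from Electron's main process, then talk to its OpenAI-compatible HTTP endpoint. The bundled binary path, the port, and the `--server`/`--nobrowser` flags are assumptions based on the llamafile README, not a tested setup:

```typescript
// Sketch of driving a llamafile server from Electron's main process.
import { spawn, ChildProcess } from "child_process";
import path from "path";

let server: ChildProcess | null = null;

export function startLlamafile(): void {
  // Hypothetical location of a llamafile bundled with the app.
  const bin = path.join(process.resourcesPath, "model.llamafile");
  // Flags assumed from the llamafile README: run headless on port 8080.
  server = spawn(bin, ["--server", "--nobrowser", "--port", "8080"], {
    stdio: "ignore",
  });
}

export function stopLlamafile(): void {
  server?.kill();
  server = null;
}

// llamafile exposes an OpenAI-compatible endpoint, so chat completions
// can be requested with a plain fetch once the server is up.
export async function localChat(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "local", // the local server serves whatever model it was built with
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = (await res.json()) as any;
  return data.choices?.[0]?.message?.content ?? "";
}
```

Alternatively, the SDK-based approach sketched earlier in this thread could simply be pointed at `http://localhost:8080/v1` once the llamafile process is running, so both ideas converge on the same OpenAI-compatible interface.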
This would be great for people like me who will never use closed-source tools for anything.