Support for fastLLaMa? #575
Comments
Let's make this thread the meta for this.
Please correct me if I'm misunderstanding, but fastLLaMa looks less like a wrapper and more like a fork of llama.cpp, following pull #370. If that's in fact the case, everyone should keep in mind that, much like this project, llama.cpp is developing at breakneck speed. They, too, have recently shipped changes that required model re-conversion a couple of times. If the project were truly a wrapper, this would be something to keep an eye on but manageable. Being a fork, however, you'd be relying on PotatoSpudowski et al. to pull any upstream changes downstream. Until a given change is downstreamed, models would be incompatible between fastLLaMa and llama.cpp. This would have the unfortunate consequence of adding yet another class of model to the pool of conversions + LoRA packs + formats floating around, which is already... large (USBHost can attest to community fatigue with prolific model branching).

Again, I might be misunderstanding a good amount here, but after a little reading on the shared library approach, I'm wondering if someone more knowledgeable could weigh in: is there any credence to instead including llama.cpp as a direct build requirement in this project, with only minimal additional interfacing? A rough sketch of what I mean is below.

edit: After reading the fastLLaMa code a bit more, I still feel similarly, but I prefer the approach of adding llama.cpp to the requirements rather than vendoring a static copy inside the project. The bridge.cpp file is probably still entirely valid.
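To make the "direct build requirement with minimal interfacing" idea concrete, here is a minimal sketch of binding a compiled llama.cpp shared library from Python via ctypes. To be clear, this is illustrative only: the library path and the function signatures declared here are assumptions, not llama.cpp's actual C API, which should be verified against llama.h at whatever revision gets pinned.

```python
# Minimal sketch of binding a llama.cpp shared library from Python via ctypes.
# Assumptions: llama.cpp was built as libllama.so, and the C API resembles the
# hypothetical signatures declared below -- verify against llama.h before use.
import ctypes

lib = ctypes.CDLL("./libllama.so")  # path assumed; adjust to your build output

# Declare argument/return types for the (assumed) C entry points.
lib.llama_init_from_file.argtypes = [ctypes.c_char_p]
lib.llama_init_from_file.restype = ctypes.c_void_p

lib.llama_free.argtypes = [ctypes.c_void_p]
lib.llama_free.restype = None

def load_model(path: str):
    """Load a ggml model file and return an opaque context handle."""
    ctx = lib.llama_init_from_file(path.encode("utf-8"))
    if not ctx:
        raise RuntimeError(f"failed to load model: {path}")
    return ctx

ctx = load_model("./models/7B/ggml-model-q4_0.bin")  # model path assumed
# ... tokenize / eval / sample calls would go here, mirroring llama.h ...
lib.llama_free(ctx)
```

The point of the sketch is that the Python side stays thin: if llama.cpp changes its model format, you rebuild the pinned dependency rather than waiting on a fork to catch up.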
In ggerganov/llama.cpp#370 it is said that llama.cpp now has its own C API, but there is no documentation for it anywhere.
Hi @oobabooga, if this is something you are still interested in, I can help in any way possible :) BTW, love the project!!!
@PotatoSpudowski just for your information, ooba eventually moved forward with llama-cpp-python. Ooba, this issue might be worth closing?
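For context, llama-cpp-python wraps the llama.cpp model behind a small Python class. A minimal usage sketch, with the model path assumed, looks like this:

```python
# Minimal llama-cpp-python usage sketch; the model path is an assumption.
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin")  # path assumed
output = llm(
    "Q: Name the planets in the solar system. A: ",
    max_tokens=64,       # cap the generation length
    stop=["Q:", "\n"],   # stop sequences to end the answer cleanly
    echo=False,          # don't repeat the prompt in the output
)
print(output["choices"][0]["text"])
```

Because it ships llama.cpp as a pinned dependency rather than a fork, it sidesteps the downstreaming concern raised earlier in this thread.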
Description
https://github.com/PotatoSpudowski/fastLLaMa
It's a Python wrapper around the llama.cpp implementation. I feel that integrating it would be easier than using llama.cpp directly. A rough sketch of what such an integration could look like follows.
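A hypothetical sketch of what wrapper-style integration could look like is below. The module name, class, parameters, and paths are all assumptions made for illustration; consult the fastLLaMa README for its actual interface.

```python
# Hypothetical sketch of a fastLLaMa-style integration; names and parameters
# are assumptions for illustration, not fastLLaMa's documented API.
from fastllama import Model  # module and class names assumed

model = Model(
    path="./models/7B/ggml-model-q4_0.bin",  # model path assumed
    num_threads=8,                           # parameter name assumed
)

# Stream tokens back to the caller as they are generated.
for token in model.generate("Building a website can be done in 10 steps:"):
    print(token, end="", flush=True)
```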
Additional Info
This would be useful for running larger models like llama-65B, whose VRAM requirements put them out of reach of most GPUs; llama.cpp runs them on the CPU in system RAM instead.