Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... #7809

Merged
merged 53 commits into from
Jun 12, 2024

Conversation

ochafik
Copy link
Collaborator

@ochafik ochafik commented Jun 6, 2024

This renames binaries to have a consistent llama- prefix.

  • mainllama-cli
  • any-other-examplellama-any-other-example, e.g:
    • llama-server,
    • llama-baby-llama,
    • llama-llava-cli,
    • llama-simple
    • ...
  • (rpc-server = only exception, name unchanged - see below)

Goal is to make any scripts / docs work across all the ways to install (cmake, brew, rpm, nix...).

For RPM-based distros, the cuda & clblast variants' prefix is llama-cuda- / llama-clblast-.

The Homebrew formula will be updated separately (llamallama-cli).

Note: can install a local build w/:

cmake -B build -DLLAMA_CURL=1 && cmake --build build -j && cmake --install build
# Uninstall: ( cd build && xargs rm < install_manifest.txt )

See discussion below about naming options considered.

@github-actions github-actions bot added documentation Improvements or additions to documentation script Script related nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment examples python python script changes devops improvements to build systems and github actions server SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels Jun 6, 2024
@ochafik ochafik removed documentation Improvements or additions to documentation nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment python python script changes devops improvements to build systems and github actions SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels Jun 6, 2024
@github-actions github-actions bot added documentation Improvements or additions to documentation nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment python python script changes devops improvements to build systems and github actions SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language labels Jun 6, 2024
server Outdated Show resolved Hide resolved
@ngxson
Copy link
Collaborator

ngxson commented Jul 2, 2024

Personally I don't think there are any bad things about communications in this PR.

Llama.cpp has high velocity development, much like any other open-source project out there, so breaking changes are expected. We already had dedicated section on README to note about this and other breaking changes.

because I don't "make clean." I don't know what that is, nor is it mentioned in the readme

If this doesn't change, you will drive away users.

That's the reason why you shouldn't get mad. Llama.cpp is dev-oriented so will not as friendly as friendly GUI apps out there like jan or lmstudio. We assume that people using llama.cpp at least have basic knowledge about compiling. If you don't know something, then now you have a reason to ask & to learn about that, rather than making complaints.

Finally, I think we should give some credits to devs who dedicated their time to work on big changes like this one. It's not just taking energy to write the code, but also to communicate in a professional & friendly way.

NoumaanAhamed added a commit to NoumaanAhamed/Qwen2 that referenced this pull request Jul 3, 2024
Llama.cpp just updated their program names, I've updated the article to use the new name.

quantize -> llama-quantize
main -> llama-cli
simple -> llama-simple

[Check out the PR](ggerganov/llama.cpp#7809)
@oldgithubman
Copy link

oldgithubman commented Jul 3, 2024

Personally I don't think there are any bad things about communications in this PR.

Then why are there so many reports like mine? Not necessarily official reports, mind you (although I think I saw a couple official issues). Check reddit, for example. It's not just me. Also, @crashr predicted the problem early on. He was absolutely right and should have been taken more seriously.

Llama.cpp has high velocity development, much like any other open-source project out there

I don't see the correlation with speed and open-source. There are a full spectrum of development methodologies in both closed and open source. Fast and slow.

so breaking changes are expected

This should not be an excuse to disregard user experience. In this case (and others - again, a pattern here), the change was unnecessary and broke the user experience in a way that cost a lot of people a lot of resources. One person on reddit has been struggling for the past month. The breaking change itself wasn't the problem, perse. The problem was that the impact of the change could have been easily mitigated with the slightest effort. It's the same attitude I've been calling out for a while now. Too many devs simply care too little about the users, and I think you're illustrating my point.

We already had dedicated section on README to note about this and other breaking changes.

Which demonstrably wasn't good enough. The readme is notoriously unreliable.

because I don't "make clean." I don't know what that is, nor is it mentioned in the readme

If this doesn't change, you will drive away users.

That's the reason why you shouldn't get mad. Llama.cpp is dev-oriented so will not as friendly as friendly GUI apps out there like jan or lmstudio.

The readme details what you need to do to use llama.cpp, including "make." If the users are expected to do something different to not get screwed, it would make sense for the readme to reflect that. If you think all your users are supposed to know "make clean," then why do you tell them how to use "make?" This is obviously confusing and inconsistent. "Llama.cpp is dev-oriented so will not as friendly" is an excuse and a bad one at that. I'm not "mad" because you don't have a GUI. Maybe the readme should say in bold letters across the top, "If you don't know what 'make clean' is, f off."

We assume that people using llama.cpp at least have basic knowledge about compiling.

If that was true, you wouldn't tell people the basics of compilation in the readme. Also, the implication that I don't have basic knowledge about compiling is presumptuous and without merit. I compiled my first linux kernel over 20 years ago and have compiled thousands of programs since.

If you don't know something, then now you have a reason to ask & to learn about that, rather than making complaints.

You'll note I did ask about it and no one has responded. Also, I don't see how that precludes me from providing feedback. This is yet another case of someone criticizing me without even bothering to read my posts. You're telling me to do the very thing I've already done. You'll forgive me for not taking you very seriously moving forward, considering the smell of bad faith and elitism.

Finally, I think we should give some credits to devs who dedicated their time to work on big changes like this one.

Speak for yourself. I'm not in the habit of crediting moves that cause the community and me so much trouble for no benefit. Also, I guess I fail to see how renaming a bunch of binaries and breaking everyone's workflows (including organizations like huggingface) is a good thing worthy of praise. But I'm just a user, so what could I possibly know, right?

It's not just taking energy to write the code, but also to communicate in a professional & friendly way.

You're out of touch.

All that said, I do want to thank the devs for all the good work they do, as I often do. But for things to improve, criticism is required (tough love). I understand that's surprising to a lot of people these days.

Edit - I especially want to thank @HanClinto . Most devs could learn a lot from him. MVP

@crashr
Copy link
Contributor

crashr commented Jul 3, 2024

@oldmanjk You're not entirely wrong. I recognized the problem because I saw the old binary name in a recent screenshot on Reddit and wondered how that could happen. I generally welcome the change to the binary names, but I also don't see any way of preventing someone from falling foul of this. If, for example, the old binaries had been deleted via makefile, some pipeline somewhere would certainly have broken a productive system. Ok, at least one would have noticed it immediately. Ultimately, one should read the release notes before making any changes like compiling a new binary. Yes, very few of us do. But we should.

@HanClinto
Copy link
Collaborator

@crashr

I recognized the problem because I saw the old binary name in a recent screenshot on Reddit and wondered how that could happen. I generally welcome the change to the binary names, but I also don't see any way of preventing someone from falling foul of this.

Yeah, this is a bit concerning to me, and I'm trying to think of what more we can do to mitigate this fallout.

I've put some suggestions here, but I'm currently leaning towards:

  1. The second option is to add a minimalistic program that builds to the old binary names, but simply outputs a friendly error message that says that the binary names have changed, and to please update scripts and routines to use the new filenames.

There is more discussion in that PR, but I like the idea of overwriting the old binaries (at least for things like main, the server, and maybe quantization commands) so that they don't linger around, and provide people a nice breakpoint so that errors don't continue silently.

@oldmanjk

I especially want to thank @HanClinto .

Thanks for the shoutout, though it takes all kinds to move a project of this size forward, and I'm just a very small part. Nobody who works on this project takes breaking changes like this casually, and -- despite how it may appear from another vantage -- I hope you trust me when I say that nothing in this migration change was done recklessly or without consideration (the pages and pages of discussion at the beginning of the PR should illustrate this). I hope you'll give us all credit for that -- it's messy, difficult, and there are a lot of tradeoffs to balance, and sometimes we adapt on the fly (which is what we're doing here).

This name change should tide us over for a while, and hopefully we won't need another one until we move to the supercommand (and that one should be more backwards-compatible).

@oldgithubman
Copy link

You're not entirely wrong.

What am I wrong about?

I also don't see any way of preventing someone from falling foul of this.

There are many ways. @HanClinto and I are discussing a few in another thread.

Ok, at least one would have noticed it immediately.

Exactly. Although there are better ways.

Ultimately, one should read the release notes before making any changes like compiling a new binary.

I usually do when I'm installing releases. That's not what's happening here though. We're not compiling releases.

HanClinto added a commit to HanClinto/llama.cpp that referenced this pull request Jul 3, 2024
…st to help people notice the binary name change from ggerganov#7809 and migrate to the new filenames.
@crashr
Copy link
Contributor

crashr commented Jul 3, 2024

What am I wrong about?

The communication thing. What else could have be done but write it in big letters at the top?

@HanClinto
Copy link
Collaborator

What else could have be done but write it in big letters at the top?

Well, there are a few things, though some are most clear in hindsight.

One suggestion is implemented in #8283 -- please give it a look-see! :)

@oldgithubman
Copy link

Thanks for the shoutout

You're welcome. You deserved it.

I'm just a very small part.

Disagree. I'm going to assume you're just being humble, because based on what I've seen so far, you're intelligent enough to understand your true value.

Nobody who works on this project takes breaking changes like this casually

I understand what you're trying to do here, but strong disagree.

I hope you trust me

I do when you're not lying. I don't think you believe what you're saying. Again, I understand what you're trying to do, but it's misguided.

when I say that nothing in this migration change was done recklessly or without consideration

This migration change was absolutely done recklessly. There was consideration, but it wasn't taken seriously.

(the pages and pages of discussion at the beginning of the PR should illustrate this)

Disagree. They illustrate what I'm saying. Also, and more importantly, the results illustrate what I'm saying.

I hope you'll give us all credit for that

I give credit where credit is due. Like how we used to back in the day (as a society). The "everyone gets a trophy" mentality is more destructive than most people realize.

it's messy

When you make it that way. It certainly didn't have to be.

difficult

I understand software development is difficult, but I disagree that what was done here was difficult. Changing the names of binaries is not difficult. Changing the names of binaries without screwing the userbase is a little more difficult, but still not difficult. The review-complexity label on your new PR being "low" is evidence towards this point. You being able to instantly come up with three perfectly good solutions off the top of your head is evidence towards this point.

and there are a lot of tradeoffs to balance

The scale needs new batteries.

and sometimes we adapt on the fly (which is what we're doing here).

I lit a fire and you responded like an MVP with the best attitude here. That's what happened. Credit where credit is due. Just take the compliment. Also credit to @crashr for his excellent warning, even though it wasn't taken seriously.

This name change should tide us over for a while, and hopefully we won't need another one until we move to the supercommand (and that one should be more backwards-compatible).

Let's prepare for that

@HanClinto
Copy link
Collaborator

You're not entirely wrong.

What am I wrong about?

The personal deprecations and insinuations are getting tiresome. I'm glad that you appreciate my approach to customer service on bug reports, but aspersions cast at the other volunteers on this project (even if it's only by way of complimenting me to the isolation of others) is not helping me or anyone else. I'm a volunteer here, as is most everyone else, and we all need to strive to be professional and gracious.

A bit of grace and professional courtesy go a long way here -- it's a messy situation, and berating other devs with blame-and-shame is not going to help me do anything other than walk away for a while.

Thank you (and others) for the issue reports, but please recognize that continuing in this vein will wear me (and others) thin.

Repository owner deleted a comment from oldgithubman Jul 3, 2024
@HanClinto
Copy link
Collaborator

Nice censorship. You've lost my respect, "repo owner"

It's getting personal, off-topic, and starting to go in circles. If you want to continue this argument, there are more appropriate venues.

Thank you.

@oldgithubman
Copy link

You're not entirely wrong.

What am I wrong about?

The personal deprecations and insinuations are getting tiresome. I'm glad that you appreciate my approach to customer service on bug reports, but aspersions cast at the other volunteers on this project (even if it's only by way of complimenting me to the isolation of others) is not helping me or anyone else. I'm a volunteer here, as is most everyone else, and we all need to strive to be professional and gracious.

A bit of grace and professional courtesy go a long way here -- it's a messy situation, and berating other devs with blame-and-shame is not going to help me do anything other than walk away for a while.

Thank you (and others) for the issue reports, but please recognize that continuing in this vein will wear me (and others) thin.

This post is so off-base I'm not even going to bother anymore. My work here is done (for now)

@HanClinto
Copy link
Collaborator

HanClinto commented Jul 3, 2024

?That post might have been a mistake. Was I censored? I can't even remember. I edited before you responded

It was a repeat of a question aimed at Crashr. It's cluttering up this PR which is (currently) linked as the primary source for information about the name migration. Continuing in circles with personal arguments will make it more difficult for other users to sift through and find relevant information about the change in the future.

Because of the prominence of this PR, anything we can do to keep this thread focused and un-cluttered is good. If you'd like to continue, maybe moving things to a less prominent location would be best (perhaps #8283?).

Thank you.

jklj077 added a commit to QwenLM/Qwen2.5 that referenced this pull request Jul 4, 2024
* Update llama.cpp.rst

Llama.cpp just updated their program names, I've updated the article to use the new name.

quantize -> llama-quantize
main -> llama-cli
simple -> llama-simple

[Check out the PR](ggerganov/llama.cpp#7809)

* Updated llama.cpp.rst ( removed -cml )

* Update llama.cpp.rst

---------

Co-authored-by: Ren Xuancheng <[email protected]>
HanClinto added a commit that referenced this pull request Jul 9, 2024
* Adding a simple program to provide a deprecation warning that can exist to help people notice the binary name change from #7809 and migrate to the new filenames.

* Build legacy replacement binaries only if they already exist. Check for their existence every time so that they are not ignored.
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 13, 2024
…rganov#8283)

* Adding a simple program to provide a deprecation warning that can exist to help people notice the binary name change from ggerganov#7809 and migrate to the new filenames.

* Build legacy replacement binaries only if they already exist. Check for their existence every time so that they are not ignored.
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 13, 2024
…rganov#8283)

* Adding a simple program to provide a deprecation warning that can exist to help people notice the binary name change from ggerganov#7809 and migrate to the new filenames.

* Build legacy replacement binaries only if they already exist. Check for their existence every time so that they are not ignored.
@hgftrdw45ud67is8o89
Copy link

personally i don't think this change is needed? Because i thought the old scripts or docs is already stable? But definitely good to make it neat.

But,If its going to be renamed.
We might as well use less confusing name it would be better to just prepend "llamacpp" instead of "llama".
as llama commonly refers to the "llama" arch and not "llamacpp". It have the equal ambiguous level of "main" or "run" or "llm" or "mistral".

Did no one noticed this problem?

@HanClinto
Copy link
Collaborator

personally i don't think this change is needed? Because i thought the old scripts or docs is already stable? But definitely good to make it neat.

But,If its going to be renamed. We might as well use less confusing name it would be better to just prepend "llamacpp" instead of "llama". as llama commonly refers to the "llama" arch and not "llamacpp". It have the equal ambiguous level of "main" or "run" or "llm" or "mistral".

Did no one noticed this problem?

llamacpp was suggested on June 8th, and discussed further on June 9th. Final decision was made on June 10th.

I agree with you that llamacpp feels better (at least to me), but I'm still quite content with llama and I'm in support. Moving away from the main name is a good change.

@oldgithubman
Copy link

personally i don't think this change is needed?

Agreed. I can see some small benefit, but it certainly wasn't worth the damage done. Had it been done in a manner that didn't break anything (and yes, this was very possible), I would have supported it.

Because i thought the old scripts or docs is already stable?

Well, stable might be a stretch, but not breaking things is obviously more-stable.

But definitely good to make it neat.

If you do it without breaking things.

But,If its going to be renamed. We might as well use less confusing name it would be better to just prepend "llamacpp" instead of "llama".

Agreed, but IMO, not prepending everything with the same thing is even better. Even better is to not fix what isn't broken, so I think it's best to just leave it alone now.

as llama commonly refers to the "llama" arch and not "llamacpp". It have the equal ambiguous level of "main" or "run" or "llm" or "mistral".

Agreed. I don't think the usage of "llama" really makes sense at all anymore for this project, but again, better to leave it now.

Did no one noticed this problem?

It never made sense to me, but I assumed it was used for a good reason

@hgftrdw45ud67is8o89
Copy link

Thanks @HanClinto for giving me reference. My bad for not noticing /reading the full convo.
Its fine to keep it as it is. I just naively thought that no one suggested that idea or noticed the flaw.

mishig25 pushed a commit to huggingface/huggingface.js that referenced this pull request Aug 7, 2024
In this PR, I propose some changes:
- Update binary name to `llama-cli` (for more details, see this PR:
ggerganov/llama.cpp#7809 and this [homebrew
formula](https://github.com/Homebrew/homebrew-core/blob/03cf5d39d8bf27dfabfc90d62c9a3fe19205dc2a/Formula/l/llama.cpp.rb))
- Add method to download llama.cpp via pre-built release
- Split snippet into 3 sections: `title`, `setup` and `command`
- Use `--conversation` mode to start llama.cpp in chat mode (chat
template is now supported, ref:
ggerganov/llama.cpp#8068)

---

Proposal for the UI:

(Note: Maybe the 3 sections title - setup - command can be more
separated visually)


![image](https://github.com/huggingface/huggingface.js/assets/7702203/2bd302f0-88b1-4057-9cd3-3cf4536aae50)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that break ABIs, APIs, file formats, or other forms of backwards compatibility. devops improvements to build systems and github actions documentation Improvements or additions to documentation examples help wanted Extra attention is needed merge ready indicates that this may be ready to merge soon and is just holding out in case of objections nix Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment python python script changes Review Complexity : Medium Generally require more time to grok but manageable by beginner to medium expertise level script Script related server SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language
Projects
None yet
Development

Successfully merging this pull request may close these issues.