Support starcoder family architectures (1B/3B/7B/13B) #3076

Closed
Tracked by #362
wsxiaoys opened this issue Sep 8, 2023 · 6 comments
Labels
model Model specific

Comments

@wsxiaoys
Contributor

wsxiaoys commented Sep 8, 2023

Related Issues:

#1901
#1441
#1326

Previously, it wasn't recommended to incorporate non-llama architectures into llama.cpp. However, in light of the recent addition of the Falcon architecture (see Pull Request #2717), it might be worth reconsidering this stance.

One distinguishing feature of Starcoder is that it comes as a complete series of models ranging from 1B to 13B. This is highly beneficial for speculative decoding, where a small model drafts tokens that a larger model then verifies, and for making coding models available on edge devices (e.g., M1/M2 Macs).
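As a rough illustration of why a family of differently sized models helps speculative decoding, here is a minimal greedy sketch. Everything in it is an assumption made for illustration: `speculative_decode`, `target_model`, and `draft_model` are hypothetical names, and the "models" are toy functions over integer tokens, not Starcoder or any llama.cpp API.

```python
from typing import Callable, List

# Hypothetical stand-in for a model: maps a token sequence to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding; the result equals target-only greedy decoding."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The small draft model cheaply proposes up to k tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft(tokens + proposal))
        # 2. The large target model checks each proposed position.
        #    (A real implementation would score all positions in one batched pass.)
        for i, tok in enumerate(proposal):
            expected = target(tokens + proposal[:i])
            if expected != tok:
                # First mismatch: keep the target's token and drop the rest.
                proposal = proposal[:i] + [expected]
                break
        tokens.extend(proposal)
    return tokens


# Toy models (assumptions): the draft agrees with the target except after token 3.
def target_model(seq: List[int]) -> int:
    return (seq[-1] * 2 + 1) % 7


def draft_model(seq: List[int]) -> int:
    return 5 if seq[-1] == 3 else (seq[-1] * 2 + 1) % 7


print(speculative_decode(target_model, draft_model, [1], n_new=10))
```

The draft only needs to agree with the target most of the time, which is why a 1B model from the same family as a larger one is an attractive pairing.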

I can contribute the PR if it matches llama.cpp's roadmap.

@KerfuffleV2 KerfuffleV2 added the model Model specific label Sep 8, 2023
@Azeirah
Contributor

Azeirah commented Sep 8, 2023

I was also looking for a small coding model. Someone on Reddit recommended StableCode 3B, which is based on the GPT-NeoX architecture. I only just noticed that the model card says it's not supported by llama.cpp, but I did see that there's a convert script in this repo for gpt-neox, so it might still be possible.

A 1B model would of course be amazing too!

@ggerganov
Owner

Yes, we can add more architectures - the main requirements are:

  • concise implementations
  • if the tokenizer is too complicated, just provide basic token-text mapping
  • don't break LLaMA

The ggml repo already provides a sample Starcoder implementation:

https://github.com/ggerganov/ggml/tree/master/examples

So it is a good starting point for bringing it here.
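To picture the "basic token-text mapping" requirement from the list above, here is a minimal sketch. `BasicVocab` and its sample entries are hypothetical; this is not llama.cpp's or ggml's tokenizer code, just the bare id-to-text lookup with no merge rules.

```python
from typing import Dict, List


class BasicVocab:
    """Hypothetical minimal vocabulary: only id <-> text lookup, no merge rules."""

    def __init__(self, id_to_text: List[str]) -> None:
        self.id_to_text = id_to_text
        # Reverse table so a known piece of text can be mapped back to its id.
        self.text_to_id: Dict[str, int] = {t: i for i, t in enumerate(id_to_text)}

    def detokenize(self, ids: List[int]) -> str:
        # Turning generated ids back into text needs only the forward table.
        return "".join(self.id_to_text[i] for i in ids)


# Made-up five-entry vocabulary for illustration.
vocab = BasicVocab(["def", " main", "(", ")", ":"])
print(vocab.detokenize([0, 1, 2, 3, 4]))  # def main():
```

A mapping like this is enough to print model output, while full tokenization of arbitrary input text would still need the model's real tokenizer rules.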

@wsxiaoys
Contributor Author

wsxiaoys commented Sep 8, 2023

Great - will start working on it

@wsxiaoys
Contributor Author

Done in #3187.

@LaniakeaS

LaniakeaS commented Dec 29, 2023

Running python convert.py models/starcoder/ produced the following output. Does this mean llama.cpp doesn't support Starcoder 15B?

Traceback (most recent call last):
  File "/home/guest/**/llama.cpp/convert.py", line 1295, in <module>
    main()
  File "/home/guest/**/llama.cpp/convert.py", line 1223, in main
    model_plus = load_some_model(args.model)
  File "/home/guest/**/llama.cpp/convert.py", line 1144, in load_some_model
    model_plus = merge_multifile_models(models_plus)
  File "/home/guest/**/llama.cpp/convert.py", line 637, in merge_multifile_models
    model = merge_sharded([mp.model for mp in models_plus])
  File "/home/guest/**/llama.cpp/convert.py", line 616, in merge_sharded
    return {name: convert(name) for name in names}
  File "/home/guest/**/llama.cpp/convert.py", line 616, in <dictcomp>
    return {name: convert(name) for name in names}
  File "/home/guest/**/llama.cpp/convert.py", line 591, in convert
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
  File "/home/guest/**/llama.cpp/convert.py", line 591, in <listcomp>
    lazy_tensors: list[LazyTensor] = [model[name] for model in models]
KeyError: 'transformer.wte.weight'
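
Judging only by the last frames of the traceback, each loaded model file is indexed by every tensor name, so a name that is absent from any one file raises KeyError. The sketch below reproduces that failure mode with plain dicts and reports which files lack which names. It is an assumption-laden toy: `missing_names`, the shard contents, and `example.other.weight` are hypothetical, and none of this is convert.py's actual code.

```python
from typing import Dict, List, Set


def missing_names(shards: List[Dict[str, object]]) -> Dict[str, List[int]]:
    """Map each tensor name to the indices of the shards that lack it."""
    all_names: Set[str] = set().union(*(s.keys() for s in shards))
    report: Dict[str, List[int]] = {}
    for name in sorted(all_names):
        absent = [i for i, s in enumerate(shards) if name not in s]
        if absent:
            report[name] = absent
    return report


# Made-up shards: the second one has no 'transformer.wte.weight' entry.
shards = [
    {"transformer.wte.weight": "w0", "example.other.weight": "a0"},
    {"example.other.weight": "a1"},
]

try:
    # Same shape as the failing line in the traceback.
    [model["transformer.wte.weight"] for model in shards]
except KeyError as err:
    print("KeyError:", err)

print(missing_names(shards))
```

A report like this shows whether the tensor is missing from every file or only from some of them, which helps separate a layout mismatch between files from a genuinely unsupported model.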

@LaniakeaS

This is probably not specific to the model, since the same problem is reported in #4530.
