I'm pretty sure this is because the webui repo uses the Llama tokenizer, while Mistral uses a different tokenizer. If you load the Mistral tokenizer (e.g. via AutoTokenizer) you should get reasonable output. For example, running interactive_gen.py (our "chat" script) with 4-bit Mistral:
```
Please enter your prompt or 'quit' (without quotes) to quit: Call me Ishmael
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Model Output: Call me Ishmael.
I’m an avid reader of Moby Dick, a book that I read every year or so. It’s one of my favorite books, and the reason for that is simple: Ishmael is my alter ego.
In fact, I
```
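A minimal sketch of the approach described above: loading the model's own tokenizer with `AutoTokenizer` (so Mistral does not get tokenized with the Llama vocabulary) and quantizing to 4 bit. The model id, prompt, and generation settings are illustrative assumptions, not taken from interactive_gen.py.

```python
# Sketch only: assumes transformers + bitsandbytes are installed and a GPU is
# available. The model id below is an assumption for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig


def generate(model_id: str = "mistralai/Mistral-7B-v0.1",
             prompt: str = "Call me Ishmael") -> str:
    # AutoTokenizer resolves the tokenizer class from the model's own config,
    # instead of assuming the Llama tokenizer.
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    # Passing pad_token_id explicitly silences the warning seen in the
    # transcript above.
    out = model.generate(**inputs, max_new_tokens=64,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate())
```

The same mismatch explains why Llama-2 checkpoints work in the webui while Mistral produces garbage: the webui's hard-coded Llama tokenizer happens to be the right one for Llama-2.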
Testing on the oobabooga webui, as implemented here.
Llama-2 models (13B, 2-bit/4-bit) work as expected.
Tested models:
Typical output: