
Releases: unslothai/unsloth

Phi 3.5

21 Aug 01:08

Phi 3.5 is here!

Try it out here: https://colab.research.google.com/drive/1lN6hPQveB_mHSnTOYifygFcrO8C1bxq4?usp=sharing

What's Changed

New Contributors

Full Changelog: July-Mistral-2024...August-2024

Llama 3.1 Support

23 Jul 20:42

Llama 3.1 Support

Excited to announce that Unsloth makes finetuning Llama 3.1 2.1x faster with 60% less VRAM! Read more in our release post: https://unsloth.ai/blog/llama3-1

We uploaded a Google Colab notebook to finetune Llama 3.1 (8B) on a free Tesla T4: Llama 3.1 (8B) Notebook. We also have a new UI on Google Colab for chatting with your Llama 3.1 Instruct models which uses our own 2x faster inference engine.

Run UI Preview

We created a new chat UI using Gradio where users can upload and chat with their Llama 3.1 Instruct models online for free on Google Colab.
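If you'd rather chat with your model directly in a notebook instead of the Gradio UI, here is a minimal sketch of the faster inference path (the model name and prompt below are placeholders, not part of this release):

from unsloth import FastLanguageModel

# Placeholder 4bit upload; swap in your own finetuned checkpoint
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model) # enable Unsloth's faster inference mode

messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt = True, return_tensors = "pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens = 64)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))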

We uploaded 4bit bitsandbytes quants here: https://huggingface.co/unsloth
To finetune Llama 3.1, please update Unsloth:

pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

July-Mistral-2024

19 Jul 15:37

Mistral NeMo, Ollama & CSV support

See https://unsloth.ai/blog/mistral-nemo for more details. 4-bit pre-quantized weights are at https://huggingface.co/unsloth.

Our Colab finetuning notebook (2x faster, 60% less VRAM) is here, and our Kaggle notebook is here.


Export to Ollama & CSV Support

To use it, create and customize your chat template with a dataset, and Unsloth will automatically export the finetune to Ollama, including automatic Modelfile creation. We also created a 'Step-by-Step Tutorial on How to Finetune Llama-3 and Deploy to Ollama'. Check out our Ollama Llama-3 Alpaca and CSV/Excel Ollama Guide notebooks.

Unlike regular chat templates that use 3 columns, Ollama simplifies the process with just 2 columns: instruction and output. And with Ollama, you can save, run, and deploy your finetuned models locally on your own device.
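As a rough sketch of the export step itself (the save directory, model name, and quantization method below are illustrative choices, not fixed by this release), the finetune can be saved to GGUF and then registered with Ollama:

# Save the finetuned model to GGUF; Unsloth also writes the Ollama Modelfile
model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")

# Then, on your own machine (the model name is up to you):
#   ollama create unsloth_finetune -f ./model/Modelfile
#   ollama run unsloth_finetune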

Train on Completions, Not Inputs

We now support training only on the output tokens and not the inputs, which can increase accuracy. Try it with:

from trl import SFTTrainer
from transformers import TrainingArguments, DataCollatorForSeq2Seq

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    # DataCollatorForSeq2Seq pads the labels alongside the inputs, which the masking needs
    data_collator = DataCollatorForSeq2Seq(tokenizer = tokenizer),
    ...
    args = TrainingArguments(
        ...
    ),
)

# Mask the instruction tokens so the loss is computed only on the responses
from unsloth.chat_templates import train_on_responses_only
trainer = train_on_responses_only(trainer)

RoPE Scaling for all models

We now allow you to finetune Gemma 2, Mistral, Mistral NeMo, Qwen2 and more models with "unlimited" context lengths via RoPE linear scaling in Unsloth. Coupled with our 4x longer context support, Unsloth can handle extremely long contexts!
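As a minimal sketch (the checkpoint and sequence length are just examples), requesting a max_seq_length longer than the model's native context is all that is needed; Unsloth applies the RoPE scaling for you:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-9b-bnb-4bit",
    max_seq_length = 16384, # beyond Gemma 2's native 8192 context, so RoPE linear scaling is applied
    load_in_4bit = True,
)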

New Docs!

Introducing our new documentation site, which brings all the most important information about Unsloth together in one place. If you'd like to contribute, please contact us! Docs: https://docs.unsloth.ai/

Update instructions

Please update Unsloth on local machines via the commands below (on Colab and Kaggle, just refresh and reload the notebook):

pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

2x faster Gemma 2

03 Jul 22:02

Gemma 2 support

We now support Gemma 2! It's 2x faster and uses 63% less VRAM than HF+FA2!
We have a Gemma 2 (9B) notebook here: https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing

To use Gemma 2, please update Unsloth:

pip uninstall unsloth -y
pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git

Head over to our blog post: https://unsloth.ai/blog/gemma2 for more details.

We uploaded 4bit quants for 4x faster downloading to:

https://huggingface.co/unsloth/gemma-2-9b-bnb-4bit

https://huggingface.co/unsloth/gemma-2-27b-bnb-4bit

https://huggingface.co/unsloth/gemma-2-9b-it-bnb-4bit

https://huggingface.co/unsloth/gemma-2-27b-it-bnb-4bit
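As a small sketch (the sequence length is an arbitrary example), these pre-quantized checkpoints load directly with load_in_4bit, so only the 4bit weights need to be downloaded:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-9b-it-bnb-4bit", # pre-quantized upload from the list above
    max_seq_length = 2048,
    load_in_4bit = True,
)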

Continued pretraining

You can now do continued pretraining with Unsloth. See https://unsloth.ai/blog/contpretraining for more details!

Continued pretraining is 2x faster and uses 50% less VRAM than HF + FA2 QLoRA. We offload embed_tokens and lm_head to disk to save VRAM!

You can now simply include both in target_modules, as shown below:

from unsloth import FastLanguageModel

# `model` is the base model returned by FastLanguageModel.from_pretrained(...)
model = FastLanguageModel.get_peft_model(
    model,
    r = 128, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head",], # Add embed_tokens & lm_head for continued pretraining
    lora_alpha = 32,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = True,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

We also allow two learning rates: one for the embedding matrices and another for the LoRA adapters:

from unsloth import is_bfloat16_supported # typically used for the fp16/bf16 flags in the elided args
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    # Pass model, tokenizer, train_dataset, etc. as you would for SFTTrainer
    args = UnslothTrainingArguments(
        ....
        learning_rate = 5e-5,            # learning rate for the LoRA adapters
        embedding_learning_rate = 5e-6,  # smaller learning rate for embed_tokens / lm_head
    ),
)

We also share a free Colab to finetune Mistral v3 to learn Korean (you can select any language you like) using Wikipedia and the Aya Dataset: https://colab.research.google.com/drive/1tEd1FrOXWMnCU9UIvdYhs61tkxdMuKZu?usp=sharing

And we're sharing our free Colab notebook for continued pretraining for text completion: https://colab.research.google.com/drive/1ef-tab5bhkvWmBOObepl1WgJvfvSzn5Q?usp=sharing

What's Changed

New Contributors

Full Changelog: https://github.com/unslothai/unsloth/commits/June-2024