v0.12.0: New methods OLoRA, X-LoRA, FourierFT, HRA, and much more
Highlights
New methods
OLoRA
@tokenizer-decode added support for a new LoRA initialization strategy called OLoRA (#1828). With this initialization option, the LoRA weights are initialized to be orthonormal, which promises to improve training convergence. Similar to PiSSA, this can also be applied to models quantized with bitsandbytes. Check out the accompanying OLoRA examples.
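A minimal sketch of opting into the new initialization; only the `init_lora_weights="olora"` option comes from this release, while the base model and target module below are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# Request the OLoRA (orthonormal) initialization for the LoRA weights.
config = LoraConfig(init_lora_weights="olora", target_modules=["c_attn"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```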
X-LoRA
@EricLBuehler added the X-LoRA method to PEFT (#1491). This is a mixture-of-experts approach that combines the strengths of multiple pre-trained LoRA adapters. Documentation has yet to be added, but check out the X-LoRA tests for how to use it.
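A rough sketch of how an X-LoRA model might be set up, based on the X-LoRA tests. Since documentation was still pending at release time, treat the exact arguments (`hidden_size`, the `adapters` mapping) as assumptions, and the model name and adapter paths as placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import XLoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# X-LoRA learns to mix several pre-trained LoRA adapters; the paths below are placeholders.
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=base_model.config.hidden_size,
    adapters={"math": "path/to/math-lora", "code": "path/to/code-lora"},
)
model = get_peft_model(base_model, config)
```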
FourierFT
@Phoveran, @zqgao22, @Chaos96, and @DSAILatHKUST added discrete Fourier transform fine-tuning to PEFT (#1838). This method promises to match LoRA in terms of performance while reducing the number of parameters even further. Check out the included FourierFT notebook.
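A minimal sketch of applying FourierFT, assuming a FourierFTConfig with an n_frequency setting that controls how many spectral coefficients are trained per layer; the model, module names, and value shown are placeholders:

```python
from transformers import AutoModelForSequenceClassification
from peft import FourierFTConfig, get_peft_model

base_model = AutoModelForSequenceClassification.from_pretrained("roberta-base")  # placeholder model

# n_frequency sets the number of trainable spectral entries per adapted layer (placeholder value).
config = FourierFTConfig(n_frequency=1000, target_modules=["query", "value"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```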
HRA
@DaShenZi721 added support for Householder Reflection Adaptation (#1864). This method bridges the gap between low-rank adapters like LoRA on the one hand and orthogonal fine-tuning techniques such as OFT and BOFT on the other. As such, it is interesting for both LLMs and image generation models. Check out the HRA example on how to perform DreamBooth fine-tuning.
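A minimal sketch of applying HRA, assuming an HRAConfig whose r sets the number of Householder reflections per adapted layer; the model and module name are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import HRAConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# r is the number of Householder reflections used per adapted layer (placeholder value).
config = HRAConfig(r=8, target_modules=["c_attn"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```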
Enhancements
- IA³ now supports merging of multiple adapters via the `add_weighted_adapter` method thanks to @alexrs (#1701).
- Call `peft_model.get_layer_status()` and `peft_model.get_model_status()` to get an overview of the layer/model status of the PEFT model (see the sketch after this list). This can be especially helpful when dealing with multiple adapters or for debugging purposes. More information can be found in the docs (#1743).
- DoRA now supports FSDP training, including with bitsandbytes quantization, aka QDoRA (#1806).
- VeRA has been extended by @dkopi to support targeting layers with different weight shapes (#1817).
- @kallewoof added the possibility for ephemeral GPU offloading. For now, this is only implemented for loading DoRA models, which can be sped up considerably for big models at the cost of a bit of extra VRAM (#1857).
- Experimental: It is now possible to tell PEFT to use your custom LoRA layers through dynamic dispatching. Use this, for instance, to add LoRA layers for layer types that are not yet supported, without first having to open a PR on PEFT (but contributions are still welcome!) (#1875).
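A minimal sketch of the layer/model status helpers mentioned above; the base model and target module are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
model = get_peft_model(base_model, LoraConfig(target_modules=["c_attn"]))

# Per-layer view: which adapters are present on each layer, whether they are active or merged,
# and which devices their parameters live on.
for layer_status in model.get_layer_status():
    print(layer_status)

# Aggregated view over the whole model.
print(model.get_model_status())
```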
Examples
- @shirinyamani added a script and a notebook to demonstrate DoRA fine-tuning.
- @rahulbshrestha contributed a notebook that shows how to fine-tune a DNA language model with LoRA.
Changes
Casting of the adapter dtype
Important: If the base model is loaded in float16 (fp16) or bfloat16 (bf16), PEFT now autocasts adapter weights to float32 (fp32) instead of using the dtype of the base model (#1706). This requires more memory than before but stabilizes training, so it is the more sensible default. To restore the previous behavior, pass `autocast_adapter_dtype=False` when calling `get_peft_model`, `PeftModel.from_pretrained`, or `PeftModel.load_adapter`.
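A minimal sketch of opting out of the new autocasting behavior; the model name and target module are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model loaded in fp16; by default PEFT would now upcast the adapter weights to fp32.
base_model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)  # placeholder model

# Pass autocast_adapter_dtype=False to keep the adapter weights in the base model's dtype.
config = LoraConfig(target_modules=["c_attn"])
model = get_peft_model(base_model, config, autocast_adapter_dtype=False)
```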
Adapter device placement
The logic of device placement when loading multiple adapters on the same model has been changed (#1742). Previously, PEFT would move all adapters to the device of the base model. Now, only the newly loaded/created adapter is moved to the base model's device. This allows users to have more fine-grained control over the adapter devices, e.g. allowing them to offload unused adapters to CPU more easily.
PiSSA
- Calling `save_pretrained` with the `convert_pissa_to_lora` argument is deprecated; the argument was renamed to `path_initial_model_for_weight_conversion` (#1828). Also, calling this no longer deletes the original adapter (#1933). See the sketch after this list.
- Using weight conversion (`path_initial_model_for_weight_conversion`) while also using `use_rslora=True` and `rank_pattern` or `alpha_pattern` now raises an error (#1930). Previously this did not raise an error, but inference would return incorrect outputs. We also warn about this setting during initialization.
Call for contributions
We are now making sure to tag appropriate issues with the `contributions welcome` label. If you are looking for a way to contribute to PEFT, check out these issues.
What's Changed
- Bump version to 0.11.1.dev0 by @BenjaminBossan in #1736
- save and load base model with revision by @mnoukhov in #1658
- Autocast adapter weights if fp16/bf16 by @BenjaminBossan in #1706
- FIX BOFT setting env vars breaks C++ compilation by @BenjaminBossan in #1739
- Bump version to 0.11.2.dev0 by @BenjaminBossan in #1741
- TST: torch compile tests by @BenjaminBossan in #1725
- Add add_weighted_adapter to IA3 adapters by @alexrs in #1701
- ENH Layer/model status shows devices now by @BenjaminBossan in #1743
- Fix warning messages about `config.json` when the base `model_id` is local by @elementary-particle in #1668
- DOC TST Document and test reproducibility with models using batch norm by @BenjaminBossan in #1734
- FIX Use correct attribute name for HQQ in merge by @BenjaminBossan in #1791
- fix docs by @pacman100 in #1793
- FIX Allow same layer adapters on different devices by @BenjaminBossan in #1742
- TST Install bitsandbytes for compile tests by @BenjaminBossan in #1796
- FIX BOFT device error after PR 1742 by @BenjaminBossan in #1799
- TST Add regression test for DoRA, VeRA, BOFT, LN Tuning by @BenjaminBossan in #1792
- Docs / LoRA: Add more information on `merge_and_unload` docs by @younesbelkada in #1805
- TST: Add simple BNB regression tests by @BenjaminBossan in #1602
- CI Make torch compile tests run on GPU by @BenjaminBossan in #1808
- MNT Remove deprecated use of load_in_8bit by @BenjaminBossan in #1811
- Refactor to make DoRA and QDoRA work with FSDP by @BenjaminBossan in #1806
- FIX CI: Remove potentially problematic git command by @BenjaminBossan in #1820
- ENH / Workflow: Notify on slack about peft + transformers main test results by @younesbelkada in #1821
- FIX CI: Install pytest-reportlog package by @BenjaminBossan in #1822
- ENH / Workflow: Use repository variable by @younesbelkada in #1823
- Patch for Cambricon MLUs test by @huismiling in #1747
- Fix a documentation typo by @sparsh2 in #1833
- FIX Failing Llama tests due to new kv cache by @BenjaminBossan in #1832
- Workflow / Bnb: Add a mechanism to inform us if the import fails by @younesbelkada in #1830
- Workflow: Fix broken messages by @younesbelkada in #1842
- feat(ci): add trufflehog secrets detection by @McPatate in #1841
- DOC Describe torch_device argument in from_pretrained docstring by @BenjaminBossan in #1843
- Support for different layer shapes for VeRA by @dkopi in #1817
- CI Activate env to prevent bnb import error by @BenjaminBossan in #1845
- Fixed PeftMixedModel docstring example #1824 by @namanvats in #1850
- MNT Upgrade ruff version to ~0.4.8 by @BenjaminBossan in #1851
- Adding support for an optional initialization strategy OLoRA by @tokenizer-decode in #1828
- FIX: Adalora ranknum loaded on wrong device by @BenjaminBossan in #1852
- Workflow / FIX: Fix red status on our CI by @younesbelkada in #1854
- DOC FIX Comment about init of LoRA Embedding by @BenjaminBossan in #1855
- DOC Move helpers section to dev developer guide by @BenjaminBossan in #1856
- CI Testing: Remove import check by @BenjaminBossan in #1859
- Update lora_based_methods.md by @jtatman in #1861
- FIX multitask prompt tuning paper link by @cep-ter in #1862
- Workflow: Attempt to fix the current failures by @younesbelkada in #1868
- CI testing BNB: remove single GPU tests by @BenjaminBossan in #1866
- CI Downgrade numpy to <2.0 for Mac and Windows by @BenjaminBossan in #1871
- FIX Error when using VeRA with float16 or bfloat16 by @BenjaminBossan in #1874
- Workflow: Update bug report template by @younesbelkada in #1882
- ENH: LoRA support for dynamically dispatching to custom layers by @BenjaminBossan in #1875
- FIX Init AdaLoRA to be identity transform by @BenjaminBossan in #1884
- FIX Make special LoRA inits DeepSpeed compatible by @BenjaminBossan in #1887
- bypass print_trainable_parameter() if model is not peft model by @delock in #1888
- Fix early import of torch extension in BOFT by @PhyscalX in #1879
- Dora Fine-tuning added to examples by @shirinyamani in #1885
- CI: Don't fail fast in test matrix by @BenjaminBossan in #1896
- FIX TEST: Higher tolerance for AdaLoRA in test by @BenjaminBossan in #1897
- test: bump absolute tolerance level in test by @kallewoof in #1891
- ephemeral GPU offload support by @kallewoof in #1857
- FIX TEST Even higher tolerance for AdaLoRA in test by @BenjaminBossan in #1898
- FIX Recursion while accessing attribute before initialization by @ret-1 in #1892
- chore: markdown formatting by @stillmatic in #1899
- Tutorial Notebook: Using the PEFT library with a DNA Language Model. by @rahulbshrestha in #1873
- Integrate X-LoRA by @EricLBuehler in #1491
- FIX: Flaky multitask prompt tuning test fixed by setting the seed by @BenjaminBossan in #1908
- FourierFT Support by @Phoveran in #1838
- Fix `encoder_reparameterization_type` parameter by @sujeek in #1926
- Fix attribute check for print_trainable_parameters method by @anch0vy in #1928
- Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer. by @zhangsheng377 in #1919
- support HRA by @DaShenZi721 in #1864
- FIX PiSSA & OLoRA with rank/alpha pattern, rslora by @BenjaminBossan in #1930
- support Grouped-Query Attention by @ttw1018 in #1901
- FIX: More VeRA tests, fix tests, more checks by @BenjaminBossan in #1900
- [WIP] ENH Add support for Qwen2 by @BenjaminBossan in #1906
- Decrease memory usage of `merge_and_unload` by @snarayan21 in #1944
- PiSSA, OLoRA: Delete initial adapter after conversion instead of the active adapter by @BenjaminBossan in #1933
- Release v0.12.0 by @BenjaminBossan in #1946
New Contributors
- @mnoukhov made their first contribution in #1658
- @elementary-particle made their first contribution in #1668
- @sparsh2 made their first contribution in #1833
- @McPatate made their first contribution in #1841
- @dkopi made their first contribution in #1817
- @namanvats made their first contribution in #1850
- @tokenizer-decode made their first contribution in #1828
- @jtatman made their first contribution in #1861
- @cep-ter made their first contribution in #1862
- @delock made their first contribution in #1888
- @PhyscalX made their first contribution in #1879
- @shirinyamani made their first contribution in #1885
- @kallewoof made their first contribution in #1891
- @ret-1 made their first contribution in #1892
- @stillmatic made their first contribution in #1899
- @rahulbshrestha made their first contribution in #1873
- @Phoveran made their first contribution in #1838
- @sujeek made their first contribution in #1926
- @anch0vy made their first contribution in #1928
- @DaShenZi721 made their first contribution in #1864
- @ttw1018 made their first contribution in #1901
- @snarayan21 made their first contribution in #1944
Full Changelog: v0.11.1...v0.12.0