v0.12.0: New methods OLoRA, X-LoRA, FourierFT, HRA, and much more
Highlights
New methods
OLoRA
@tokenizer-decode added support for a new LoRA initialization strategy called OLoRA (#1828). With this initialization option, the LoRA weights are initialized to be orthonormal, which promises to improve training convergence. Similar to PiSSA, this can also be applied to models quantized with bitsandbytes. Check out the accompanying OLoRA examples.
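A minimal sketch of opting into the new initialization; only the `init_lora_weights="olora"` option comes from this release, while the base model and target module below are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# Request the OLoRA (orthonormal) initialization for the LoRA weights.
config = LoraConfig(init_lora_weights="olora", target_modules=["c_attn"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```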
X-LoRA
@EricLBuehler added the X-LoRA method to PEFT (#1491). This is a mixture-of-experts approach that combines the strengths of multiple pre-trained LoRA adapters. Documentation has yet to be added, but check out the X-LoRA tests for how to use it.
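A rough sketch of how an X-LoRA model might be set up, based on the X-LoRA tests. Since documentation was still pending at release time, treat the exact arguments (`hidden_size`, the `adapters` mapping) as assumptions, and the model name and adapter paths as placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import XLoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# X-LoRA learns to mix several pre-trained LoRA adapters; the paths below are placeholders.
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=base_model.config.hidden_size,
    adapters={"math": "path/to/math-lora", "code": "path/to/code-lora"},
)
model = get_peft_model(base_model, config)
```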
FourierFT
@Phoveran, @zqgao22, @Chaos96, and @DSAILatHKUST added discrete Fourier transform fine-tuning to PEFT (#1838). This method promises to match LoRA in terms of performance while reducing the number of parameters even further. Check out the included FourierFT notebook.
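A minimal sketch of applying FourierFT, assuming a FourierFTConfig with an n_frequency setting that controls how many spectral coefficients are trained per layer; the model, module names, and value shown are placeholders:

```python
from transformers import AutoModelForSequenceClassification
from peft import FourierFTConfig, get_peft_model

base_model = AutoModelForSequenceClassification.from_pretrained("roberta-base")  # placeholder model

# n_frequency sets the number of trainable spectral entries per adapted layer (placeholder value).
config = FourierFTConfig(n_frequency=1000, target_modules=["query", "value"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```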
HRA
@DaShenZi721 added support for Householder Reflection Adaptation (#1864). This method bridges the gap between low-rank adapters like LoRA on the one hand and orthogonal fine-tuning techniques such as OFT and BOFT on the other. As such, it is interesting for both LLMs and image generation models. Check out the HRA example on how to perform DreamBooth fine-tuning.
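A minimal sketch of applying HRA, assuming an HRAConfig whose r sets the number of Householder reflections per adapted layer; the model and module name are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import HRAConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

# r is the number of Householder reflections used per adapted layer (placeholder value).
config = HRAConfig(r=8, target_modules=["c_attn"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```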
Enhancements
- IA³ now supports merging of multiple adapters via the `add_weighted_adapter` method thanks to @alexrs (#1701).
- Call `peft_model.get_layer_status()` and `peft_model.get_model_status()` to get an overview of the layer/model status of the PEFT model (see the sketch after this list). This can be especially helpful when dealing with multiple adapters or for debugging purposes. More information can be found in the docs (#1743).
- DoRA now supports FSDP training, including with bitsandbytes quantization, aka QDoRA (#1806).
- VeRA has been extended by @dkopi to support targeting layers with different weight shapes (#1817).
- @kallewoof added the possibility for ephemeral GPU offloading. For now, this is only implemented for loading DoRA models, which can be sped up considerably for big models at the cost of a bit of extra VRAM (#1857).
- Experimental: It is now possible to tell PEFT to use your custom LoRA layers through dynamic dispatching. Use this, for instance, to add LoRA layers for layer types that are not yet supported, without first having to open a PR on PEFT (but contributions are still welcome!) (#1875).
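A minimal sketch of the layer/model status helpers mentioned above; the base model and target module are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model
model = get_peft_model(base_model, LoraConfig(target_modules=["c_attn"]))

# Per-layer view: which adapters are present on each layer, whether they are active or merged,
# and which devices their parameters live on.
for layer_status in model.get_layer_status():
    print(layer_status)

# Aggregated view over the whole model.
print(model.get_model_status())
```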
Examples
- @shirinyamani added a script and a notebook to demonstrate DoRA fine-tuning.
- @rahulbshrestha contributed a notebook that shows how to fine-tune a DNA language model with LoRA.
Changes
Casting of the adapter dtype
Important: If the base model is loaded in float16 (fp16) or bfloat16 (bf16), PEFT now autocasts adapter weights to float32 (fp32) instead of using the dtype of the base model (#1706). This requires more memory than before but stabilizes training, so it is the more sensible default. To restore the previous behavior, pass `autocast_adapter_dtype=False` when calling `get_peft_model`, `PeftModel.from_pretrained`, or `PeftModel.load_adapter`.
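A minimal sketch of opting out of the new autocasting behavior; the model name and target module are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model loaded in fp16; by default PEFT would now upcast the adapter weights to fp32.
base_model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)  # placeholder model

# Pass autocast_adapter_dtype=False to keep the adapter weights in the base model's dtype.
config = LoraConfig(target_modules=["c_attn"])
model = get_peft_model(base_model, config, autocast_adapter_dtype=False)
```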
Adapter device placement
The logic of device placement when loading multiple adapters on the same model has been changed (#1742). Previously, PEFT would move all adapters to the device of the base model. Now, only the newly loaded/created adapter is moved to the base model's device. This allows users to have more fine-grained control over the adapter devices, e.g. allowing them to offload unused adapters to CPU more easily.
PiSSA
- Calling `save_pretrained` with the `convert_pissa_to_lora` argument is deprecated; the argument was renamed to `path_initial_model_for_weight_conversion` (#1828). Also, calling this no longer deletes the original adapter (#1933). See the sketch after this list.
- Using weight conversion (`path_initial_model_for_weight_conversion`) while also using `use_rslora=True` and `rank_pattern` or `alpha_pattern` now raises an error (#1930). Previously this did not raise an error, but inference would return incorrect outputs. We also warn about this setting during initialization.
Call for contributions
We are now making sure to tag appropriate issues with the `contributions welcome` label. If you are looking for a way to contribute to PEFT, check out these issues.
What's Changed
- Bump version to 0.11.1.dev0 by @BenjaminBossan in #1736
- save and load base model with revision by @mnoukhov in #1658
- Autocast adapter weights if fp16/bf16 by @BenjaminBossan in #1706
- FIX BOFT setting env vars breaks C++ compilation by @BenjaminBossan in #1739
- Bump version to 0.11.2.dev0 by @BenjaminBossan in #1741
- TST: torch compile tests by @BenjaminBossan in #1725
- Add add_weighted_adapter to IA3 adapters by @alexrs in #1701
- ENH Layer/model status shows devices now by @BenjaminBossan in #1743
- Fix warning messages about `config.json` when the base `model_id` is local by @elementary-particle in #1668
- DOC TST Document and test reproducibility with models using batch norm by @BenjaminBossan in #1734
- FIX Use correct attribute name for HQQ in merge by @BenjaminBossan in #1791
- fix docs by @pacman100 in #1793
- FIX Allow same layer adapters on different devices by @BenjaminBossan in #1742
- TST Install bitsandbytes for compile tests by @BenjaminBossan in #1796
- FIX BOFT device error after PR 1742 by @BenjaminBossan in #1799
- TST Add regression test for DoRA, VeRA, BOFT, LN Tuning by @BenjaminBossan in #1792
- Docs / LoRA: Add more information on `merge_and_unload` docs by @younesbelkada in #1805
- TST: Add simple BNB regression tests by @BenjaminBossan in #1602
- CI Make torch compile tests run on GPU by @BenjaminBossan in #1808
- MNT Remove deprecated use of load_in_8bit by @BenjaminBossan in #1811
- Refactor to make DoRA and QDoRA work with FSDP by @BenjaminBossan in #1806
- FIX CI: Remove potentially problematic git command by @BenjaminBossan in #1820
- ENH / Workflow: Notify on slack about peft + transformers main test results by @younesbelkada in #1821
- FIX CI: Install pytest-reportlog package by @BenjaminBossan in #1822
- ENH / Workflow: Use repository variable by @younesbelkada in #1823
- Patch for Cambricon MLUs test by @huismiling in #1747
- Fix a documentation typo by @sparsh2 in #1833
- FIX Failing Llama tests due to new kv cache by @BenjaminBossan in #1832
- Workflow / Bnb: Add a mechanism to inform us if the import fails by @younesbelkada in #1830
- Workflow: Fix broken messages by @younesbelkada in #1842
- feat(ci): add trufflehog secrets detection by @McPatate in #1841
- DOC Describe torch_device argument in from_pretrained docstring by @BenjaminBossan in #1843
- Support for different layer shapes for VeRA by @dkopi in #1817
- CI Activate env to prevent bnb import error by @BenjaminBossan in #1845
- Fixed PeftMixedModel docstring example #1824 by @namanvats in #1850
- MNT Upgrade ruff version to ~0.4.8 by @BenjaminBossan in #1851
- Adding support for an optional initialization strategy OLoRA by @tokenizer-decode in #1828
- FIX: Adalora ranknum loaded on wrong device by @BenjaminBossan in #1852
- Workflow / FIX: Fix red status on our CI by @younesbelkada in #1854
- DOC FIX Comment about init of LoRA Embedding by @BenjaminBossan in #1855
- DOC Move helpers section to dev developer guide by @BenjaminBossan in #1856
- CI Testing: Remove import check by @BenjaminBossan in #1859
- Update lora_based_methods.md by @jtatman in #1861
- FIX multitask prompt tuning paper link by @cep-ter in #1862
- Workflow: Attempt to fix the current failures by @younesbelkada in #1868
- CI testing BNB: remove single GPU tests by @BenjaminBossan in #1866
- CI Downgrade numpy to <2.0 for Mac and Windows by @BenjaminBossan in #1871
- FIX Error when using VeRA with float16 or bfloat16 by @BenjaminBossan in #1874
- Workflow: Update bug report template by @younesbelkada in #1882
- ENH: LoRA support for dynamically dispatching to custom layers by @BenjaminBossan in #1875
- FIX Init AdaLoRA to be identity transform by @BenjaminBossan in #1884
- FIX Make special LoRA inits DeepSpeed compatible by @BenjaminBossan in #1887
- bypass print_trainable_parameter() if model is not peft model by @delock in #1888
- Fix early import of torch extension in BOFT by @PhyscalX in #1879
- Dora Fine-tuning added to examples by @shirinyamani in #1885
- CI: Don't fail fast in test matrix by @BenjaminBossan in #1896
- FIX TEST: Higher tolerance for AdaLoRA in test by @BenjaminBossan in #1897
- test: bump absolute tolerance level in test by @kallewoof in #1891
- ephemeral GPU offload support by @kallewoof in #1857
- FIX TEST Even higher tolerance for AdaLoRA in test by @BenjaminBossan in #1898
- FIX Recursion while accessing attribute before initialization by @ret-1 in #1892
- chore: markdown formatting by @stillmatic in #1899
- Tutorial Notebook: Using the PEFT library with a DNA Language Model. by @rahulbshrestha in #1873
- Integrate X-LoRA by @EricLBuehler in #1491
- FIX: Flaky multitask prompt tuning test fixed by setting the seed by @BenjaminBossan in #1908
- FourierFT Support by @Phoveran in #1838
- Fix `encoder_reparameterization_type` parameter by @sujeek in #1926
- Fix attribute check for print_trainable_parameters method by @anch0vy in #1928
- Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer. by @zhangsheng377 in #1919
- support HRA by @DaShenZi721 in #1864
- FIX PiSSA & OLoRA with rank/alpha pattern, rslora by @BenjaminBossan in #1930
- support Grouped-Query Attention by @ttw1018 in #1901
- FIX: More VeRA tests, fix tests, more checks by @BenjaminBossan in #1900
- [WIP] ENH Add support for Qwen2 by @BenjaminBossan in #1906
- Decrease memory usage of `merge_and_unload` by @snarayan21 in #1944
- PiSSA, OLoRA: Delete initial adapter after conversion instead of the active adapter by @BenjaminBossan in #1933
- Release v0.12.0 by @BenjaminBossan in #1946
New Contributors
- @mnoukhov made their first contribution in #1658
- @elementary-particle made their first contribution in #1668
- @sparsh2 made their first contribution in #1833
- @McPatate made their first contribution in #1841
- @dkopi made their first contribution in #1817
- @namanvats made their first contribution in #1850
- @tokenizer-decode made their first contribution in #1828
- @jtatman made their first contribution in #1861
- @cep-ter made their first contribution in #1862
- @delock made their first contribution in #1888
- @PhyscalX made their first contribution in #1879
- @shirinyamani made their first contribution in #1885
- @kallewoof made their first contribution in #1891
- @ret-1 made their first contribution in #1892
- @stillmatic made their first contribution in #1899
- @rahulbshrestha made their first contribution in #1873
- @Phoveran made their first contribution in #1838
- @sujeek made their first contribution in #1926
- @anch0vy made their first contribution in #1928
- @DaShenZi721 made their first contribution in #1864
- @ttw1018 made their first contribution in #1901
- @snarayan21 made their first contribution in #1944
Full Changelog: v0.11.1...v0.12.0