Releases · foundation-model-stack/fms-acceleration
v0.4.0.4
- peft v0.3.4: patch version. Address #90 for AutoGPTQ when certain parameters require resizing.
- foak v0.3.3: patch version. Address bug introduced in #90 where the grad accum hooks were overwritten.
What's Changed
- Fix Issue with Resizing Parameters on the Meta Device in Low CPU Mem Mode by @fabianlim in #96
- model: Add granite GPTQ model by @willmj in #95
New Contributors
- @willmj made their first contribution in #95
Full Changelog: v0.4.0.3...v0.4.0.4
Quickfix: Properly Apply Retie Weights Fix for AutoGPTQ
- peft v0.3.3: patch version. Properly fix #90 for AutoGPTQ.
What's Changed
- Apply Retie Weights Fix Regardless of Transformers and TRL version for AutoGPTQ by @fabianlim in #94
Full Changelog: v0.4.0.2...v0.4.0.3
v0.4.0.2
- peft v0.3.2: patch version. Updated accelerate.yaml for v1. Addresses all low CPU mem issues for quant models.
- foak v0.3.2: patch version. Updated datatype support matrix. Fixes error introduced in #86. Addresses all low CPU mem issues for quant models.
What's Changed
- Quickfix: Accelerate YAML and LoRA Fused Ops by @fabianlim in #92
- Fix Low CPU Memory Mode Issues for Quantized Peft by @fabianlim in #90
Full Changelog: v0.4.0.1...v0.4.0.2
Benchmarks for PaddingFree and Granite. Fix for LowCPUMemMode for Quant.
- aadp: no version bump. Updates on PaddingFree bench only.
- peft v0.3.1: patch version. Fixes for low_cpu_mem_mode issues introduced since transformers 4.45. Also provides a fallback if target_modules=None (see the sketch after this list).
- foak v0.3.1: patch version. Support for bias, needed for Granite models.
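A minimal sketch of the target_modules fallback described above; the helper name `with_target_module_fallback` is hypothetical, not the plugin's actual function:

```python
from peft import LoraConfig

def with_target_module_fallback(peft_config: LoraConfig) -> LoraConfig:
    # Hypothetical helper; the plugin's internal logic may differ.
    # If no target modules were specified, fall back to peft's
    # "all-linear" shorthand instead of failing.
    if peft_config.target_modules is None:
        peft_config.target_modules = "all-linear"
    return peft_config

cfg = with_target_module_fallback(LoraConfig(r=8, lora_alpha=16))
print(cfg.target_modules)  # -> "all-linear"
```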
What's Changed
- Update Benches: Orca by @fabianlim in #85
- Update Benchmarks and Documentation for GraniteCausalLM by @fabianlim in #86
- Fixes to Accelerated Peft by @fabianlim in #89
Full Changelog: v0.4.0...v0.4.0.1
v0.4.0
- framework v0.4, minor version: ModelPatcher now allows multiple reload targets that point to the same file.
- aadp v0.1.1, patch: Fix on flash_attn_forward patching for transformers < 4.44.
- peft v0.3.0: minor version. Very minor fixes.
- foak v0.3.0: minor version. Provides FastKernelsAccelerationPlugin, which supersedes FastQuantizedPeftAccelerationPlugin. The new plugin also works for full FT as well as regular PEFT.
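As a rough illustration of enabling these kernels for a full fine-tune rather than quantized PEFT, here is a sketch of an acceleration config expressed as a Python dict. The key names below are assumptions for illustration, not the plugin's confirmed schema; consult the repo's sample-configurations for the real keys.

```python
# Sketch only: illustrative settings for running the fused ops/kernels
# during full FT. Key names are assumptions, not the confirmed schema.
acceleration_config = {
    "training": {
        "fused_ops_and_kernels": {
            "fast_loss": True,             # fused cross-entropy loss
            "fast_rms_layernorm": True,    # fused RMSNorm kernel
            "fast_rope_embeddings": True,  # fused rotary-embedding kernel
        }
    }
}
```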
What's Changed
- Allow Kernels for Full FT and Non-Quantized PEFT by @fabianlim in #79
Full Changelog: v0.3.0.1...v0.4.0
Patch Fix: Wrong Assertion in Accelerated Peft
- peft v0.2.1: Fix wrong assertion on target_modules in peft_config.
What's Changed
Full Changelog: v0.3.0...v0.3.0.1
Acceleration Patcher, new AttentionAndDistributedPacking Plugin (previously ilab), Benchmarking Fixes
- framework v0.3, minor version: Acceleration Patcher now provided in framework.
- aadp v0.1, new plugin: Replacement of the ilab plugin.
- peft v0.2.0: minor version bump. Supports all-linear in target_modules (see the sketch after this list).
- foak v0.2.1: patch release. Formatting fixes.
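The all-linear support mirrors peft's own shorthand for adapting every linear layer. A minimal, self-contained example (the model checkpoint is an arbitrary small one chosen for illustration):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# "all-linear" asks peft to attach LoRA adapters to every linear layer.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
peft_config = LoraConfig(r=8, lora_alpha=16, target_modules="all-linear")
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
```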
What's Changed
- Rectify Missing Dataloader Preparation Call in PaddingFree Plugin Method by @achew010 in #63
- Rename Plugin to AttentionAndDistributedPacking by @achew010 in #64
- Add Benchmarking Compatibility to PaddingFree Plugin by @achew010 in #66
- Benchmarking: Add Response Field to Use Chat Templates Without Response Template by @fabianlim in #68
- Add Acceleration Patcher and MultiPack Plugin by @fabianlim in #67
- Fix formatter by @achew010 in #74
- Allow PaddingFree to work with DataCollatorForCompletionOnlyLM by @fabianlim in #78 (see the sketch after this list)
- fixed bug in peft installation for gptqmodel by @achew010 in #81
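For context on #78, a minimal sketch of TRL's completion-only collator that the PaddingFree plugin can now run alongside; the plugin strips padding, while the collator masks prompt tokens from the loss. The tokenizer checkpoint and response template are illustrative only.

```python
from transformers import AutoTokenizer
from trl import DataCollatorForCompletionOnlyLM

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
# Masks everything before the response template out of the loss;
# padding removal is handled separately by the PaddingFree plugin.
collator = DataCollatorForCompletionOnlyLM(
    response_template="### Response:",
    tokenizer=tokenizer,
)
```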
Full Changelog: v0.2.0...v0.3.0
Model Patcher moved to Framework, Instruct Lab Plugin
- framework v0.2, minor version: ModelPatcher now moved into framework. Bench now also supports pretokenized datasets.
- ilab v0.1, new plugin: New plugin with padding-free support (native after transformers 4.44).
- peft v0.1.1: patch bump. Minor changes.
- foak v0.2: minor bump, with ModelPatcher moved out of it.
Update: we decided to remove the ilab plugin and replace it with an attention-and-distributed-packing plugin in upcoming releases.
What's Changed
- Refactored Model Patcher Class by @achew010 in #55
- Address Package Bound and Triton Issues for Torch 2.4 by @fabianlim in #58
- Introduce Padding-Free Plugin to FMS-Acceleration by @achew010 in #57
- Allow Bench To Configure Data Processing Pipeline Per Scenario by @fabianlim in #60
- Fix Mistakes with FA Padding Free by @fabianlim in #62
- Additional README Changes for PR #57 by @achew010 in #61
Full Changelog: v0.1.2.0...v0.2.0
Framework Updates and Standalone Extraction of AutoGPTQ
What's Changed
- framework v0.1.2: supports a default argument with _check_config_and_maybe_check_values (see the sketch after this list).
- accelerated-peft v0.1.0.1: extracted AutoGPTQ based on ModelCloud's rewrite. NOTE: due to some issues with the v0.1.0 release, it was removed and we go directly to v0.1.0.1.
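A sketch of how a plugin might use the new default argument. Only the method name and its `default` argument come from the release note; the plugin class, config key, and values below are hypothetical.

```python
from fms_acceleration import AccelerationPlugin

class MyPlugin(AccelerationPlugin):
    # Hypothetical plugin shown for illustration only.
    def __init__(self, configurations):
        super().__init__(configurations)
        # Previously a missing key was an error; with `default` the
        # check falls back instead. Key and values are made up here.
        self._kernel = self._check_config_and_maybe_check_values(
            key="peft.quantization.auto_gptq.kernel",  # hypothetical key
            values=["triton_v2"],
            default="triton_v2",  # used when the key is absent
        )
```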
Full Changelog: v0.1.1.2...v0.1.2.0
v0.1.1.1
What's Changed
- Remove the Float16 Restriction on BNB QLoRA by @fabianlim in #47
- Publish Accelerated Peft and Fused-Ops Plugins by @achew010 in #51
Released:
- accelerated-peft 0.1.0: release failed here, will be redone.
- fused-ops-and-kernels 0.1.0
Full Changelog: v0.1.1...v0.1.1.1