
Releases: foundation-model-stack/fms-acceleration

v0.4.0.4

31 Oct 16:49

Release

  • peft v0.3.4: patch version. Addresses #90 for AutoGPTQ when certain parameters require resizing.
  • foak v0.3.3: patch version. Addresses a bug introduced in #90 where the gradient-accumulation hooks were overwritten (see the illustration after this list).
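
For context on the foak fix: PyTorch tensor hooks compose when added via `register_hook`, but code that stores a hook in an attribute and reassigns it silently drops the earlier hook, which is the failure mode described above. A minimal self-contained illustration (not the plugin's actual code):

```python
import torch

# A parameter with a pre-existing hook, e.g. installed by grad-accum logic.
p = torch.nn.Parameter(torch.ones(4))
p.register_hook(lambda g: g * 2)

# Safe: register_hook APPENDS, so both hooks run in registration order.
p.register_hook(lambda g: g + 1)

p.sum().backward()
print(p.grad)  # tensor([3., 3., 3., 3.]) -> (1 * 2) + 1, both hooks applied
```

A patch that reassigns a stored hook attribute instead of registering an additional hook would lose the first transformation.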

What's Changed

  • Fix Issue with Resizing Parameters on the Meta Device in Low CPU Mem Mode by @fabianlim in #96
  • model: Add granite GPTQ model by @willmj in #95

Full Changelog: v0.4.0.3...v0.4.0.4

Quickfix: Properly Apply Retie Weights Fix for AutoGPTQ

25 Oct 05:11

Release

  • peft v0.3.3: patch version. Properly fixes #90 for AutoGPTQ.

What's Changed

  • Apply Retie Weights Fix Regardless of Transformers and TRL version for AutoGPTQ by @fabianlim in #94

Full Changelog: v0.4.0.2...v0.4.0.3

v0.4.0.2

23 Oct 09:08

Release

  • peft v0.3.2: patch version. Updates accelerate.yaml for Accelerate v1 and addresses all low-CPU-memory issues for quantized models (see the illustration after this list).
  • foak v0.3.2: patch version. Updates the datatype support matrix, fixes an error introduced in #86, and addresses all low-CPU-memory issues for quantized models.
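
Background on the low-CPU-memory fixes: in low-CPU-memory mode, modules are first constructed on PyTorch's meta device (shapes and dtypes only, no storage), so anything that needs real memory, such as writing to or resizing a parameter, must wait until the module is materialized. A minimal illustration (not the plugin's code):

```python
import torch
import torch.nn as nn

# Low-CPU-memory loading builds modules on the meta device first:
# shapes and dtypes exist, but no storage is allocated.
with torch.device("meta"):
    lin = nn.Linear(8, 8)

print(lin.weight.device)  # meta

# Writing to a meta parameter fails since there is no storage; the module
# must first be materialized with real (uninitialized) memory:
lin = lin.to_empty(device="cpu")
with torch.no_grad():
    lin.weight.zero_()  # now the parameter has storage and can be written
```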

What's Changed

  • Quickfix: Accelerate YAML and LoRA Fused Ops by @fabianlim in #92
  • Fix Low CPU Memory Mode Issues for Quantized Peft by @fabianlim in #90

Full Changelog: v0.4.0.1...v0.4.0.2

Benchmarks for PaddingFree and Granite. Fix for LowCPUMemMode for Quant.

10 Oct 07:13

Release

  • aadp: no version bump. Updates to the PaddingFree benchmarks only.
  • peft v0.3.1: patch version. Fixes low_cpu_mem_mode issues introduced since transformers 4.45. Also provides a fallback when target_modules=None (see the sketch after this list).
  • foak v0.3.1: patch version. Adds support for bias, needed for Granite models.
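
On the target_modules=None fallback, a hedged sketch of how such a fallback can be implemented; the helper below is illustrative, not the plugin's actual code. When no modules are named, scan the model for Linear layers and use their leaf names as LoRA targets:

```python
import torch.nn as nn

def infer_target_modules(model: nn.Module) -> list:
    """Illustrative fallback: collect the leaf names of all Linear layers,
    e.g. {"q_proj", "v_proj", ...}, for use as LoRA target_modules."""
    names = {
        name.rsplit(".", 1)[-1]
        for name, module in model.named_modules()
        if isinstance(module, nn.Linear)
    }
    return sorted(names)
```

A LoraConfig whose target_modules is None would then receive infer_target_modules(model) instead.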

Full Changelog: v0.4.0...v0.4.0.1

v0.4.0

16 Sep 06:40

Release

  • framework v0.4, minor version: ModelPatcher now allows multiple reload targets that point to the same file (see the toy illustration after this list).
  • aadp v0.1.1, patch version: Fixes flash_attn_forward patching for transformers < 4.44.
  • peft v0.3.0, minor version: Very minor fixes.
  • foak v0.3.0, minor version: Provides FastKernelsAccelerationPlugin, which supersedes FastQuantizedPeftAccelerationPlugin; the new plugin also works for full fine-tuning as well as regular PEFT.
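
For context on reload targets: when a function is imported by value into another module, patching the source leaves the importer holding a stale reference, so the patcher must also refresh every module that imported it; v0.4 lifts the restriction of one such target per file. A self-contained toy illustration (ModelPatcher's real API differs):

```python
import sys
import types

# Two in-memory modules stand in for real files.
src = types.ModuleType("src")
src.kernel = lambda: "slow"
sys.modules["src"] = src

user = types.ModuleType("user")
user.kernel = src.kernel           # imported by value: a stale reference
sys.modules["user"] = user

src.kernel = lambda: "fast"        # patch the source module
print(user.kernel())               # "slow" -> `user` is a reload target
user.kernel = src.kernel           # refreshing the target picks up the patch
print(user.kernel())               # "fast"
```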

What's Changed

  • Allow Kernels for Full FT and Non-Quantized PEFT by @fabianlim in #79

Full Changelog: v0.3.0.1...v0.4.0

Patch Fix: Wrong Assertion in Accelerated Peft

06 Sep 09:40

Release

  • peft v0.2.1: fixes an incorrect assertion on target modules in peft_config.

What's Changed

  • Fix Bug on Peft Config Check in AutoGPTQ Plugin by @achew010 in #82

Full Changelog: v0.3.0...v0.3.0.1

Acceleration Patcher, new AttentionAndDistributedPacking Plugin (previously ilab), Benchmarking Fixes

05 Sep 23:53

Release

  • framework v0.3, minor version: Acceleration Patcher is now provided in the framework.
  • aadp v0.1, new plugin: Replaces the ilab plugin.
  • peft v0.2.0, minor version bump: Supports all-linear in target_modules (see the usage sketch after this list).
  • foak v0.2.1, patch release: Formatting fixes.
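
On all-linear: this is the shorthand understood by HF peft's LoraConfig that matches every Linear layer except the output head, so users need not enumerate module names. A minimal usage sketch (the r and lora_alpha values are illustrative):

```python
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",  # every Linear layer except the LM head
    task_type="CAUSAL_LM",
)
```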

What's Changed

  • Rectify Missing Dataloader Preparation Call in PaddingFree Plugin Method by @achew010 in #63
  • Rename Plugin to AttentionAndDistributedPacking by @achew010 in #64
  • Add Benchmarking Compatibility to PaddingFree Plugin by @achew010 in #66
  • Benchmarking: Add Response Field to Use Chat Templates Without Response Template by @fabianlim in #68
  • Add Acceleration Patcher and MultiPack Plugin by @fabianlim in #67
  • Fix formatter by @achew010 in #74
  • Allow PaddingFree to work with DataCollatorForCompletionOnlyLM by @fabianlim in #78
  • fixed bug in peft installation for gptqmodel by @achew010 in #81

Full Changelog: v0.2.0...v0.3.0

Model Patcher moved to Framework, Instruct Lab Plugin

02 Aug 05:24

Release

  • framework v0.2, minor version: ModelPatcher has moved into the framework. The benchmarks now also support pretokenized datasets.
  • ilab v0.1, new plugin: New plugin with padding-free support (native since transformers 4.44; see the sketch after this list).
  • peft v0.1.1: patch bump. Minor changes.
  • foak v0.2, minor version bump: ModelPatcher moved out of the plugin and into the framework.
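
On padding-free: since transformers 4.44 this is supported natively via DataCollatorWithFlattening, which concatenates examples into a single row and uses position_ids to mark sequence boundaries, so flash-attention never attends across sequences and no pad tokens are computed. A minimal sketch:

```python
from transformers import DataCollatorWithFlattening

collator = DataCollatorWithFlattening()
batch = collator([
    {"input_ids": [1, 2, 3]},
    {"input_ids": [4, 5]},
])
print(batch["input_ids"])     # [[1, 2, 3, 4, 5]] - one flattened row, no padding
print(batch["position_ids"])  # [[0, 1, 2, 0, 1]] - positions restart per sequence
```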

Update: we have decided to remove the ilab plugin and replace it with the attention-and-distributed-packing (aadp) plugin in upcoming releases.

What's Changed

  • Refactored Model Patcher Class by @achew010 in #55
  • Address Package Bound and Triton Issues for Torch 2.4 by @fabianlim in #58
  • Introduce Padding-Free Plugin to FMS-Acceleration by @achew010 in #57
  • Allow Bench To Configure Data Processing Pipeline Per Scenario by @fabianlim in #60
  • Fix Mistakes with FA Padding Free by @fabianlim in #62
  • Additional README Changes for PR #57 by @achew010 in #61

Full Changelog: v0.1.2.0...v0.2.0

Framework Updates and Standalone Extraction of AutoGPTQ

17 Jul 01:47

Release

  • framework v0.1.2: supports a default argument in _check_config_and_maybe_check_values (see the sketch after this list).
  • accelerated-peft v0.1.0.1: extracted AutoGPTQ based on ModelCloud's rewrite. NOTE: due to issues with the v0.1.0 release, it was removed and we went directly to v0.1.0.1.
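
A hedged sketch of the kind of helper described; the signature and body below are assumptions for illustration, not the framework's actual code:

```python
def _check_config_and_maybe_check_values(config, key, values=None, default=None):
    """Illustrative only: read `key` from a plugin config, fall back to
    `default` when the key is absent, and optionally validate the result
    against the allowed `values`."""
    value = config.get(key, default)
    if values is not None and value not in values:
        raise ValueError(f"{key} must be one of {values}, got {value!r}")
    return value

# Example: returns "bf16" because the key is absent and a default is supplied.
print(_check_config_and_maybe_check_values({}, "dtype",
                                           values=["bf16", "fp16"],
                                           default="bf16"))
```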

Full Changelog: v0.1.1.2...v0.1.2.0

v0.1.1.1

15 Jul 03:55

What's Changed

  • Remove the Float16 Restriction on BNB QLoRA by @fabianlim in #47
  • Publish Accelerated Peft and Fused-Ops Plugins by @achew010 in #51

Release

  • accelerated-peft 0.1.0: the release failed here and will be redone.
  • fused-ops-and-kernels 0.1.0

Full Changelog: v0.1.1...v0.1.1.1