GPTQModel v1.0.4

Qubitium released this 26 Sep 04:26

v1.0.4

cffee9a

What's Changed

Added Liger Kernel support, which reduces VRAM usage by ~50% during the quantization stage for some models. Added a toggle to disable parallel packing, which avoids OOM errors on larger models. Updated the Transformers dependency to 4.45.0 for Llama 3.2 support.

[FEATURE] add a parallel_packing toggle by @LRL-ModelCloud in #393
[FEATURE] add liger_kernel support by @LRL-ModelCloud in #394
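
Since this release raises the Transformers dependency to 4.45.0, it may be useful to check an existing environment before upgrading. The snippet below is a minimal sketch and not part of GPTQModel: the helper names `parse_version`, `meets_minimum`, and `transformers_ok` are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `rc1` or `.dev0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum Transformers version stated in these release notes.
MIN_TRANSFORMERS = (4, 45, 0)


def parse_version(text):
    """Turn '4.45.0' or '4.46.0.dev0' into a 3-tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    # Pad short versions such as '4.45' so tuple comparison behaves as expected.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Return True if the version string is at least `minimum`."""
    return parse_version(installed) >= minimum


def transformers_ok():
    """Check the locally installed transformers package; False if absent."""
    try:
        return meets_minimum(version("transformers"))
    except PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("transformers >= 4.45.0:", transformers_ok())
```

For strict handling of pre-releases, a dedicated version parser would be a safer choice than this simplified tuple comparison.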

Full Changelog: v1.0.3...v1.0.4

Contributors

LRL-ModelCloud
