GPTQModel v1.0.4

@Qubitium Qubitium released this 26 Sep 04:26
· 41 commits to main since this release
cffee9a

What's Changed

Added Liger Kernel support, which reduces VRAM usage during the quantization stage by roughly 50% for some models. Added a toggle to disable parallel packing, which helps avoid out-of-memory (OOM) errors on larger models. Updated the Transformers dependency to 4.45.0 for Llama 3.2 support.
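Because this release raises the Transformers requirement, it can help to confirm the installed version before upgrading. The following is a minimal sketch, not part of GPTQModel: the helper names (`parse_release`, `meets_minimum`, `check_transformers`) are hypothetical, and the comparison looks only at the leading numeric components, so a pre-release such as `4.45.0.dev0` is treated the same as `4.45.0`.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum Transformers version stated in these release notes.
MIN_TRANSFORMERS = "4.45.0"


def parse_release(v: str) -> tuple:
    """Return the leading numeric components of a version string.

    Pre-release suffixes are dropped, and the result is padded to three
    components so that '5.0' compares as (5, 0, 0).
    """
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """True if `installed` is at least `minimum` (numeric parts only)."""
    return parse_release(installed) >= parse_release(minimum)


def check_transformers() -> str:
    """Describe whether the locally installed transformers is new enough."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return f"transformers {installed} satisfies >= {MIN_TRANSFORMERS}"
    return f"transformers {installed} is older than {MIN_TRANSFORMERS}"


if __name__ == "__main__":
    print(check_transformers())
```

For strict handling of pre-release and post-release tags, the `packaging` library's `Version` class is a more complete alternative to the hand-rolled parser above.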

Full Changelog: v1.0.3...v1.0.4