GPTQModel v0.9.2

@Qubitium Qubitium released this 29 Jun 12:15
· 272 commits to main since this release
6b3923e

What's Changed

Added auto-padding of model in/out-features for exllama and exllama v2. Fixed quantization of OPT and DeepSeek V2-Lite models. Fixed inference for DeepSeek V2-Lite.
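To illustrate what padding in/out-features can mean in practice, here is a minimal sketch, not the GPTQModel implementation. It assumes the kernel requires both dimensions to be a multiple of 32 (an assumption, not stated in these notes), and the helper names `padded_size` and `pad_weight` are hypothetical.

```python
# Illustrative sketch only: helper names are hypothetical, and the
# multiple of 32 is an assumption rather than a value from the release notes.

def padded_size(features: int, multiple: int = 32) -> int:
    """Round `features` up to the nearest multiple of `multiple`."""
    return -(-features // multiple) * multiple  # ceiling division, then scale


def pad_weight(weight, multiple: int = 32):
    """Zero-pad a row-major [out_features][in_features] matrix.

    Rows and columns are extended with zeros so that both dimensions
    become multiples of `multiple`. The original values are left unchanged.
    """
    out_features = len(weight)
    in_features = len(weight[0]) if weight else 0
    padded_out = padded_size(out_features, multiple)
    padded_in = padded_size(in_features, multiple)

    # Extend each existing row with zero columns.
    rows = [list(row) + [0.0] * (padded_in - in_features) for row in weight]
    # Append all-zero rows for the extra output features.
    rows += [[0.0] * padded_in for _ in range(padded_out - out_features)]
    return rows


if __name__ == "__main__":
    w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # 2 out-features, 3 in-features
    padded = pad_weight(w, multiple=4)
    print(len(padded), len(padded[0]))
```

Under this scheme the extra columns and rows are zeros, so a caller would pad the input with zeros to match and slice the output back to the original `out_features`.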

Full Changelog: v0.9.1...v0.9.2