GPTQModel v0.9.1

@Qubitium Qubitium released this 27 Jun 07:30
71ed742

What's Changed

v0.9.1 is a large release: it adds 3 new models and support for Microsoft's BitBLAS. Batching in .quantize() has been fixed, so quantization is now more than 50% faster when batching is enabled on a large calibration dataset. This release also adds sharding support for quantized models, with optional hash verification of weight files on model load.
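
To illustrate what hash checking of sharded weight files can look like, here is a minimal, generic sketch. It is not GPTQModel's implementation or API: the helper names `sha256_of_file` and `verify_shards`, the choice of SHA-256, and the filename-to-digest mapping are all assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_shards(shard_dir, expected):
    """Check each shard against its expected digest (hypothetical helper).

    `expected` maps a shard filename to its hex SHA-256 digest.
    Raises ValueError on the first mismatch, returns True otherwise.
    """
    for name, want in expected.items():
        got = sha256_of_file(Path(shard_dir) / name)
        if got != want:
            raise ValueError(f"hash mismatch for {name}")
    return True
```

A loader could call a check like `verify_shards` before deserializing any shard, so that a corrupted or tampered weight file is rejected before it is loaded.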

Full Changelog: v0.9.0...v0.9.1