release: 0.2.5
New features
- Load and save models from the Hugging Face hub #263 by @sayakpaul (see the first sketch after this list)
- Add support for float8 e4m3fnuz #310 (from #281) by @maktukmak
- Faster and less memory-intensive requantization #290 by @latentCall145
- Support torch.equal for QTensor #294 by @dacorvo (see the second sketch after this list)
- Add Marlin Float8 kernel #296 (from #241) by @fxmarty
- Add Whisper speech recognition example #298 (from #242) by @mattiadg
- Add ViT classification example #308 by @shovan777
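The hub workflow from #263 can be exercised roughly as follows. This is a minimal, non-authoritative sketch: it assumes the `QuantizedModelForCausalLM` wrapper, the `qint4` weight type, and the `save_pretrained`/`push_to_hub`/`from_pretrained` methods exposed by optimum-quanto; the model id and repository name are placeholders.

```python
from transformers import AutoModelForCausalLM
from optimum.quanto import QuantizedModelForCausalLM, qint4

# Quantize a small causal LM (placeholder model id)
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
qmodel = QuantizedModelForCausalLM.quantize(model, weights=qint4, exclude="lm_head")

# Serialize locally, then upload to the hub (placeholder repository name)
qmodel.save_pretrained("./opt-125m-quanto")
qmodel.push_to_hub("my-namespace/opt-125m-quanto")

# Later, reload the quantized checkpoint directly from the hub
qmodel = QuantizedModelForCausalLM.from_pretrained("my-namespace/opt-125m-quanto")
```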
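The torch.equal support from #294 can be checked with a small sketch like the one below. It assumes the `quantize`/`freeze` workflow and the `qint8` weight type from optimum-quanto, and that a frozen linear weight is stored as a QTensor.

```python
import torch
from torch import nn
from optimum.quanto import quantize, freeze, qint8

def make_model():
    # Same seed, so both models start from identical float weights
    torch.manual_seed(0)
    return nn.Sequential(nn.Linear(16, 16))

a, b = make_model(), make_model()
for m in (a, b):
    quantize(m, weights=qint8)
    freeze(m)  # weights are now stored as quantized QTensor instances

# torch.equal is dispatched for QTensor, so quantized weights compare directly
print(torch.equal(a[0].weight, b[0].weight))  # expected: True
```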
Bug fixes
- Fix include patterns in quantize #271 by @kaibioinfo (see the sketch after this list)
- Enable non-strict loading of state dicts #295 by @BenjaminBossan
- Fix transformers forward error #303 by @dacorvo
- Fix missing call in transformers models #325 by @dacorvo
- Fix 8-bit mm calls for 4D inputs #326 by @dacorvo
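The include patterns touched by #271 select which submodules get quantized. A minimal sketch, assuming `quantize()` accepts glob-style include/exclude patterns matched against qualified module names; the toy model and patterns are hypothetical.

```python
from torch import nn
from optimum.quanto import quantize, qint8

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(32, 32)
        self.decoder = nn.Linear(32, 32)
        self.lm_head = nn.Linear(32, 8)

model = ToyModel()

# Only modules whose names match an include pattern are quantized,
# so lm_head stays in full precision here.
quantize(model, weights=qint8, include=["*coder*"])

print(type(model.encoder).__name__, type(model.lm_head).__name__)
```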
Full Changelog: v0.2.4...v0.2.5