Releases: casper-hansen/AutoAWQ
v0.1.5
What's Changed
- Only apply attention mask if seqlen is greater than 1 by @casper-hansen in #96
- add gpt_neox support by @twaka in #113
- [
core
] Support fp32 / bf16 inference by @younesbelkada in #121 - Fix potential overflow by @casper-hansen in #102
- Fixing starcoder based models with 15B by @SebastianBodza in #118
- Support Aquila models. by @ftgreat in #123
- Add benchmark of Aquila2 34B AWQ in README.md. by @ftgreat in #126
New Contributors
- @twaka made their first contribution in #113
- @younesbelkada made their first contribution in #121
- @SebastianBodza made their first contribution in #118
- @ftgreat made their first contribution in #123
Full Changelog: v0.1.4...v0.1.5
v0.1.4
What's Changed
- Refactor cache and embedding modules by @casper-hansen in #95
- Fix TypeError: 'NoneType' object is not subscriptable
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
- Turing inference support (Colab+Kaggle working) by @casper-hansen in #92
- Fix memory bug (save 2GB VRAM)
Full Changelog: v0.1.2...v0.1.3
v0.1.2
What's Changed
- Fix unexpected keyword by @casper-hansen in #88
- Fix Falcon n_kv_heads parameter by @casper-hansen in #89
- Mistral fused modules by @casper-hansen in #90
Full Changelog: v0.1.1...v0.1.2
v0.1.1
What's Changed
- Add GPT BigCode support (StarCoder) by @casper-hansen in #61
- Use typing classes over base types by @VikParuchuri in #69
- Fix KV cache shapes error by @casper-hansen in #75
- Mistral support by @casper-hansen in #79
- Add low_cpu_mem_usage=True in example by @casper-hansen in #80
- Offloading to cpu and disk by @s4rduk4r in #77
- Faster build, fix "no space left". by @casper-hansen in #84
New Contributors
- @VikParuchuri made their first contribution in #69
- @s4rduk4r made their first contribution in #77
Full Changelog: v0.1.0...v0.1.1
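The v0.1.1 items above (Mistral support in #79, fused modules in #90) combine at load time. A minimal sketch, assuming the `AutoAWQForCausalLM.from_quantized` entry point; the model path is a placeholder, not taken from these notes:

```python
# Hedged sketch: loading an AWQ-quantized model with fused layers,
# reflecting the v0.1.1 items (#79 Mistral support, #90 fused modules).

load_kwargs = {
    "fuse_layers": True,  # enable fused attention/MLP modules (#90)
}

def load_quantized(quant_path: str):
    # Local import: autoawq is an optional, CUDA-only dependency.
    from awq import AutoAWQForCausalLM
    return AutoAWQForCausalLM.from_quantized(quant_path, **load_kwargs)
```

With fused layers disabled, the model falls back to the unfused modules, which is slower but works on a wider range of hardware.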
v0.1.0
What's Changed
- Support Falcon 180B by @casper-hansen in #35
- [NEW] GEMV kernel implementation by @casper-hansen in #40
- Allow user to use custom calibration data for quantization by @boehm-e in #27
- Safetensors and model sharding by @casper-hansen in #47
- 2x faster context processing with GEMV by @casper-hansen in #58
- Support kv_heads by @casper-hansen in #60
- Refactor quantization code by @casper-hansen in #62
- support windows by @qwopqwop200 in #53
- Improve model loading by @casper-hansen in #66
New Contributors
- @boehm-e made their first contribution in #27
Full Changelog: v0.0.2...v0.1.0
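Two v0.1.0 items above, custom calibration data (#27) and the GEMV kernel (#40), surface as quantization options. A minimal sketch under stated assumptions: the config keys follow AutoAWQ's usual `quant_config` shape, and the calibration sentences and paths are illustrative placeholders:

```python
# Hedged sketch: quantizing with custom calibration data (#27)
# and selecting a kernel version (GEMV added in #40).

quant_config = {
    "zero_point": True,   # asymmetric (zero-point) quantization
    "q_group_size": 128,  # group weights per 128 channels
    "w_bit": 4,           # 4-bit weights
    "version": "GEMM",    # "GEMM" or "GEMV" kernel
}

# Custom calibration data: a plain list of raw text samples.
calib_data = [
    "Example sentence one for calibration.",
    "Example sentence two for calibration.",
]

def quantize(model_path: str, quant_path: str) -> None:
    # Local imports: autoawq and transformers are heavyweight,
    # GPU-oriented dependencies.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model.quantize(tokenizer, quant_config=quant_config, calib_data=calib_data)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

GEMM favors batched/context processing while GEMV favors single-token decoding, which is why #58 reports faster context processing after the kernel work.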
v0.0.2
What's Changed
- Refactor fused modules by @casper-hansen in #18
- fuse_layers bug fix by @qwopqwop200 in #21
- support speedtest to benchmark FP16 model by @wanzhenchn in #25
- Implement batch size for speed test by @casper-hansen in #26
- [BUG] Fix illegal memory access + Quantized Multi-GPU support by @casper-hansen in #28
- YaRN support for LLaMa models by @casper-hansen in #23
New Contributors
- @wanzhenchn made their first contribution in #25
Full Changelog: v0.0.1...v0.0.2
v0.0.1
What's Changed
- Add GPTJ Support by @jamesdborin in #1
- windows support by @qwopqwop200 in #16
- Release PyPi package + Create GitHub workflow by @casper-hansen in #9
New Contributors
- @jamesdborin made their first contribution in #1
- @qwopqwop200 made their first contribution in #16
- @casper-hansen made their first contribution in #9
Full Changelog: https://github.com/casper-hansen/AutoAWQ/commits/v0.0.1