Releases: AI-Hypercomputer/jetstream-pytorch
jetstream-v0.2.3
What's Changed
- Enable jax profiler server in run with ray by @FanhaiLu1 in #112
- Add README docs for interleaving multiple hosts with Ray by @FanhaiLu1 in #114
- Fix conversion bug by @yeandy in #116
- Integrate disaggregated serving with JetStream by @FanhaiLu1 in #117
- Support HF LLaMA ckpt conversion by @lsy323 in #118
- Add guide on adding HF ckpt conversion support by @lsy323 in #119
- Add support for Llama3-70b by @bhavya01 in #101
- Fix convert_checkpoint.py for hf and gemma by @qihqi in #121
- Mixtral enablement by @wang2yn84 in #120
- Add script to install for GPU by @qihqi in #122
- Add activation quantization support to per-channel quantized linear layers by @lsy323 in #105
- Remove JSON config mangling for Gemma ckpt by @lsy323 in #124
- Add different token sampling algorithms to decoder. by @bvrockwell in #123
- Add lock in prefill and generate to prevent starvation by @FanhaiLu1 in #126
- Update submodules, prepare for releasing v0.2.4 by @qihqi in #127
- Update README.md by @qihqi in #128
- Update summary.md by @qihqi in #125
- Update README.md by @bhavya01 in #129
- Make sure GPU works by @qihqi in #130
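PR #123 adds different token sampling algorithms to the decoder. As a rough illustration of what such algorithms typically look like (greedy, top-k, and nucleus/top-p sampling), here is a minimal NumPy sketch; the function names and signatures here are illustrative, not the repo's actual API:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logits vector.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def greedy(logits):
    # Deterministically pick the highest-logit token.
    return int(np.argmax(logits))

def top_k_sample(logits, k, rng):
    # Keep only the k largest logits, renormalize, then sample.
    idx = np.argsort(logits)[-k:]
    probs = softmax(logits[idx])
    return int(rng.choice(idx, p=probs))

def nucleus_sample(logits, p, rng):
    # Keep the smallest set of tokens whose cumulative probability
    # reaches p (top-p / nucleus sampling), then sample from it.
    probs = softmax(logits)
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1
    keep = order[:cutoff]
    kept = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept))

rng = np.random.default_rng(0)
logits = np.array([2.0, 1.0, 0.1, -1.0])
print(greedy(logits))                    # index of the largest logit: 0
print(top_k_sample(logits, 2, rng))      # one of the two most likely tokens
print(nucleus_sample(logits, 0.9, rng))  # a token from the 90% probability nucleus
```

Greedy decoding is deterministic, while top-k and top-p trade determinism for diversity; the decoder in this repo exposes such choices as configurable sampling options.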
New Contributors
- @yeandy made their first contribution in #116
- @bvrockwell made their first contribution in #123
Full Changelog: jetstream-v0.2.2...jetstream-v0.2.3
jetstream-v0.2.2
jetstream-pytorch 0.2.2
- Miscellaneous bug fixes.
- Support for the Tiktoken tokenizer
- Support for the Gemma 2B model (running data parallel)
jetstream-v0.2.1
Key Changes
- Support Llama3
- Support Gemma
- Ray Multiple Host Single Pod Slice MVP
- Enable unit test and format check
jetstream-v0.2.0
Release JetStream PyTorch with JetStream v0.2.0 for inference