Key Changes
- Enable observability in JetStream Server (Prometheus metrics)
- Enable JAX profiler support on single-host JetStream Server
- Support both text and token IDs as input and output for the JetStream Decode API
- Add health check API
- Support MLPerf evaluation
- Enable JetStream Server E2E tests
- Increase unit test coverage (>=96%)
What's Changed
- Accuracy eval mlperf by @jwyang-google in #76
- Add metadata metrics by @yeandy in #77
- Fix pad_tokens function description by @FanhaiLu1 in #80
- Prometheus Metrics by @Bslabe123 in #71
- Update JetStream grpc proto to support I/O with text and token ids by @JoeZijunZhou in #78
- Update benchmark script to easily test llama-3 by @bhavya01 in #83
- Unit test coverage cleanup by @JoeZijunZhou in #81
- Allow tokenizer to customize stop_tokens by @qihqi in #84
- Decode Batch Percentage Metrics/Improved Scraping by @Bslabe123 in #82
- Bump requests from 2.31.0 to 2.32.0 in the pip group across 1 directory by @dependabot in #86
- Add profiling support and update docs by @JoeZijunZhou in #85
- Add ray disaggregated serving support by @FanhaiLu1 in #87
- Ensure server warmup before benchmark by @JoeZijunZhou in #91
- Add healthcheck support for JetStream by @vivianrwu in #90
- Add JetStream E2E test CI by @JoeZijunZhou in #89
- Release v0.2.2 by @JoeZijunZhou in #95
New Contributors
- @jwyang-google made their first contribution in #76
- @Bslabe123 made their first contribution in #71
- @vivianrwu made their first contribution in #90
Full Changelog: v0.2.1...v0.2.2