Update on the development branch #2216
kaiyux announced in Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we pushed an update to the development branch (and the Triton backend) on Sep 10, 2024.
This update includes:
- [...] `docs/source/speculative_decoding.md`.
- [...] `ModelRunnerCpp`.
- [...] `openai-gelu`.
- [...] `head_size=48` cases for FMHA kernels.
- [...] `examples/dit/README.md`.
- [...] `LLM` class.
- [...] `executor` API.
- [...] `LLM` class.
- [...] `multi_block_mode` is enabled by default.

Thanks,
The TensorRT-LLM Engineering Team