Pull requests: NVIDIA/TensorRT-LLM

th::optional -> std::optional  [triaged]
#2397 opened Oct 31, 2024 by r-barnes

add support internvl2  [feature request, triaged, waiting for feedback]
#2394 opened Oct 31, 2024 by Jeremy-J-J

attention mechanism toggle added  [functionality issue, triaged, waiting for feedback]
#2384 opened Oct 28, 2024 by Aaryanverma

fix load_model_on_cpu on qwen/convert_checkpoint.py  [feature request, triaged]
#2382 opened Oct 27, 2024 by lkm2835

network: fix broken onnx export  [bug, duplicate, Merged, triaged]
#2378 opened Oct 25, 2024 by ishandhanani

Fix errors when using smoothquant to quantize Qwen2 model  [quantization, triaged]
#2370 opened Oct 24, 2024 by Missmiaom

Fix errors when quantizing Llama model  [quantization, triaged]
#2264 opened Sep 28, 2024 by dleunji

fix: none prompt to string  [waiting for feedback]
#2259 opened Sep 26, 2024 by dongs0104

README.md: Add 3rd Party Inference Speed Dashboard  [documentation, triaged]
#2244 opened Sep 22, 2024 by matichon-vultureprime

Modify small-batched weight only quantization  [quantization, triaged]
#2213 opened Sep 10, 2024 by dasistwo

Create sync.yml
#2154 opened Aug 27, 2024 by inkimikoko

typo fix quick-start-guide.md
#2075 opened Aug 1, 2024 by sweetning0809

fix GemmFpAIntB MMa::IteratorB::Layout
#2070 opened Jul 31, 2024 by luliyucoordinate

fix wrong arg in Engine Building Command in docs/source/performance/perf-overview.md  [documentation]
#2057 opened Jul 30, 2024 by RuibaiXu

Fix default min length  [triaged]
#1935 opened Jul 11, 2024 by akhoroshev

Add support for custom tokenizer and batch size
#1927 opened Jul 9, 2024 by uppalutkarsh

Dev sm87 trt101
#1880 opened Jul 3, 2024 by sunnyqgg

Bump transformers from 4.36.2 to 4.38.0 in /examples/multimodal  [bug, dependencies, triaged, waiting for feedback]
#1689 opened May 28, 2024 by dependabot bot

add cached generation buffer  [triaged, waiting for feedback]
#1685 opened May 28, 2024 by michael200892458

Fix CUDA OOM when creating Mixtral checkpoint  [triaged, waiting for feedback]
#1629 opened May 19, 2024 by VivekBits2210

[feat]: Support weight only gemm with 2bit  [triaged, waiting for feedback]
#1568 opened May 9, 2024 by gavinchen430