Issues: NVIDIA/TensorRT-LLM

Issues list

trtllm-build ignores --model_cls_file and --model_cls_name [bug]
#2430 opened Nov 9, 2024 by abhishekudupa

trt_build for Llama 3.1 70B fp8 fails with CUDA error [bug]
#2429 opened Nov 8, 2024 by chrisreese-if

trt_build for Llama 3.1 70B w4a8 fails with CUDA error [bug]
#2428 opened Nov 8, 2024 by chrisreese-if

[TensorRT-LLM][INFO] Initializing MPI with thread mode 3 [bug]
#2426 opened Nov 8, 2024 by Rumeysakeskin

Small Typo [documentation, triaged]
#2425 opened Nov 8, 2024 by MARD1NO

[Question] Document/examples to enable draft model speculative decoding using c++ executor API [question, triaged]
#2424 opened Nov 7, 2024 by ynwang007

support FLUX? [feature request, triaged]
#2421 opened Nov 7, 2024 by algorithmconquer

qwen 2-1.5B model build error [bug, duplicate, triaged]
#2420 opened Nov 6, 2024 by rexmxw02

Assertion failed: Must set crossKvCacheFraction for encoder-decoder model [bug, Investigating, triaged]
#2419 opened Nov 6, 2024 by Saeedmatt3r

CUDA runtime error in cudaMemcpyAsync when enabling kv cache reuse with prompt table and TP > 1 [bug, Investigating, triaged]
#2417 opened Nov 6, 2024 by jxchenus

Request for Colbert Model [new model, question, triaged]
#2415 opened Nov 5, 2024 by FernandoDorado

Exporting Finetuned Llama models to TensorRT-LLM [question, triaged, waiting for feedback]
#2412 opened Nov 5, 2024 by DeekshithaDPrakash

Consistent Output with Same Prompts [question, triaged]
#2411 opened Nov 4, 2024 by ZhenboYan

Question: How do enable_context_fmha and use_paged_context_fmha work? [question, triaged]
#2408 opened Nov 2, 2024 by dontloo

run.py --run_profiling respects stop token and is unsuitable for performance comparisons [question, triaged, waiting for feedback]
#2407 opened Nov 2, 2024 by aikitoria

logprobs always 0.000 [bug, Investigating, triaged]
#2406 opened Nov 1, 2024 by mmoskal

Segmentation fault (11) on 1022dev+TRT 10.4.0 [bug, triaged, waiting for feedback]
#2402 opened Nov 1, 2024 by aliencaocao

T5 out of memory [bug, triaged]
#2398 opened Oct 31, 2024 by ydm-amazon

Why is the performance worse than release 0.12.0 when I run the benchmark of release 0.13.0 [performance issue, triaged, waiting for feedback]
#2395 opened Oct 31, 2024 by rexmxw02

Qwen2-72B w4a8 empty output [bug, quantization, triaged, waiting for feedback]
#2392 opened Oct 30, 2024 by lishicheng1996

Qwen2-1.5B-Instruct convert_checkpoint.py failed [bug, triaged, waiting for feedback]
#2388 opened Oct 29, 2024 by 1994