Issues: NVIDIA/TensorRT-LLM
[Issue Template]Short one-line summary of the issue #270
#783
opened Jan 1, 2024 by
juney-nvidia
Open
Issues list
trtllm-build ignores --model_cls_file and --model_cls_name
bug
Something isn't working
#2430
opened Nov 9, 2024 by
abhishekudupa
2 of 4 tasks
trt_build for Llama 3.1 70B fp8 fails with CUDA error
bug
Something isn't working
#2429
opened Nov 8, 2024 by
chrisreese-if
2 of 4 tasks
trt_build for Llama 3.1 70B w4a8 fails with CUDA error
bug
Something isn't working
#2428
opened Nov 8, 2024 by
chrisreese-if
2 of 4 tasks
[TensorRT-LLM][INFO] Initializing MPI with thread mode 3
bug
Something isn't working
#2426
opened Nov 8, 2024 by
Rumeysakeskin
2 of 4 tasks
Small Typo
documentation
Improvements or additions to documentation
triaged
Issue has been triaged by maintainers
#2425
opened Nov 8, 2024 by
MARD1NO
[Question] Document/examples to enable draft model speculative decoding using c++ executor API
question
Further information is requested
triaged
Issue has been triaged by maintainers
#2424
opened Nov 7, 2024 by
ynwang007
[Question] Can I build the tritonserver, tensorrtllm_backend and tensorrtllm and then use these build files across servers?
question
Further information is requested
triaged
Issue has been triaged by maintainers
#2423
opened Nov 7, 2024 by
chrisreese-if
attempt to run benchmark with batch_size>=512 and input_output_len=1024,128 result in tensor volume exceeds 2147483647 error
triaged
Issue has been triaged by maintainers
waiting for feedback
#2422
opened Nov 7, 2024 by
dmonakhov
support FLUX?
feature request
New feature or request
triaged
Issue has been triaged by maintainers
#2421
opened Nov 7, 2024 by
algorithmconquer
Assertion failed: Must set crossKvCacheFraction for encoder-decoder model
bug
Something isn't working
Investigating
triaged
Issue has been triaged by maintainers
#2419
opened Nov 6, 2024 by
Saeedmatt3r
2 of 4 tasks
CUDA runtime error in cudaMemcpyAsync when enabling kv cache reuse with prompt table and TP > 1.
bug
Something isn't working
Investigating
triaged
Issue has been triaged by maintainers
#2417
opened Nov 6, 2024 by
jxchenus
2 of 4 tasks
ModuleNotFoundError: No module named 'tensorrt_llm.bindings'
installation
triaged
Issue has been triaged by maintainers
waiting for feedback
#2416
opened Nov 6, 2024 by
DeekshithaDPrakash
Request for Colbert Model
new model
question
Further information is requested
triaged
Issue has been triaged by maintainers
#2415
opened Nov 5, 2024 by
FernandoDorado
Exporting Finetuned Llama models to TensorRT-LLM
question
Further information is requested
triaged
Issue has been triaged by maintainers
waiting for feedback
#2412
opened Nov 5, 2024 by
DeekshithaDPrakash
Consistent Output with Same Prompts
question
Further information is requested
triaged
Issue has been triaged by maintainers
#2411
opened Nov 4, 2024 by
ZhenboYan
Question: How do enable_context_fmha and use_paged_context_fmha work?
question
Further information is requested
triaged
Issue has been triaged by maintainers
#2408
opened Nov 2, 2024 by
dontloo
run.py --run_profiling respects stop token and is unsuitable for performance comparisons
question
Further information is requested
triaged
Issue has been triaged by maintainers
waiting for feedback
#2407
opened Nov 2, 2024 by
aikitoria
2 of 4 tasks
logprobs always 0.000
bug
Something isn't working
Investigating
triaged
Issue has been triaged by maintainers
#2406
opened Nov 1, 2024 by
mmoskal
2 of 4 tasks
Segmentation fault (11) on 1022dev+TRT 10.4.0
bug
Something isn't working
triaged
Issue has been triaged by maintainers
waiting for feedback
#2402
opened Nov 1, 2024 by
aliencaocao
2 of 4 tasks
T5 out of memory
bug
Something isn't working
triaged
Issue has been triaged by maintainers
#2398
opened Oct 31, 2024 by
ydm-amazon
4 tasks
Why is the performance worse than release 0.12.0 when I run the benchmark of release 0.13.0
performance issue
Issue about performance number
triaged
Issue has been triaged by maintainers
waiting for feedback
#2395
opened Oct 31, 2024 by
rexmxw02
1 of 4 tasks
Qwen2-72B w4a8 empty output
bug
Something isn't working
quantization
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
waiting for feedback
#2392
opened Oct 30, 2024 by
lishicheng1996
2 of 4 tasks
Qwen2-1.5B-Instruct convert_checkpoint.py failed
bug
Something isn't working
triaged
Issue has been triaged by maintainers
waiting for feedback
#2388
opened Oct 29, 2024 by
1994
2 of 4 tasks