Pull requests: vllm-project/vllm
Pull requests list
[v1][torch.compile] manage cudagraph buffer in compiler
#10203
opened Nov 10, 2024 by
youkaichao
[Bugfix] bitsandbytes models fail to run pipeline parallel
#10200
opened Nov 10, 2024 by
HoangCongDuc
[Bugfix][SpecDecode] apply sampling parameters to target probabilities for consistency in rejection sampling.
#10198
opened Nov 10, 2024 by
jeongin601
[Core] Add RunAI Model Streamer as optional loader.
ci/build
documentation
Improvements or additions to documentation
#10192
opened Nov 10, 2024 by
omer-dayan
[Model] Add support for Qwen2 for embeddings
documentation
Improvements or additions to documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#10184
opened Nov 9, 2024 by
DarkLight1337
Add docs on serving with Llama Stack
documentation
Improvements or additions to documentation
#10183
opened Nov 9, 2024 by
terrytangyuan
[Bug]: When apply continue_final_message for OpenAI server, the "echo":false is ignored
frontend
#10180
opened Nov 9, 2024 by
chaunceyjiang
[Frontend] Add per-request number of cached token stats
frontend
#10174
opened Nov 9, 2024 by
zifeitong
[Docs] Misc updates to TPU installation instructions
documentation
Improvements or additions to documentation
#10165
opened Nov 8, 2024 by
mikegre-google
[Bugfix][Frontend] Update Llama 3.2 Chat Template to support Vision and Non-Tool use
#10164
opened Nov 8, 2024 by
tjohnson31415
[Doc] Move PR template content to docs
ci/build
documentation
Improvements or additions to documentation
#10159
opened Nov 8, 2024 by
russellb
[Feature] [Spec decode]: Enable MLPSpeculator/Medusa and prompt_logprobs with ChunkedPrefill
needs-rebase
#10132
opened Nov 7, 2024 by
NickLucche
Draft
1 task
[V1][Bugfix] Propagate V1 LLMEngine properly
ready
ONLY add when PR is ready to merge/full CI is needed
#10127
opened Nov 7, 2024 by
comaniac
[Core] Add padding-aware scheduling for 2D prefills
#10125
opened Nov 7, 2024 by
kzawora-intel
[Hardware][CPU][torch.compile] integrate torch compile
needs-rebase
#10113
opened Nov 7, 2024 by
bigPYJ1151
Draft