
Pull requests: vllm-project/vllm

[6/N] pass whole config to inner model
#10205 opened Nov 11, 2024 by youkaichao
[Core] Add RunAI Model Streamer as optional loader (labels: ci/build, documentation)
#10192 opened Nov 10, 2024 by omer-dayan
[Model] Add support for Qwen2 for embeddings (labels: documentation, ready)
#10184 opened Nov 9, 2024 by DarkLight1337
Add docs on serving with Llama Stack (label: documentation)
#10183 opened Nov 9, 2024 by terrytangyuan
[Docs] Misc updates to TPU installation instructions (label: documentation)
#10165 opened Nov 8, 2024 by mikegre-google
[Doc] Move PR template content to docs (labels: ci/build, documentation)
#10159 opened Nov 8, 2024 by russellb
Fix missing data type in flashinfer prefill
#10141 opened Nov 8, 2024 by reyoung
[Kernel] Enable HPU for Speculative Decoding
#10131 opened Nov 7, 2024 by xuechendi
[WIP] Prefix Cache Aware Scheduling [1/n]
#10128 opened Nov 7, 2024 by rickyyx
[V1][Bugfix] Propagate V1 LLMEngine properly (label: ready)
#10127 opened Nov 7, 2024 by comaniac
Fix quantization config of vl model
#10120 opened Nov 7, 2024 by jinzhen-lin