Pull requests: vllm-project/vllm
[VLM] Add max-count checking in data parser for single image models
documentation
Improvements or additions to documentation
#11661
opened Dec 31, 2024 by
DarkLight1337
[Bugfix][Refactor] Unify model management in frontend
frontend
#11660
opened Dec 31, 2024 by
joerunde
[V1] Simplify Shutdown
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#11659
opened Dec 31, 2024 by
robertgshaw2-neuralmagic
[XPU] Make pp group initialized for pipeline-parallelism
#11648
opened Dec 31, 2024 by
ys950902
[Doc] [1/N] Reorganize Getting Started section
documentation
#11645
opened Dec 31, 2024 by
DarkLight1337
[Docs] reorganize sponsorship page
documentation
#11639
opened Dec 30, 2024 by
simon-mo
[V1] [7/N] API Server: Multiprocessing Detokenizer [ DO NOT MERGE ]
#11636
opened Dec 30, 2024 by
robertgshaw2-neuralmagic
[Quantization/Parameter] WIP: Replace parameter subclasses with raw nn.Parameter with additional attributes
#11622
opened Dec 30, 2024 by
cennn
[torch.compile] consider relevant code in compilation cache
#11614
opened Dec 30, 2024 by
youkaichao
[Do Not Merge] - LoRA V1 Reference PR
needs-rebase
#11613
opened Dec 30, 2024 by
varun-sundar-rabindranath
Draft
Bump helm/kind-action from 1.10.0 to 1.12.0
ci/build
dependencies
Pull requests that update a dependency file
github_actions
Pull requests that update GitHub Actions code
#11612
opened Dec 30, 2024 by
dependabot (bot)
[platform] Allow platform specify attention backend
#11609
opened Dec 30, 2024 by
wangxiyuan
[V1] 7/N API Server: Update LM-Eval To Use Streaming
ci/build
ready
#11590
opened Dec 28, 2024 by
robertgshaw2-neuralmagic
[Kernel] Triton Configs for Fp8 Block Quantization
#11589
opened Dec 28, 2024 by
robertgshaw2-neuralmagic
[Bugfix] Reduce prefix prefill block size for Pascal
#11584
opened Dec 28, 2024 by
sasha0552
[misc] Add LoRA kernel micro benchmarks
#11579
opened Dec 28, 2024 by
varun-sundar-rabindranath
[WIP][Model] Initialize support for Deepseek-VL2 models
documentation
frontend
needs-rebase
[Frontend] Improve Error Handling
documentation
frontend
needs-rebase
#11570
opened Dec 27, 2024 by
robertgshaw2-neuralmagic
[Model] LoRA with lm_head and embed_tokens fully trained - 3
#11558
opened Dec 27, 2024 by
sergeykochetkov