vllm-project / vllm Public

Pull requests: vllm-project/vllm

441 Open 4,955 Closed

Pull requests list

[Core] Rank-to-device mapping env var

#11662 opened Dec 31, 2024 by BKitor

[VLM] Add max-count checking in data parser for single image models

Labels: documentation

#11661 opened Dec 31, 2024 by DarkLight1337

[Bugfix][Refactor] Unify model management in frontend

Labels: frontend

#11660 opened Dec 31, 2024 by joerunde

[V1] Simplify Shutdown

Labels: frontend, ready

#11659 opened Dec 31, 2024 by robertgshaw2-neuralmagic

log GPU blocks num for MultiprocExecutor

#11656 opened Dec 31, 2024 by WangErXiao

[XPU] Make pp group initilized for pipeline-parallelism

#11648 opened Dec 31, 2024 by ys950902

[Doc] [1/N] Reorganize Getting Started section

Labels: documentation

#11645 opened Dec 31, 2024 by DarkLight1337

[Docs] reorganize sponsorship page

Labels: documentation

#11639 opened Dec 30, 2024 by simon-mo

[V1] [7/N] API Server: Multiprocessing Detokenizer [ DO NOT MERGE ]

#11636 opened Dec 30, 2024 by robertgshaw2-neuralmagic

[V1] Implement Cascade Attention

Labels: ci/build

#11635 opened Dec 30, 2024 by WoosukKwon

[Kernel] Support MulAndSilu

#11624 opened Dec 30, 2024 by jeejeelee

[Quantization/Parameter] WIP: Replace parameter subclasses with raw nn.Parameter with additional attributes

#11622 opened Dec 30, 2024 by cennn

[torch.compile] consider relevant code in compilation cache

#11614 opened Dec 30, 2024 by youkaichao

[Do Not Merge] - LoRA V1 Reference PR

Labels: needs-rebase

#11613 opened Dec 30, 2024 by varun-sundar-rabindranath • Draft

Bump helm/kind-action from 1.10.0 to 1.12.0

Labels: ci/build, dependencies, github_actions

#11612 opened Dec 30, 2024 by dependabot bot

[platform] Allow platform specify attention backend

#11609 opened Dec 30, 2024 by wangxiyuan

[V1] 7/N API Server: Update LM-Eval To Use Streaming

Labels: ci/build, ready

#11590 opened Dec 28, 2024 by robertgshaw2-neuralmagic

[Kernel] Triton Configs for Fp8 Block Quantization

#11589 opened Dec 28, 2024 by robertgshaw2-neuralmagic

[Bugfix] Reduce prefix prefill block size for Pascal

#11584 opened Dec 28, 2024 by sasha0552

[misc] Add LoRA kernel micro benchmarks

#11579 opened Dec 28, 2024 by varun-sundar-rabindranath

[WIP][Model] Initialize support for Deepseek-VL2 models

Labels: documentation, frontend, needs-rebase

#11578 opened Dec 28, 2024 by Isotr0py • Draft

[Misc] Minimum requirements for SageMaker compatibility

Labels: ci/build, frontend

#11576 opened Dec 28, 2024 by nathan-az

[Draft] Update benchmark_moe.py to use block wise quant for Deepseek V3

#11574 opened Dec 28, 2024 by simon-mo • Draft

[Frontend] Improve Error Handling

Labels: documentation, frontend, needs-rebase

#11570 opened Dec 27, 2024 by robertgshaw2-neuralmagic

[Model] LoRA with lm_head and embed_tokens fully trained - 3

#11558 opened Dec 27, 2024 by sergeykochetkov
