Issues: vllm-project/vllm
- [Bug]: Continuous batching (OpenAI Server) with greedy search returns different results (#11658, opened Dec 31, 2024 by thangld201)
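A commonly cited mechanism for this kind of nondeterminism (not confirmed as the cause of #11658) is that continuous batching changes the shape and reduction order of GPU kernels between runs, and floating-point addition is not associative, so logits can differ in the last bits; when two logits are nearly tied, even greedy argmax can then pick a different token. A minimal sketch of the underlying float32 effect:

```python
import numpy as np

# Float32 addition is not associative: grouping the same three values
# differently changes the result, because intermediate sums round.
a, b, c = np.float32(1e8), np.float32(-1e8), np.float32(1.0)

left = (a + b) + c   # a + b is exactly 0, so this is 1.0
right = a + (b + c)  # b + c rounds back to -1e8, so this is 0.0

print(left, right)   # 1.0 0.0
```

The same effect, applied across the large reductions inside attention and matmul kernels whose tiling depends on batch composition, is why bitwise-identical logits across runs are not guaranteed even with temperature 0.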
- [Bug]: I started a qwen2vl-7b video processing service using vllm (0.6.6), but encountered an error during inference (#11657, opened Dec 31, 2024 by hyyuananran)
- [Feature]: Support in-flight quantization: load as 8-bit quantization (#11655, opened Dec 31, 2024 by ShelterWFF)
- [Bug]: Getting error 500 when requesting /v1/completions (#11654, opened Dec 31, 2024 by mamadmolla0)
- [Bug]: NotImplementedError: No operator found for memory_efficient_attention_forward (#11653, opened Dec 31, 2024 by AnthonyX1an)
- [Usage]: How to pass fps and max_pixels after starting a qwen2vl-7b service with vllm? (#11652, opened Dec 31, 2024 by hyyuananran)
- [Usage]: Multinode inference: assert value == world_size, f"Expected {world_size}, got {value}" AssertionError: Expected 16, got 1.0 (#11651, opened Dec 31, 2024 by mangomatrix)
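The "Expected 16, got 1.0" in #11651 is notable because the reported value is a float, suggesting the world-size value read back from the distributed store was both float-typed and 1 (i.e. only one process registered). A minimal sketch, using a hypothetical `check_world_size` helper to reproduce the shape of the assertion, not vLLM's actual code path:

```python
def check_world_size(value, world_size):
    # Mirrors the style of the failing assertion: a value read back from
    # a distributed store is compared against the expected world size.
    assert value == world_size, f"Expected {world_size}, got {value}"


try:
    # A float 1.0 (one registered process) against an expected 16 workers
    # reproduces the exact message seen in the issue.
    check_world_size(1.0, 16)
except AssertionError as e:
    print(e)  # Expected 16, got 1.0
```

In practice this pattern usually points at launcher misconfiguration (each node starting its own world of size 1) rather than at the comparison itself.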
- [New Model]: command-r7b (#11650, opened Dec 31, 2024 by Marcher-lam)
- [Performance]: V1 vs V0 with multi-step (#11649, opened Dec 31, 2024 by Desmond819)
- [Bug]: Using vllm==0.6.5 with GLM4-9b-chat fails with "/usr/bin/ld: cannot find -lcuda" (#11643, opened Dec 31, 2024 by Jimmy-L99)
- [Bug]: Llama-3.1-Nemotron-70B-Instruct-HF W8A8 raises ValueError: Failed to invert hessian due to numerical instability (#11641, opened Dec 31, 2024 by fan-niu)
- [Feature]: Avoid KV Cache and offload Model weights in RL workloads (#11638, opened Dec 30, 2024 by PeterSH6)
- [Usage][V1]: How to get logprobs from the V1 engine? (#11634, opened Dec 30, 2024 by CypherSavage)
- [Feature]: Can CUDA graphs be used in the prefill stage? (#11628, opened Dec 30, 2024 by MichoChan)
- [Bug]: Starting deepseek-ai/DeepSeek-V2-Lite-Chat inference with vllm 0.6.3 reports an error mentioning "deepseek-ai/DeepSeek-V2-Lite-Chat" (#11626, opened Dec 30, 2024 by CNDotaBest)
- [Installation]: Request to include vllm==0.6.3.post1 for CUDA 11.8 (#11623, opened Dec 30, 2024 by jxqhhh)
- [Usage]: Can AsyncEngineArgs load multiple LoRA modules? (#11621, opened Dec 30, 2024 by Jimmy-L99)
- [Installation]: vllm build failure on IBM ppc64le (#11616, opened Dec 30, 2024 by npanpaliya)
- [Installation]: Hitting issues while trying to build the vllm image using Dockerfile.rocm (v0.6.2) (#11615, opened Dec 30, 2024 by sakshiarora13)
- [Bug]: Nvidia DALI and vLLM (#11611, opened Dec 30, 2024 by conceptofmind)
- [Bug]: Cannot load model Qwen2-VL-72B-Instruct in vLLM (#11608, opened Dec 30, 2024 by Tian14267)
- [Feature]: Confidence score for Qwen/Qwen2-VL-7B-Instruct (#11606, opened Dec 29, 2024 by Dineshkumar-Anandan-ZS0367)
- [Bug]: AsyncEngine Backend loop is stopped (#11603, opened Dec 29, 2024 by DongZhaoXiong)
- [Bug]: Cannot run with OpenGVLab/InternVL2_5-78B-MPO-AWQ (#11601, opened Dec 29, 2024 by bltcn)