Issues: vllm-project/vllm
- [Bug]: Continuous batching (OpenAI Server) with greedy search returns different results (#11658, opened Dec 31, 2024 by thangld201)
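A commonly cited mechanism for this kind of nondeterminism (not confirmed as the cause of #11658) is that continuous batching changes the shape and reduction order of GPU kernels between runs, and floating-point addition is not associative, so logits can differ in the last bits; when two logits are nearly tied, even greedy argmax can then pick a different token. A minimal sketch of the underlying float32 effect:

```python
import numpy as np

# Float32 addition is not associative: grouping the same three values
# differently changes the result, because intermediate sums round.
a, b, c = np.float32(1e8), np.float32(-1e8), np.float32(1.0)

left = (a + b) + c   # a + b is exactly 0, so this is 1.0
right = a + (b + c)  # b + c rounds back to -1e8, so this is 0.0

print(left, right)   # 1.0 0.0
```

The same effect, applied across the large reductions inside attention and matmul kernels whose tiling depends on batch composition, is why bitwise-identical logits across runs are not guaranteed even with temperature 0.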
- [Bug]: I started a qwen2vl-7b video processing service using vllm (0.6.6), but encountered an error during inference (#11657, opened Dec 31, 2024 by hyyuananran)
- [Feature]: Support in-flight quantization: load as 8-bit quantization (#11655, opened Dec 31, 2024 by ShelterWFF)
- [Bug]: Getting error 500 when requesting /v1/completions (#11654, opened Dec 31, 2024 by mamadmolla0)
- [Bug]: NotImplementedError: No operator found for memory_efficient_attention_forward (#11653, opened Dec 31, 2024 by AnthonyX1an)
- [Usage]: How to pass fps and max_pixels after starting a qwen2vl-7b service with vllm? (#11652, opened Dec 31, 2024 by hyyuananran)
- [Usage]: Multinode inference: assert value == world_size, f"Expected {world_size}, got {value}" AssertionError: Expected 16, got 1.0 (#11651, opened Dec 31, 2024 by mangomatrix)
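The "Expected 16, got 1.0" in #11651 is notable because the reported value is a float, suggesting the world-size value read back from the distributed store was both float-typed and 1 (i.e. only one process registered). A minimal sketch, using a hypothetical `check_world_size` helper to reproduce the shape of the assertion, not vLLM's actual code path:

```python
def check_world_size(value, world_size):
    # Mirrors the style of the failing assertion: a value read back from
    # a distributed store is compared against the expected world size.
    assert value == world_size, f"Expected {world_size}, got {value}"


try:
    # A float 1.0 (one registered process) against an expected 16 workers
    # reproduces the exact message seen in the issue.
    check_world_size(1.0, 16)
except AssertionError as e:
    print(e)  # Expected 16, got 1.0
```

In practice this pattern usually points at launcher misconfiguration (each node starting its own world of size 1) rather than at the comparison itself.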
- [New Model]: command-r7b (#11650, opened Dec 31, 2024 by Marcher-lam)
- [Performance]: V1 vs V0 with multi-step (#11649, opened Dec 31, 2024 by Desmond819)
- [Bug]: Using vllm==0.6.5 with GLM4-9b-chat fails with "/usr/bin/ld: cannot find -lcuda" (#11643, opened Dec 31, 2024 by Jimmy-L99)
- [Bug]: Llama-3.1-Nemotron-70B-Instruct-HF W8A8 raises ValueError: Failed to invert hessian due to numerical instability (#11641, opened Dec 31, 2024 by fan-niu)
- [Feature]: Avoid KV Cache and offload Model weights in RL workloads (#11638, opened Dec 30, 2024 by PeterSH6)
- [Usage][V1]: How to get logprobs from the V1 engine? (#11634, opened Dec 30, 2024 by CypherSavage)
- [Feature]: Can CUDA graphs be used in the prefill stage? (#11628, opened Dec 30, 2024 by MichoChan)
- [Bug]: Starting deepseek-ai/DeepSeek-V2-Lite-Chat inference with vllm 0.6.3 reports an error mentioning "deepseek-ai/DeepSeek-V2-Lite-Chat" (#11626, opened Dec 30, 2024 by CNDotaBest)
- [Installation]: Request to include vllm==0.6.3.post1 for CUDA 11.8 (#11623, opened Dec 30, 2024 by jxqhhh)
- [Usage]: Can AsyncEngineArgs load multiple LoRA modules? (#11621, opened Dec 30, 2024 by Jimmy-L99)
- [Installation]: vllm build failure on IBM ppc64le (#11616, opened Dec 30, 2024 by npanpaliya)
- [Installation]: Hitting issues while trying to build the vllm image using Dockerfile.rocm (v0.6.2) (#11615, opened Dec 30, 2024 by sakshiarora13)
- [Bug]: Nvidia DALI and vLLM (#11611, opened Dec 30, 2024 by conceptofmind)
- [Bug]: Cannot load model Qwen2-VL-72B-Instruct in vLLM (#11608, opened Dec 30, 2024 by Tian14267)
- [Feature]: Confidence score for Qwen/Qwen2-VL-7B-Instruct (#11606, opened Dec 29, 2024 by Dineshkumar-Anandan-ZS0367)
- [Bug]: AsyncEngine Backend loop is stopped (#11603, opened Dec 29, 2024 by DongZhaoXiong)
- [Bug]: Cannot run with OpenGVLab/InternVL2_5-78B-MPO-AWQ (#11601, opened Dec 29, 2024 by bltcn)