Error occurs when loading the gemma model in bitsandbytes format. #2557

Merged: 3 commits, Dec 26, 2024

Conversation

upskyy (Contributor) commented on Dec 23, 2024

Motivation

#2556

Modifications

I referred to vLLM to fix the KeyError that occurs when loading models in bitsandbytes format.

The error occurs when loading the gemma model with the command below:

python3 -m sglang.launch_server --model-path /models --tokenizer-path /models --port 30000 --tokenizer-mode auto --dtype bfloat16 --mem-fraction-static 0.5 --random-seed 0 --enable-torch-compile --disable-cuda-graph --schedule-conservativeness 1.3 --kv-cache-dtype fp8_e5m2 --load-format bitsandbytes --quantization bitsandbytes

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

merrymercy merged commit 08effbf into sgl-project:main on Dec 26, 2024
15 checks passed
merrymercy (Contributor) commented:

@upskyy Thanks for the contribution. It is merged. Can you also help us fix this for the llama model? #2600

upskyy (Contributor, Author) commented on Dec 27, 2024

@merrymercy
Looking at the code, I think that issue will also be resolved by PR #2557. The problem occurred because weight names are changed to qweight during the bitsandbytes 4-bit load, and PR #2557 corrects this not only for the gemma model but also for loading other models.
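
For context, here is a minimal sketch of the kind of weight-name remapping involved. This is not the actual sglang or vLLM loader code; the helper names (`remap_bnb_weight_name`, `load_bnb_weights`) and the assumption that bitsandbytes 4-bit checkpoints store linear weights under a `.qweight` suffix are illustrative only.

```python
# Illustrative sketch only -- NOT the actual sglang/vLLM loader code.
# Assumption: bitsandbytes 4-bit checkpoints expose linear weights under a
# ".qweight" suffix, while the model's parameter dict uses ".weight".

def remap_bnb_weight_name(name: str) -> str:
    """Map a checkpoint weight name back to the model's parameter name."""
    suffix = ".qweight"
    if name.endswith(suffix):
        return name[: -len(suffix)] + ".weight"
    return name


def load_bnb_weights(model_params: dict, checkpoint_weights: dict) -> None:
    """Copy quantized tensors into the model, remapping names first.

    Without the remapping, looking up a name ending in ".qweight" in
    model_params raises a KeyError, matching the failure reported in #2556.
    """
    for raw_name, tensor in checkpoint_weights.items():
        param_name = remap_bnb_weight_name(raw_name)
        if param_name not in model_params:
            # Skip weights with no counterpart in the model (e.g. tied embeddings).
            continue
        model_params[param_name] = tensor
```

Because the remapping is applied generically at load time rather than per model, the same fix covers gemma and other architectures loaded in bitsandbytes format.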
