Latest working version koboldcpp-1.75.2 #1243

Open
investing0 opened this issue Nov 30, 2024 · 5 comments
Comments

@investing0
investing0 commented Nov 30, 2024

In 1.79, Vulkan multi-GPU does not work: the output is gibberish.
In all versions from 1.76 to 1.79, closing the terminal causes a blue screen.
Windows 11, RTX 3090 + RX 6800.

The latest working version is koboldcpp-1.75.2.

I tested each card (3090 and 6800) separately with Vulkan. Both work fine, with no gibberish and no blue screen.

I watched Task Manager: after the terminal is closed, the computer keeps working for a couple more seconds, and the blue screen occurs when unloading from the video cards begins.

@LostRuins
Owner

I have been getting BSODs myself too, and @0cc4m thinks it could be a driver problem on Windows.

It seems that in some cases the graphics memory gets corrupted or overwritten on Vulkan.

Try this: reduce the number of offloaded layers by about 30%. Does it solve the problem?
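
As a rough illustration of the 30% reduction suggested above, here is a minimal sketch. The helper name `reduced_gpu_layers` and the example count of 80 layers are made up for illustration and do not come from this thread; `--gpulayers` is the koboldcpp option that sets how many layers are offloaded.

```python
def reduced_gpu_layers(current_layers: int, reduction_percent: int = 30) -> int:
    """Return a layer count lowered by `reduction_percent` percent.

    Hypothetical helper for illustration; it is not part of koboldcpp.
    """
    if current_layers < 0:
        raise ValueError("current_layers must be non-negative")
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    # Integer arithmetic, rounded down, so the result never exceeds the target share.
    return current_layers * (100 - reduction_percent) // 100


if __name__ == "__main__":
    layers = reduced_gpu_layers(80)  # 80 is an example value, not from this thread
    print(f"--gpulayers {layers}")
```

If the blue screen disappears at the lower count, that would point toward memory pressure rather than a specific model.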

@investing0
Author

investing0 commented Dec 1, 2024

"Try this: reduce the number of offloaded layers by about 30%. Does it solve the problem?"

Yes, that is the case. Yesterday I did not notice that in 1.79 everything worked on Vulkan with both video cards when I tested the 9B GLM. Now, on the 3090 with the 32B Qwen using all of its memory, I get a blue screen again. I then tried 50 layers for Llama 3.1 70B 3km: there is no blue screen, but the gibberish in multi-GPU remains.

Also, I checked the RX 6800: with 14 GB out of 16 GB occupied (GLM4, 49K context), there is no blue screen and it works well.

@aleksusklim

With 1.79.1, I recently got blue screens twice in a row with SYSTEM_SERVICE_EXCEPTION (without any other information such as a driver name or memory address) after loading two copies of Mistral Large (Q5 with CuBLAS and 0 offloaded layers) at the same time in two instances of koboldcpp (once with drafting on an RTX 3060, and once without).
I could do this previously without problems, since I have 128 GB of RAM.

Is it worth identifying why this happens? I can't confirm that my system environment hasn't changed or that I didn't install any other software, because I most probably did.

So far, this does not happen when running only one Mistral Large (even together with Gemma 22B).

@LostRuins
Owner

Sounds like a graphics driver bug. What does the Windows crash dump say? (You can use BlueScreenView or WhoCrashed to get more info.)

I have had BSODs with Vulkan too, but not with CUDA. It generally happens during allocation or deallocation when too much memory is in use; somehow the memory gets corrupted.

Have you tried updating your Nvidia drivers?

If there is a fix, it would be up to the driver developer.

@aleksusklim

aleksusklim commented Dec 10, 2024

I tried BlueScreenView right after the first BSoD: it showed absolutely nothing, and there was no crash dump file, as if one had never been created.

I probably have to figure out why there are no crash dumps (maybe some setting somewhere prevents them from being written at all?).

Okay, I'll try to investigate further.
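
One possible starting point for that investigation is the Windows crash dump setting itself. The sketch below reads the `CrashDumpEnabled` value under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl` and prints what it means. The function names are made up for illustration, and this only checks one setting; other causes (for example a missing or too-small page file) are not covered here.

```python
import sys

# Meaning of the CrashDumpEnabled registry value on Windows.
DUMP_MODES = {
    0: "none (no dump is written)",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
    7: "automatic memory dump",
}


def describe_dump_mode(value: int) -> str:
    """Translate a CrashDumpEnabled value into a readable description."""
    return DUMP_MODES.get(value, f"unknown value {value}")


def read_dump_mode() -> int:
    """Read CrashDumpEnabled from the registry (Windows only)."""
    import winreg  # imported here because the module exists only on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return int(value)


if __name__ == "__main__":
    if sys.platform == "win32":
        print("CrashDumpEnabled:", describe_dump_mode(read_dump_mode()))
    else:
        print("This check only applies to Windows.")
```

A result of "none" would be consistent with BlueScreenView finding nothing to display.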
