30X Slowdown from 0.3.6 to 0.3.9 #391

Open
astrologos opened this issue Feb 2, 2025 · 1 comment
Labels
more-info-needed Need more information to diagnose the problem

Comments

@astrologos

Which version of LM Studio?
LM Studio 0.3.9 Build 6

Which operating system?
Windows

What is the bug?
30X slowdown to 1 tok/sec after update from 0.3.6B8 to 0.3.9B6 on Intel B580 GPU

Logs
see attached

To Reproduce
Observe the initial speed on LM Studio 0.3.6 Build 8 with the Vulkan engine and the model DeepSeek-R1-Distill-Qwen-14B-GGUF/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf:
30.1 tok/s with full GPU offload

Image

Update from 0.3.6 Build 8 to 0.3.9 Build 6 on the same Arc B580.
Observe the initial speed on LM Studio 0.3.9 Build 6 with the Vulkan engine and the same model, DeepSeek-R1-Distill-Qwen-14B-GGUF/DeepSeek-R1-Distill-Qwen-14B-Q4_K_M.gguf:
1.2 tok/s with full GPU offload
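
For reference, the ratio between the two reported speeds can be checked with a quick sketch. The function name `slowdown_factor` is hypothetical and not part of LM Studio; the only inputs are the two tok/s figures quoted above.

```python
def slowdown_factor(before_tok_s: float, after_tok_s: float) -> float:
    """Return how many times slower the 'after' speed is than the 'before' speed.

    Hypothetical helper for illustration only; not an LM Studio API.
    """
    if before_tok_s <= 0 or after_tok_s <= 0:
        raise ValueError("speeds must be positive")
    return before_tok_s / after_tok_s


# Speeds reported in this issue: 0.3.6 Build 8 vs 0.3.9 Build 6,
# same model, same GPU, full GPU offload in both cases.
factor = slowdown_factor(30.1, 1.2)
print(f"{factor:.1f}x slower")  # prints "25.1x slower"
```

On these two measurements the regression works out to roughly 25x, the same order of magnitude as the 30X in the title.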

Problem has been recreated by a colleague using NVIDIA hardware.

What's going on?? And why can't I revert to the previous version??

main.log

@yagil
Member

yagil commented Feb 2, 2025

Thanks for the bug report.

Can you please:

  1. Share a screenshot of the screen that Ctrl + Shift + R opens
  2. Share a screenshot of the screen that Ctrl + Shift + H opens
  3. Share a screenshot of the GPU tab in Task Manager (Windows) while the model is running this slowly

Thanks

@yagil yagil added the more-info-needed Need more information to diagnose the problem label Feb 2, 2025