Zed doesn't work with NVIDIA Optimus on Linux out of the box #22900

Closed
1 task done
Vanuan opened this issue Jan 9, 2025 · 16 comments
Labels
bug [core label] linux-x11 Linux X11

Comments

@Vanuan

Vanuan commented Jan 9, 2025

Check for existing issues

  • Completed

Describe the bug / provide steps to reproduce it

  • I downloaded Zed on my Linux system (Ubuntu 22.04 LTS) with an NVIDIA Optimus setup (XPS 15 9570 with GTX 1050 Ti).
  • Upon launching Zed, the editor doesn't render anything; it just shows a corrupted screen output.

Zed Version and System Specs

Zed: v0.169.0 (Zed Preview)
OS: Linux X11 ubuntu 22.04
Memory: 31 GiB
Architecture: x86_64
GPU: NVIDIA GeForce GTX 1050 Ti with Max-Q Design || NVIDIA || 535.183.01

If applicable, add screenshots or screencasts of the incorrect state / behavior

image

If applicable, attach your Zed.log file to this issue.

Zed.log

@Vanuan Vanuan added admin read Pending admin review bug [core label] triage Maintainer needs to classify the issue labels Jan 9, 2025
@Vanuan
Author

Vanuan commented Jan 9, 2025

Expected behavior:
Zed should automatically detect and use the NVIDIA GPU for rendering, or at least inform the user if the GPU configuration is not supported.
Actual behavior:
There's no automatic GPU selection, leading to rendering issues.

Workaround:
Running Zed with the following command forces the use of the NVIDIA GPU and resolves the issue:
> __NV_PRIME_RENDER_OFFLOAD=1 zed
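
A minimal sketch of wrapping that workaround so it does not have to be retyped. The function name `run_offloaded` is hypothetical; only the `__NV_PRIME_RENDER_OFFLOAD=1` variable comes from the workaround above.

```shell
#!/bin/sh
# Hypothetical helper: run any command with NVIDIA PRIME render offload
# requested through the environment variable from the workaround.
run_offloaded() {
    # `env` sets the variable only for the child process, so the
    # calling shell's environment is left unchanged.
    env __NV_PRIME_RENDER_OFFLOAD=1 "$@"
}

# Example (not executed here): run_offloaded zed
```

Saved as a script, something like this could presumably be referenced from a desktop shortcut instead of editing the variable into the shortcut by hand.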

Additional context:
Zed is designed to leverage GPU capabilities for rendering, using the Vulkan API for performance enhancements. However, this creates a liability where Zed fails to handle unsupported GPU setups gracefully.

Suggestions for improvement:
Implement automatic GPU detection and selection, similar to how Steam handles games on Optimus systems.
Provide clear feedback to users if their GPU setup isn't supported, explaining the need for Vulkan and perhaps offering guidance on workarounds.

Feedback:
While I understand Zed's aim for high performance (mentioning 120 FPS for smooth operation), for a text editor, this seems excessive. It's clear this approach is perhaps more about developing a custom GUI and potentially a broader cross-platform framework than just enhancing text editing speed.

The complexity introduced by managing all aspects of rendering (X, Wayland, Vulkan, etc.) might be overkill for what is essentially a tool for editing text. Electron/Chromium might be less performant but simplifies deployment and development significantly on Linux, especially given the challenges with AppImage and desktop environment integration.

Side note:
I'd like to provide this feedback here since platforms like Twitter are too restrictive for detailed commentary, and setting up a personal blog just to discuss Zed seems disproportionate. If you know a better place to provide it, I'd gladly move it there. There's no feedback section in discussions. Maybe Reddit?

@jansol jansol added linux-x11 Linux X11 and removed triage Maintainer needs to classify the issue labels Jan 9, 2025
@jansol
Contributor

jansol commented Jan 9, 2025

Zed should automatically detect and use the NVIDIA GPU for rendering

It specifically does not do that because doing so burns through laptop batteries in no time. Instead it prefers the integrated GPU if there is one.

Unfortunately, Optimus setups that involve an NVIDIA GPU are frequently broken in ways that make it impossible to display anything from the integrated GPU. #22409 tries to work around this.

@Vanuan
Author

Vanuan commented Jan 9, 2025

It specifically does not do that because doing so burns through laptop batteries in no time. Instead it prefers the integrated GPU if there is one.

Well, some UI is better than no UI. At least, it could detect that there are multiple GPUs on the first start and offer a choice with a "never ask again" option. Currently, the user needs to modify desktop shortcuts to include the environment variable, which is a suboptimal user experience.

I mean, if the first interaction with Zed is a frozen window, it seriously undermines trust in anything Zed offers. At the very least, a message like "can't render anything in an Optimus environment, try this workaround" would go a long way.

If you're worried about power consumption, you could display a widget somewhere in the status bar showing which graphics card is in use and how power-hungry it is.

#22409 tries to work around this.

Is this included in v0.169.0 (Zed Preview)?

@Vanuan
Author

Vanuan commented Jan 9, 2025

Found another, simpler workaround

image

Basically, it's a shortcut for switcheroo-control integrated into GNOME:
https://gitlab.freedesktop.org/hadess/switcheroo-control/

Which is a much better tool to manage dual-GPU than just __NV_PRIME_RENDER_OFFLOAD=1
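
For the command line, a rough sketch of the same idea. It assumes that switcheroo-control ships a `switcherooctl` CLI with a `launch` subcommand that, with no GPU given, picks the first non-default GPU; please check `switcherooctl --help` on your system before relying on that. The function name `offload_command` is hypothetical, and it only prints the command it would use rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the command line that would start a program
# on the discrete GPU. Prefers switcherooctl when it is installed
# (assumed syntax: `switcherooctl launch COMMAND`), otherwise falls back
# to the environment-variable workaround.
offload_command() {
    if command -v switcherooctl >/dev/null 2>&1; then
        printf 'switcherooctl launch %s\n' "$*"
    else
        printf 'env __NV_PRIME_RENDER_OFFLOAD=1 %s\n' "$*"
    fi
}

# Example (prints a command, does not launch anything): offload_command zed
```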

Some useful high-level context:

Managing dual GPU laptops with NVIDIA Optimus on Linux has evolved significantly over the years. Here's a concise overview of the key developments:

Early Challenges and Bumblebee Project:

Initially, Linux lacked native support for NVIDIA Optimus technology, which allows seamless switching between integrated and discrete GPUs to balance performance and power consumption. To address this, the open-source Bumblebee project was introduced, enabling users to manually run applications on the discrete NVIDIA GPU using commands like optirun or primusrun. While Bumblebee provided a workaround, it required manual intervention and did not support automatic GPU switching.

vga_switcheroo:

Introduced in Linux kernel version 2.6.34 (released on May 16, 2010), vga_switcheroo is a kernel subsystem designed for laptop hybrid graphics. It allows users to switch between integrated and discrete GPUs, particularly in systems with a multiplexer (mux) that directs outputs between GPUs. However, switching GPUs with vga_switcheroo typically required restarting the X Window System, making it less convenient for dynamic GPU management.

NVIDIA PRIME and PRIME Render Offload:

In May 2013, NVIDIA introduced initial support for Optimus on Linux with driver version 319.17, allowing offloading of rendering to the discrete GPU. This was further enhanced in August 2019 with driver version 435.17, introducing "PRIME Render Offload." PRIME render offload is the ability to have an X screen rendered by one GPU, but choose certain applications within that X screen to be rendered on a different GPU. The GPU rendering the majority of the X screen is known as the "sink", and the GPU to which certain application rendering is "offloaded" is known as the "source".

Switcheroo Control:

Switcheroo Control is a user-space D-Bus service that integrates with desktop environments like GNOME. It provides a more accessible interface for managing GPU switching, allowing users to choose which GPU to use for specific applications without manual commands, enhancing the user experience.

Current Status of Automatic GPU Switching:

As of January 2025, automatic GPU switching—where the system seamlessly transitions between integrated and discrete GPUs based on workload without user intervention—is not fully supported on Linux. Users must manually select the appropriate GPU for their tasks, which can be less convenient compared to the seamless GPU switching available on other operating systems. Tools like EnvyControl offer a command-line interface to easily switch between GPU modes on NVIDIA Optimus systems, but they still require user input.

While automatic GPU switching isn't fully supported on Linux, you can manually select the NVIDIA GPU for specific applications. When in use, the NVIDIA GPU can adaptively manage its power consumption based on the workload, balancing performance and energy efficiency. This is called the NVIDIA On-Demand setting.

In summary, while significant progress has been made in supporting hybrid GPU systems on Linux—from the early days of Bumblebee to the introduction of PRIME Render Offload and user-space tools like Switcheroo Control—automatic GPU switching remains an area for future development. Users currently need to manually manage GPU selection to optimize performance and power consumption.

Sources: https://wiki.archlinux.org/title/Bumblebee https://help.ubuntu.com/community/HybridGraphics https://en.wikipedia.org/wiki/Nvidia_Optimus

@ConradIrwin
Member

@Vanuan Would you be able to add a section to https://github.com/zed-industries/zed/blob/main/docs/src/linux.md#troubleshooting to describe the problem and the workaround for others that fall into this camp?

I'm also open to adding feature detection as we do for the software-renderer if this particular broken combination is detectable.

@Vanuan
Author

Vanuan commented Jan 22, 2025

@Vanuan
Author

Vanuan commented Jan 22, 2025

Apparently, after upgrading to 170 I no longer need the workaround. Or maybe it was a graphics driver upgrade? I'm not sure. Previously I was using NVIDIA 545 and recently upgraded to 565.

@Vanuan
Author

Vanuan commented Jan 22, 2025

Ok. Correction. I can still reproduce the issue. But if I already have Zed open with the workaround, running zed opens a new window. Apparently, it somehow detects the already open instance and signals it to open a new window there?

@ConradIrwin
Member

Exactly right.

Does vkcube work for you on the integrated graphics?

@Vanuan
Author

Vanuan commented Jan 22, 2025

@ConradIrwin

Does vkcube work for you on the integrated graphics?

Looks like it doesn't:

Selected GPU 2: NVIDIA GeForce GTX 1050 Ti with Max-Q Design, type: 2

Even if I try to force it:

~$ MESA_VK_DEVICE_SELECT=list vkcube
selectable devices:
  GPU 0: 10005:0 "llvmpipe (LLVM 15.0.7, 256 bits)" CPU
  GPU 1: 8086:3e9b "Intel(R) UHD Graphics 630 (CFL GT2)" integrated GPU
  GPU 2: 10de:1c8c "NVIDIA GeForce GTX 1050 Ti with Max-Q Design" discrete GPU
~$ MESA_VK_DEVICE_SELECT=8086:3e9b vkcube
Selected GPU 2: NVIDIA GeForce GTX 1050 Ti with Max-Q Design, type: 2

That being said, I have the PRIME config set to performance mode. Could that be a factor? It looks like vkcube just ignores MESA_VK_DEVICE_SELECT? Or maybe it somehow selects devices by the features they support?
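
A small sketch for pulling a vendor:device ID out of a listing in the format shown above, so it does not have to be copied by hand. The function name `pick_device_id` is hypothetical, and the parsing assumes the exact line layout from the output above. It only extracts the ID; whether the application then honors it is a separate question, as the vkcube run above shows.

```shell
#!/bin/sh
# Hypothetical helper: read a "selectable devices" listing on stdin and
# print the vendor:device ID of the first entry of the given type.
# $1 is the type as printed at the end of each line, for example
# "integrated GPU", "discrete GPU" or "CPU".
pick_device_id() {
    sed -n 's/^ *GPU [0-9][0-9]*: \([0-9a-f]*:[0-9a-f]*\) ".*" '"$1"'$/\1/p' | head -n 1
}

# Example (not executed here; 2>&1 is an assumption in case the listing
# is written to stderr):
#   MESA_VK_DEVICE_SELECT=list vkcube 2>&1 | pick_device_id 'integrated GPU'
```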

@ConradIrwin
Member

Thanks for confirming!

I'm going to close this issue for now. To avoid an infinite amount of time debugging Linux graphics setups, our rule of thumb is that if vkcube is working and Zed is not, we'll spend more time looking.

We should merge the docs change to help other people with the same brokenness as you figure it out though.

@Vanuan
Author

Vanuan commented Jan 22, 2025

@ConradIrwin
It seems I found what the issue is. I've just changed the prime mode to intel, and vkcube now selects the integrated graphics:

$ sudo prime-select intel
# reboot
~$ prime-select query
intel
~$ vkcube
Selected GPU 0: Intel(R) UHD Graphics 630 (CFL GT2), type: 1

zed also works without any issues.

So we have now confirmed that the bug is in how Zed selects the GPU. It ignores the NVIDIA PRIME setting and selects the Intel GPU, which is disabled in performance mode. Why doesn't it use the default Vulkan selection behaviour?
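
To make the observations so far easier to act on, here is a rough sketch that maps a `prime-select query` result to a launch hint. The function name `launch_hint` is hypothetical, and the hints only restate what was observed on this one machine; they are not a general rule.

```shell
#!/bin/sh
# Hypothetical helper: turn a prime-select mode name into a hint for
# launching Zed, based only on the behaviour reported in this thread.
launch_hint() {
    case "$1" in
        nvidia)
            echo "Intel GPU appears disabled in this mode; try: __NV_PRIME_RENDER_OFFLOAD=1 zed" ;;
        intel)
            echo "Integrated GPU in use; plain zed worked in this mode" ;;
        on-demand)
            echo "Not tested in this thread" ;;
        *)
            echo "Unknown prime-select mode: $1" ;;
    esac
}

# Example (not executed here): launch_hint "$(prime-select query)"
```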

@Vanuan
Author

Vanuan commented Jan 22, 2025

Interestingly the resume from suspend also works without issues when using intel graphics.

@Vanuan
Author

Vanuan commented Jan 22, 2025

I'm going to close this issue for now. To avoid an infinite amount of time debugging Linux graphics setups, our rule of thumb is that if vkcube is working and Zed is not, we'll spend more time looking.

Huh? Shouldn't you keep the issue open according to that logic? What do you mean by "spend more time looking"? Zed clearly has a bug here.

You might have misinterpreted what I'm saying. vkcube works by default without any issues. Zed, on the other hand, requires forcing the NVIDIA graphics to be selected. So vkcube is working, and Zed is not.

@ConradIrwin
Member

ConradIrwin commented Jan 22, 2025

Ok, so sounds like there are three possible states you can be in:

  • "initial" where vkcube and zed are both broken
  • "fixed" where __NV_PRIME_RENDER_OFFLOAD=1 is exported and both vkcube and zed work (though crash on resume from sleep)
  • "mixed" where you've run sudo prime-select intel. In this third state vkcube works and zed does not (and vkcube does not crash on resume from sleep?)

It'd be helpful if you could add some logging to the GPU selection process on our side to see why it might diverge from vkcube so we can be sure we're always selecting the GPU you expect.

@Vanuan
Author

Vanuan commented Jan 22, 2025

Well. It's more complicated than that. Let me think how to unpack it.

First of all, there's only one 100% reliable way to force the selection, which I described here:
VK_ICD_FILENAMES.

This makes the apps blissfully unaware that there are any other GPUs to select from:

~$ MESA_VK_DEVICE_SELECT=list VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/intel_icd.x86_64.json vkcube
selectable devices:
  GPU 0: 8086:3e9b "Intel(R) UHD Graphics 630 (CFL GT2)" integrated GPU

~$ MESA_VK_DEVICE_SELECT=list VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json vkcube
selectable devices:
  GPU 0: 10de:1c8c "NVIDIA GeForce GTX 1050 Ti" discrete GPU

~$ MESA_VK_DEVICE_SELECT=list VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json vkcube
selectable devices:
  GPU 0: 10005:0 "llvmpipe (LLVM 15.0.7, 256 bits)" CPU

Let's test what we see. Let's exclude forcing llvmpipe (software emulation) for simplicity.
Here's a table:

| prime-select mode | VK_ICD_FILENAMES setting | vkcube behavior | Zed behavior |
| --- | --- | --- | --- |
| intel | intel_icd.x86_64.json | ✔️ Works: Uses Intel GPU. Logs: Selected GPU 0: Intel(R) UHD Graphics 630 (CFL GT2), type: 1. | ✔️ Works: Uses Intel GPU. Logs: Adapter: "Intel(R) UHD Graphics 630 (CFL GT2)". Successfully initializes Vulkan. |
| intel | nvidia_icd.json | ✔️ Works: Uses NVIDIA GPU. Logs: Selected GPU 2: NVIDIA GeForce GTX 1050 Ti, type: 2. | ✔️ Works: Uses NVIDIA GPU. Logs: Adapter: "NVIDIA GeForce GTX 1050 Ti". Successfully initializes Vulkan. |
| nvidia (Performance) | intel_icd.x86_64.json | ⚫ Black screen: Intel GPU disabled. Logs: No output (black screen). | ❌ Fails: Intel GPU disabled. Logs: No rendering (Intel GPU disabled). |
| nvidia (Performance) | nvidia_icd.json | ✔️ Works: Uses NVIDIA GPU. Logs: Selected GPU 2: NVIDIA GeForce GTX 1050 Ti, type: 2. | ✔️ Works: Uses NVIDIA GPU. Crashes on suspend/resume. Logs: Adapter: "NVIDIA GeForce GTX 1050 Ti". Crashes on suspend/resume. |
| on-demand | intel_icd.x86_64.json | ✔️ Works: Forces Intel GPU. Logs: Selected GPU 0: Intel(R) UHD Graphics 630 (CFL GT2), type: 1. | TBD (likely ✔️ works if Intel GPU is accessible). Logs: Untested. |
| on-demand | nvidia_icd.json | ✔️ Works: Forces NVIDIA GPU. Logs: Selected GPU 2: NVIDIA GeForce GTX 1050 Ti, type: 2. | TBD (likely ✔️ works if NVIDIA GPU is accessible). Logs: Untested. |

Let me test all these configurations first. This would be our source of truth.
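
A rough sketch for generating the per-driver test commands instead of typing each one. The function name `icd_matrix` is hypothetical; the default directory is the one used in the commands above, and the function only prints the commands so each can be run and checked by hand under every prime-select mode.

```shell
#!/bin/sh
# Hypothetical helper: print one vkcube test command per Vulkan ICD
# manifest found in a directory (default: /usr/share/vulkan/icd.d).
icd_matrix() {
    dir=${1:-/usr/share/vulkan/icd.d}
    for icd in "$dir"/*.json; do
        # Skip the unexpanded pattern when the directory has no manifests.
        [ -e "$icd" ] || continue
        printf 'MESA_VK_DEVICE_SELECT=list VK_ICD_FILENAMES=%s vkcube\n' "$icd"
    done
}

# Example (prints commands only): icd_matrix
```

Replacing `vkcube` with `zed` in the printed lines would give the matching Zed column for the same matrix.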
