WebAssembly multithreading tracking issue #4078
Comments
For the ECS, there are two relevant bits: |
Another thing to be aware of is that WebAssembly memory objects that are shared cannot be resized. They must declare an "initial" and "maximum" size. There's good discussion of some of the quirks that introduces in this thread: WebAssembly/design#1397 |
Thanks for the heads up! Oh boy is that ever a can of worms; I think I’ll defer the wasm memory stuff, keeping this as only a wasm multithreading MVP, and hopefully let someone else come up with a strategy for dealing with that. I will definitely add it to #4279, as this is a pretty big thing to keep in mind and investigate as we work on bevy’s Web UX story. |
Since SharedArrayBuffer requires some cors headers, I made a replacement for basic-http-server that allows setting headers. Might find it useful here for testing and running the examples when work on this resumes. Example with the headers needed for SharedArrayBuffer:
|
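For reference, the two response headers in question are the ones passed to devserver later in this thread. Below is a minimal sketch of a check a test server could run to confirm both are present. The function names and the (name, value) slice representation are assumptions made up for illustration, not part of any server mentioned here.

```rust
/// Looks up a header by name, ignoring ASCII case (header names are case-insensitive).
/// `headers` is a hypothetical list of (name, value) pairs.
fn header_value<'a>(headers: &[(&'a str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| v.trim())
}

/// Returns true when both headers carry the values used in this thread:
/// `Cross-Origin-Opener-Policy: same-origin` and
/// `Cross-Origin-Embedder-Policy: require-corp`.
pub fn enables_shared_array_buffer(headers: &[(&str, &str)]) -> bool {
    let coop = header_value(headers, "Cross-Origin-Opener-Policy") == Some("same-origin");
    let coep = header_value(headers, "Cross-Origin-Embedder-Policy") == Some("require-corp");
    coop && coep
}
```

If either header is missing, the browser will not treat the page as cross-origin isolated and SharedArrayBuffer stays unavailable.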
This is really nice, thanks for sharing it here! |
What's the state on this? Has there been progress since it was put on hold half a year ago? |
No real progress. No one is too motivated to do anything, since the memory model for shared array buffer is going to make it very hard to work with the ecs. |
Reading through the discussion from Unity devs linked above, it seems that the issue is mainly a blocker for mobile devices. Also hopeful is that the issue had some progress only 2 weeks ago, with what, from my limited understanding, seems to be some kind of wasm equivalent of 'free()'. This means it should be possible to resize threads? From how I am reading it, there really isn't much blocking multithreaded wasm on desktop? |
I spent the last few hours and wrote some very sloppy code that shows a few key areas Bevy needs to change to get Wasm multithreading support: kettle11@c8c2eb5 The code spawns web workers instead of threads and appears to almost work. Sometimes it will run for a few moments with multiple threads and successfully not crash! These issues prevent it from fully working:
I'm sure there are other issues once those two are resolved. That said, in my opinion there's no hard technical barrier preventing Bevy from being multi-threaded on web. Snippets of the above code were taken from this blog-post: https://www.tweag.io/blog/2022-11-24-wasm-threads-and-messages/ |
The second issue that you found could be worked around by running Bevy on a
web worker itself, using OffscreenCanvas for the rendering and using
postMessage to forward keyboard and mouse events from the main event loop.
|
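A rough sketch of the shape of that workaround, written against native std so it can be run anywhere. The channel here is only a stand-in for postMessage, and `InputEvent`, `forward_event` and `run_worker` are hypothetical names, not Bevy or winit APIs.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Hypothetical, simplified input events the main thread would forward.
#[derive(Debug, Clone, PartialEq)]
pub enum InputEvent {
    KeyDown(String),
    MouseMove { x: f32, y: f32 },
}

/// Main-thread side: hand the event over without ever blocking.
/// Returns false if the worker has already shut down.
pub fn forward_event(tx: &Sender<InputEvent>, event: InputEvent) -> bool {
    tx.send(event).is_ok()
}

/// Worker side: blocking on the channel is fine here because this is
/// not the main thread. Collects events until every sender is dropped.
pub fn run_worker(rx: Receiver<InputEvent>) -> Vec<InputEvent> {
    rx.iter().collect()
}
```

In a browser the worker would own the OffscreenCanvas and the app, and the main thread would do nothing but forward events like this.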
Would adding that overhead to input be noticeable? |
There is a good write up here about the performance of It depends on the size of the payload, but it seems anything up to 10kb would take less than a millisecond. |
I assume that less than a millisecond is acceptable for input latency, but yeah, it's another addition to latency. Input latency today is horrible compared to the days before USB, so percentage- and usage-wise I doubt it will be very noticeable at all :) |
I tried to put the entire bevy app in a web worker, to try and solve the issue with async-executor. But then I ran into issues due to what I guess are Winit trying to things to So maybe trying to put async-executor into a web worker could be a feasible next solution. Also, since there are some server headers required for web worker functionality, as in @kettle11's run script, and I could not get his devserver to work (some kind of dependency issue), I also rolled my own bloated devserv based on rocket. It can be found here: https://github.com/TotalKrill/devserv |
Rebased @kettle11's work to see if anything had changed with all the changes in bevy main. It doesn't crash, but it doesn't do anything after initializing all the workers either. I could not see anything in the logs either... Here's the rebased branch if anyone is curious: |
I agree with #4078 (comment). I don't know whether resizing shared memory was truly a problem in 2021, but it certainly isn't now. Shared memory has an initial size, a maximum size, and is growable up to the maximum. There are zero functional limitations compared to non-shared memory, and the API is almost identical except that shared memory must have a maximum size set.

Any single-threaded Bevy/wasm app that currently exists already has a maximum size. Wasm-bindgen sets one unconditionally, with a default maximum of 1GiB and an absolute max of 4GiB for wasm32 of course. (I detail how to change the maximum below.) The only practical difference is when (not if) you see an error when you try to set the maximum very high in a 32-bit browser. More details below if you want to learn more.

The only thing missing from the story in 2021 would have been browser support. Safari implemented shared memory in late 2021, and as of that moment, all major browsers do (with the HTTP headers of course).

The things to be concerned about remain spinning up workers at all, avoiding locking the main thread, the other things listed in wasm-bindgen's section on the caveats, and any other DOM objects or Web APIs that can't be used from a web worker (like HtmlCanvasElement directly instead of via OffscreenCanvas). Regarding canvases specifically, winit 0.29 beta supports the main thread + web workers scenario (rust-windowing/winit#2778 and rust-windowing/winit#2834) and Bevy will be able to take advantage of that if it can ship an OffscreenCanvas to a renderer web worker.

In summary, Bevy multithreading on wasm is probably much closer than it has seemed. I have attempted to clear up any lingering confusion in some detail below.
In all, shared memory is fine. I don't see any cans of worms. There is no need to do anything special in Bevy to handle resource limits, especially not within the ECS. Any wasm project should choose an appropriate amount of virtual address space to map, let the global allocator panic on OOM, and call it a day. I barely even think it's Bevy's job to tell people not to attempt to map 1GiB+ of memory if they want their apps to run on 32-bit systems, but you can list a caveat for that with some notes about which browsers that's likely to be. Frankly shared memory is a better choice because it lets you fail faster while choosing a maximum size. |
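To put numbers on the sizes mentioned above: Wasm memory is sized in 64 KiB pages, so the 1GiB and 4GiB figures translate to page counts as in the small sketch below. The constant and function names are made up for illustration.

```rust
/// Size of one WebAssembly linear-memory page (64 KiB).
pub const WASM_PAGE_BYTES: u64 = 64 * 1024;
/// One gibibyte in bytes.
pub const GIB: u64 = 1024 * 1024 * 1024;

/// Number of whole Wasm pages needed to hold `bytes`, rounding up.
pub fn pages_for(bytes: u64) -> u64 {
    (bytes + WASM_PAGE_BYTES - 1) / WASM_PAGE_BYTES
}

/// True if `bytes` of memory can be addressed with 32-bit pointers,
/// which is why 4 GiB is the ceiling on wasm32.
pub fn fits_wasm32(bytes: u64) -> bool {
    bytes <= u32::MAX as u64 + 1
}
```

So a 1GiB maximum is 16384 pages and the 4GiB ceiling is 65536 pages; anything larger cannot be expressed on wasm32 at all.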
Wgpu/bevy's renderer isn't threadsafe on wasm (wgpu just used to lie about it before wgpu 0.17 because wasm threading wasn't really a thing and it's threadsafe on native, just not on wasm). If you want to test if your multithreading actually works with the renderer on wasm you need to remove the bevy/crates/bevy_render/Cargo.toml Line 63 in de8a600
It sounds like it might be possible to run the renderer in a web worker and have it work (as long as you pin it to that web worker and don't reference its resources from other threads)? edit: Tracking issue for renderer #9304 |
# Objective

This gets Bevy building on Wasm when the `atomics` flag is enabled. This does not yet multithread Bevy itself, but it allows Bevy users to use a crate like `wasm_thread` to spawn their own threads and manually parallelize work. This is a first step towards resolving #4078. Also fixes #9304.

This provides a foothold so that Bevy contributors can begin to think about multithreaded Wasm's constraints and Bevy can work towards changes to get the engine itself multithreaded.

Some flags need to be set on the Rust compiler when compiling for Wasm multithreading. Here's what my build script looks like, with the correct flags set, to test out Bevy examples on web:

```bash
set -e
RUSTFLAGS='-C target-feature=+atomics,+bulk-memory,+mutable-globals' \
  cargo build --example breakout --target wasm32-unknown-unknown -Z build-std=std,panic_abort --release
wasm-bindgen --out-name wasm_example \
  --out-dir examples/wasm/target \
  --target web target/wasm32-unknown-unknown/release/examples/breakout.wasm
devserver --header Cross-Origin-Opener-Policy='same-origin' --header Cross-Origin-Embedder-Policy='require-corp' --path examples/wasm
```

A few notes:

1. `cpal` crashes immediately when the `atomics` flag is set. That is patched in RustAudio/cpal#837, but not yet in the latest crates.io release. That can be temporarily worked around by patching Cpal like so:

```toml
[patch.crates-io]
cpal = { git = "https://github.com/RustAudio/cpal" }
```

2. When testing out `wasm_thread` you need to enable the `es_modules` feature.

## Solution

The largest obstacle to compiling Bevy with `atomics` on web is that `wgpu` types are _not_ Send and Sync. Longer term Bevy will need an approach to handle that, but in the near term Bevy is already configured to be single-threaded on web.

Therefore it is enough to wrap `wgpu` types in a `send_wrapper::SendWrapper` that _is_ Send / Sync, but panics if accessed off the `wgpu` thread.

---

## Changelog

- `wgpu` types that are not `Send` are wrapped in `send_wrapper::SendWrapper` on Wasm + 'atomics'
- CommandBuffers are not generated in parallel on Wasm + 'atomics'

## Questions

- Bevy should probably add CI checks to make sure this doesn't regress. Should that go in this PR or a separate PR? **Edit:** Added checks to build Wasm with atomics

---------

Co-authored-by: François <[email protected]>
Co-authored-by: Alice Cecile <[email protected]>
Co-authored-by: daxpedda <[email protected]>
Co-authored-by: François <[email protected]>
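To illustrate the idea behind the wrapper approach in the Solution section, here is a minimal std-only sketch of a type that claims to be Send / Sync but panics when touched from any thread other than the one that created it. `PinnedToThread` is a hypothetical name, not the `send_wrapper` crate's API, and unlike a production wrapper this sketch does not guard the drop path.

```rust
use std::ops::Deref;
use std::thread::{self, ThreadId};

/// Hypothetical stand-in for a "send wrapper": remembers its creating thread.
pub struct PinnedToThread<T> {
    value: T,
    owner: ThreadId,
}

// SAFETY (sketch only): every read goes through `deref`, which checks the
// calling thread first. Dropping on another thread is NOT checked here, so
// this is only reasonable for illustration.
unsafe impl<T> Send for PinnedToThread<T> {}
unsafe impl<T> Sync for PinnedToThread<T> {}

impl<T> PinnedToThread<T> {
    /// Wraps `value` and records the current thread as its owner.
    pub fn new(value: T) -> Self {
        Self { value, owner: thread::current().id() }
    }

    /// True when called from the thread that created the wrapper.
    pub fn is_on_owner_thread(&self) -> bool {
        thread::current().id() == self.owner
    }
}

impl<T> Deref for PinnedToThread<T> {
    type Target = T;

    /// Panics instead of handing out the value on the wrong thread.
    fn deref(&self) -> &T {
        assert!(self.is_on_owner_thread(), "value accessed off its owning thread");
        &self.value
    }
}
```

The trade-off is that misuse turns into a runtime panic rather than a compile error, which is acceptable while the renderer stays pinned to one thread.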
With #12205 a first step towards getting Bevy multithreaded on web has been merged. Now it is possible to build a Bevy project with multithreading enabled, even if Bevy internals are not yet multithreaded. A short guide for how to try it out yourself:
set -e
RUSTFLAGS='-C target-feature=+atomics,+bulk-memory' \
cargo build --example breakout --target wasm32-unknown-unknown -Z build-std=std,panic_abort --release
wasm-bindgen --out-name wasm_example \
--out-dir examples/wasm/target \
--target web target/wasm32-unknown-unknown/release/examples/breakout.wasm
devserver --header Cross-Origin-Opener-Policy='same-origin' --header Cross-Origin-Embedder-Policy='require-corp' --path examples/wasm
RUSTFLAGS explanation: Rust's default Wasm target
CORS explanation: When Rust is compiled for Wasm with
Now to run some work on another thread you can use a crate like
use std::time::Duration;
use wasm_thread as thread;
thread::spawn(|| {
for i in 1..3 {
log::info!("hi number {} from the spawned thread {:?}!", i, thread::current().id());
thread::sleep(Duration::from_millis(1));
}
});
Important: The browser forbids blocking on the main thread, so take care to never call any code on the main thread that will block / wait on another thread. If you absolutely need a workaround you can busy-loop instead, as Rust's memory allocator itself does. You can also use crates like |
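A minimal sketch of the busy-loop workaround mentioned above, assuming a shared flag that the other thread sets when its work is finished. It is written against std atomics so it can be tried natively; `spin_until_set` is a made-up helper name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Busy-waits until `done` becomes true. The calling thread is never
/// parked, so nothing here counts as "blocking" in the browser's sense,
/// but it does burn CPU for as long as the wait lasts.
pub fn spin_until_set(done: &AtomicBool) {
    while !done.load(Ordering::Acquire) {
        // Hint to the CPU that this is a spin-wait loop.
        std::hint::spin_loop();
    }
}
```

The worker side would finish its work and then call `done.store(true, Ordering::Release)`. Keep such waits very short on the main thread, since the page is unresponsive while it spins.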
Note that the blocker on getting async-executor to properly initialize on multithreaded wasm should be resolved with smol-rs/async-executor#108. |
Any progress on that? |
@alice-i-cecile @superdump what are the latest plans for Bevy's task pool? There was discussion of migrating back to Rayon at some point, is this still on the table? |
I'm not opposed to that idea, and it's worth exploring. It won't be a personal priority for me right now though. |
Is there a way for the Bevy CLI prototype to support this development? We have full control over the (default) So we should be able to set the server-side headers and maybe even load the Bevy app in a worker if necessary. Is there anything else we can do? Basically I would like to make it easier to contribute to this issue so we can implement it faster |
UPDATE: This is on hold while TaskPool and Scope are reworked
Motivation
Currently, Bevy can only run single-threaded on WebAssembly. Bevy's architecture was carefully designed to enable maximal parallelism so that it can utilize all cores available on a system. As of about six months ago the stable versions of all browsers have released the web platform features needed to accomplish this (SharedArrayBuffer and related CORS security functionality). I think now is a good time to attempt to make Bevy run natively in the browser like it does on the desktop: fully multithreaded.
There are three distinct tasks that will enable the accomplishment of this goal:
task_pool::{Scope, TaskPool, TaskPoolBuilder} that run on wasm called wasm_task_pool::{Scope, TaskPool, TaskPoolBuilder}, and use those instead of the single_threaded_task_pool::{Scope, TaskPool, TaskPoolBuilder} (TODO: create issue and link here)
a. Contribute the functionality needed to do background multithreaded audio using WebAssembly via the web platform's AudioWorklet API to the upstream dependencies of bevy_audio (TODO: create issue and link here):
i. cpal (TODO: create issue and link here)
ii. rodio (TODO: create issue and link here)
iii. NOTE: as outlined below in the "Insights provided by developers who have tried to make things that run multithreaded on wasm" section, it may be necessary to PR wasm-bindgen to solve this issue: "Unblock AudioWorklets: Find an alternative to TextEncoder / TextDecoder" (rustwasm/wasm-bindgen#2367)
b. Make necessary changes to bevy_audio to make use of this added upstream functionality (TODO: create issue and link here)
a. Insights from @alice-i-cecile below. "For the ECS, there are two relevant bits:"
i. Our multi-threaded executor
ii. Parallel iteration over queries
NOTE: as outlined below in the "Insights provided by developers who have tried to make things that run multithreaded on wasm" section, if we need to do something where we cannot use wasm-bindgen we will need to manually set the stack pointer in our code because this is one of the things wasm-bindgen does. @kettle11 has put this functionality into a tiny crate:
https://github.com/kettle11/wasm_set_stack_pointer
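As a rough picture of what "parallel iteration over queries" asks of a scoped task pool, here is a sketch using native `std::thread::scope` as a stand-in. `Body` and `integrate_parallel` are hypothetical names, not Bevy types; on wasm the threads would have to be backed by web workers rather than std threads.

```rust
use std::thread;

/// Hypothetical component data standing in for the items a query yields.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Body {
    pub position: f32,
    pub velocity: f32,
}

/// Splits `bodies` into up to `workers` chunks and updates each chunk on
/// its own scoped thread. The scope guarantees every thread has finished
/// before the function returns, so borrowing the slice is sound.
pub fn integrate_parallel(bodies: &mut [Body], dt: f32, workers: usize) {
    let workers = workers.max(1);
    // Round up so no more than `workers` chunks are produced; never use a zero chunk length.
    let chunk_len = ((bodies.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        for chunk in bodies.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for body in chunk {
                    body.position += body.velocity * dt;
                }
            });
        }
    });
}
```

Note that the calling thread waits at the end of the scope, which is exactly the kind of blocking the browser's main thread does not allow, so a wasm version of Scope cannot simply copy this shape.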
Background on SharedArrayBuffer
There is a good reason Bevy, and many of the existing projects that run on wasm, only run single-threaded. Shortly after the initial introduction of the SharedArrayBuffer web API - which would allow true unix-like pthread-style multithreading using wasm in the browser - the Spectre exploit was discovered.
Due to SharedArrayBuffer being a wrapper around shared memory, it was a particularly large vector for Spectre-style exploitation. In order to maintain their strong sandboxing security model, browsers decided to disable the feature while a proper solution was developed. Unfortunately, this eliminated the necessary functionality to allow true multithreading on wasm. What existed in the interim was a much slower emulation of threads using WebWorker message passing.
Thankfully, as of about six months ago all browsers have re-enabled a redesigned and secure version of SharedArrayBuffer.
According to this article on the chrome development blog "Using WebAssembly threads from C, C++ and Rust" https://web.dev/webassembly-threads/, true pthread-style multithreading is now possible on wasm in all browsers, with the small corollary that users may need to write a small specialized javascript stub to get it working exactly in the manner they need. Given that it has been stable for this long, and that some chrome developers have even published a github repository with an implementation for this for rayon using wasm-bindgen, I think now is a good time to investigate how to make Bevy run natively in the browser like it does on the desktop, and try implementing this to see if it will actually work.
Insights provided by developers who have tried to make things that run multithreaded on wasm
@kettle11 provided some good insights into quirks and solutions to multithreaded wasm on discord here on 19 November 2021:
"""
In the past I got AudioWorklet based audio working with multithreaded Rust on web. It's certainly possible.
When working with wasm-bindgen it requires some messy code because wasm-bindgen uses the
Javascript API TextDecoder which isn't supported on AudioWorklet threads.
The way I got around that is by not using wasm-bindgen on the AudioWorklet thread,
but that requires a few hacks:
Scanning the Wasm module imports and importing stub functions that do nothing for every Wasm-bindgen import.
This is OK because the audio thread can be made to be pretty simple and avoid doing direct wasm-bindgen calls.
Allocating a stack and thread local storage for the worker. wasm-bindgen's entry-point does this normally, but
wasm-bindgen's entry point also calls the main function which we don't want for the AudioWorklet thread.
So we need to use our own entry point and manually set up the stack / thread local storage.
I opened a wasm-bindgen issue about the TextDecoder thing about a year ago:
rustwasm/wasm-bindgen#2367
Also wasm-bindgen solves the "how do we set the stack pointer?" issue by preprocessing the Wasm binary and
inserting the stack allocation code, but I found a way to do it without that which I put together into a tiny crate:
https://github.com/kettle11/wasm_set_stack_pointer
"""
Resources