Pull-based I/O may interfere with fully compositional concurrency #599
Seems to me that you're basically just missing the equivalent of `quiesce`; something kinda like this pseudocode: [...]
One potential misunderstanding I spotted: I expect to be able to call another entry point in a component while a future returned by a previous call hasn't completed yet. This allows running multiple async functions concurrently, while still being single-threaded and not reentrant, since you're not calling an entry point at the same time as calling poll on another future; they're interleaved. The Javascript runtime would handle this by having entry-point futures complete when the associated Javascript promise is resolved, rather than waiting until the event loop is empty. The event loop is resumed the next time any future from that component is polled, with remaining work finally run when `quiesce`'s future is polled.
@programmerjake Indeed, I would expect to be able to concurrently invoke different functions on the same component instance, and have their execution interleaved as per their constituent I/O and timer events. This is certainly possible at the level of a Javascript runtime, for example; it's just the ordinary way one concurrent library interacts with another concurrent library.

In StarlingMonkey, concurrent invocation is not possible. In fact, the limitation is more severe than this: all work scheduled by one invocation must fully complete before another invocation can be initiated. It essentially means that invocation of one component instance by another is hyper-sequentialized: any attempt by one instance to invoke a second function on the other instance, while a first invocation is still running, will block; the work will not actually begin until the invocation of the first function, and all concurrent workloads initiated by the first function, have entirely wrapped up.

At first I thought this might only be the way StarlingMonkey is currently implemented, but I now think the problem is more fundamental than that, and you can see it with a simple example of setting a timer: a timer in any language will ultimately end up as a blocking poll on the host. So I have come to believe this problem is not really with StarlingMonkey, per se, but rather with the combination of WASI's pull-based I/O and single-threading (a multi-threaded component could potentially use one thread per invocation, which could handle the concurrency problem even with blocking pull-based I/O).
To achieve fully compositional concurrency (which allows one component instance to concurrently invoke functions on another component instance) in the presence of single-threading, it seems to me that a host must have the ability to send events to the instance, such that an instance can make some progress before yielding, but then wait until the host has decided there is something new to do. This implies async, push-based I/O. There is a trend toward async I/O anyway. This trend away from pull-based I/O toward push-based I/O, combined with my belief (perhaps mistaken) that many developers will be confused about why they cannot treat components the same way they treat libraries (i.e. why this blocking / bottlenecking / over-sequentialization happens), is why I created the issue. Hopefully it will lead to some illuminating discussions (for me, and for others with the same questions), regardless of whether it leads to any changes in WASI.
One other note: a push-based I/O component can always be packaged as a pull-based I/O component (regardless of single- or multi-threading), given the existence of a blocking poll: you just need an outer event loop that polls and feeds events to the push-based component. A pull-based I/O component, however, cannot be packaged as a push-based I/O component without multi-threading. With multi-threading, you can have one thread (the event loop) that reads from a blocking queue, and the push-based interface simply deposits events onto the blocking queue, triggering reactivation of the event loop. Not efficient, but possible. So my speculation is that push-based I/O is somehow more "primitive" than pull-based I/O.
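The first direction can be sketched concretely. This is my own hypothetical illustration (the `pollNextEvent` and `onEvent` names are mine, not any real WASI API): an outer loop blocks on a poll and pushes each event into the push-based component.

```javascript
// Hypothetical sketch: packaging a push-based component behind a
// pull-based event loop. `pollNextEvent` stands in for a blocking
// WASI-style poll; here it is simulated with a queue so the example
// is self-contained.

const pendingEvents = [
  { kind: "timer", id: 1 },
  { kind: "io", fd: 3 },
  { kind: "done" },
];

function pollNextEvent() {
  // In a real pull-based host, this call would block until an event is ready.
  return pendingEvents.shift();
}

// A hypothetical push-based component: it simply reacts to delivered events.
const component = {
  finished: false,
  handled: [],
  onEvent(event) {
    if (event.kind === "done") this.finished = true;
    else this.handled.push(event.kind);
  },
};

// The outer event loop: pull each event, push it into the component.
function runPullShell(component) {
  while (!component.finished) {
    component.onEvent(pollNextEvent());
  }
}

runPullShell(component);
console.log(component.handled); // → [ 'timer', 'io' ]
```

The reverse direction has no such single-threaded shell: a pull-based component is *inside* a blocking call while it waits, so there is no place for a push-based host to deliver an event into it without a second thread.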
Yes, totally agreed that, as it stands in 0.2, concurrent components using
@lukewagner Thank you for the detailed response! I watched your talk (great talk and visuals!), but at the time I was only thinking of composing sync components with async components in a seamless way, so the implications for "concurrent invocation" did not resonate with me. I will watch your talk again, check out the async-pt1 branch, and get back if I have any further questions or thoughts. I'm very happy this is being worked on, and now I better understand just how important Preview 3 may be to the component model.
If you haven't seen it yet, here's a working preview of the guest experience for composable concurrency, including examples which spawn tasks: https://github.com/dicej/isyswasfa |
This can be solved by the `quiesce` method: the component can depend on `quiesce` being called, or on some future from the component being polled, until the component decides it's done doing all work, even though futures from normal entry points complete when their promises are resolved rather than waiting for all work to halt. If you're worried about the component not being called back into for too long, or forever: almost any single-threaded concurrency model has this problem; it isn't unique to polling or to pushing events.
To be clear, the kind of polling I am thinking of is like in Rust async/await, where every poll invocation has a waker that it can register to be called when some event happens, after which the poll method on the future will be scheduled to be called again.
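For readers unfamiliar with that model, the poll-plus-waker protocol can be sketched even in Javascript. The names below (`poll`, `block_on`, `PENDING`) are illustrative analogues of Rust's `Future::poll`, an executor, and `Poll::Pending`; they are not a real API.

```javascript
// Illustrative poll/waker protocol, modeled loosely on Rust futures:
// poll(wake) either returns the finished value, or registers the waker
// and reports that the future is still pending.

const PENDING = Symbol("pending");

function makeTimerFuture(ms) {
  let result = null;
  let waker = null;
  setTimeout(() => {
    result = "elapsed";
    if (waker) waker(); // wake: ask the executor to poll again
  }, ms);
  return {
    poll(wake) {
      if (result !== null) return result; // "Ready": value available
      waker = wake;                       // store the waker for the event
      return PENDING;                     // "Pending": poll again when woken
    },
  };
}

// A tiny executor: polls once, then re-polls only when the waker fires.
function block_on(future) {
  return new Promise((resolve) => {
    const tryPoll = () => {
      const out = future.poll(tryPoll);
      if (out !== PENDING) resolve(out);
    };
    tryPoll();
  });
}

block_on(makeTimerFuture(10)).then((v) => console.log(v)); // → elapsed
```

The key property, matching the comment above, is that nothing busy-waits: after the first `poll` returns `PENDING`, the future is not polled again until the timer event invokes the registered waker.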
@dicej This is very helpful! I have some idea of how this will work, and several places to learn further. |
These ideas are not fully formed yet, but I wanted to put them out here anyway:
The component model really wants us to be able to treat and think of components as libraries--the cross-language, portable building blocks that make up our applications. However, recently I have found a set of use cases that interfere with this way of viewing components; and, it could be argued, interfere with fully compositional, cross-component concurrency.
The core WASI interface is built on polling, which is pull-based I/O. This seems a logical and sound choice as most operating systems expose pull-based I/O and most applications are built on such interfaces.
However, problems arise when one expects to be able to use one component from another, as one uses one library from another.
To see these issues, first imagine we are building libraries and not components. Let's say, in particular, that we are building a Javascript library to perform auto-complete, which interacts with various indexes and databases. Javascript is single-threaded but also concurrent, which makes it a nice fit for wasm-wasi.
In this case, we might have an

```js
async function autoComplete(prefix) { }
```

that returns a `Promise`. This auto-complete algorithm might simultaneously request information from two different APIs using `fetch`. In addition, the auto-complete algorithm may wish to time out one of these requests using `setTimeout`: if the request has not returned in 60 seconds, it is terminated. This is relatively trivial to do in Javascript, only about 10-20 lines of code depending on the approach used. In the end, you have a single-threaded but concurrent function that satisfies its requirements.
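As a concrete illustration, here is one way such a function might look. This is my own sketch, not code from any real library: the two data sources are simulated with timers so the snippet is self-contained (in a real library they would be `fetch` calls), and the timeout is applied per request.

```javascript
// Self-contained autoComplete sketch: two concurrent lookups, each
// capped by a timeout, on a single thread.
const sleep = (ms, value) =>
  new Promise((res) => setTimeout(() => res(value), ms));

// Hypothetical data sources, standing in for two fetch-based APIs.
const queryIndex = (prefix) => sleep(10, [prefix + "-from-index"]);
const queryDb = (prefix) => sleep(20, [prefix + "-from-db"]);

function withTimeout(promise, ms) {
  // Resolve with [] if the underlying request exceeds the deadline,
  // so a slow source contributes nothing rather than blocking the result.
  let timer;
  const timeout = new Promise((res) => {
    timer = setTimeout(() => res([]), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function autoComplete(prefix) {
  // Both lookups run concurrently; in the scenario described above the
  // deadline would be 60 seconds (60_000 ms).
  const [a, b] = await Promise.all([
    withTimeout(queryIndex(prefix), 60_000),
    withTimeout(queryDb(prefix), 60_000),
  ]);
  return [...a, ...b];
}

autoComplete("wa").then((results) => console.log(results));
// → [ 'wa-from-index', 'wa-from-db' ]
```

The caller just sees a promise; the internal concurrency (two in-flight requests plus a pending timer) is invisible, which is exactly the property the rest of this issue is about.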
There is no issue using such an auto-complete library from another Javascript library. In particular, when you call `autoComplete`, it returns a promise right away, which you are free to await immediately, or you can continue to do other processing and not await the promise right away (or ever). In no case does the call to `autoComplete` semantically "block" your application. Even awaiting the promise does not interfere with other concurrent operations that are happening in your library.

Now, let's instead imagine that we compile our Javascript library to a component, which exposes an `auto-complete` method. We now wish to use this component from other components, in the same fashion that we used the auto-complete library from other libraries. While the component model makes this possible and reasonable to do, in a cross-language way, it turns out that the concurrent nature of the auto-complete component does not compose in the way we would expect with another component.
In particular, if, in a second component, we call `auto-complete` on the auto-complete component, then the event loop of the Javascript component will not finish running until all pending I/O events have been completed. In our case, this means that `auto-complete` will not return until both fetches return a result and the `setTimeout` callback has been invoked.

In other words, unlike when calling a Javascript library, invoking a function on the component instance will block our application for at least a minute. During this minute, we will not be able to invoke any other function on the instance. Why? Because its event loop is already busy handling the protracted execution of the `auto-complete` function, and will not yield until all pending events have been processed (including the scheduling event created by the `setTimeout`).

To me, this behavior is quite surprising. It means that even though a component appears to be a library, you can only call one function on an instance at a time, and those invocations have to wrap up fully before you are allowed to call another function. Not only is this unexpected behavior (for me), but it seems hard to reason about, because it is so easy (in Javascript, at least) to "do things later" and "launch concurrent processes" (like `fetch`) and yet return a value right away.

I could be wrong, but I believe that many developers will expect components to act like libraries, and will expect compositional concurrency, in the sense that the concurrent behavior of two composed components should be the same as the concurrent behavior of two composed libraries.
Now, this appears not to be easy to fix. The fundamental problem appears to come from the pull-based nature of WASI I/O. In order for an event loop to return early (before pending I/O is complete), there would have to exist some exported function in the Wasm component which could receive external events (such as scheduling events or I/O events).
Achieving fully compositional concurrency, as defined here, would seem to require something closer to "async I/O" than poll-based I/O. This would allow the event loop to yield once it has finished processing pending events, i.e. when no more productive work can occur. Then, while the component instance is waiting on a timer or I/O, another component may call into it (concurrently), and the execution of such new invocations will be interleaved with the progressing results of prior invocations, which is exactly how concurrent operations behave between two composed libraries (in Javascript or any other language).
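As a rough sketch of what push-based invocation could enable (the `invoke`/`deliverEvent` interface below is purely hypothetical, not a proposed WASI design): each invocation returns as soon as it has registered interest in events, and the host later pushes completions in, so two invocations on the same single-threaded instance interleave instead of serializing.

```javascript
// Hypothetical push-based component interface: `invoke` starts work and
// returns immediately; `deliverEvent` lets the host push an I/O or timer
// completion into the component later.

function makeComponent() {
  const waiters = new Map(); // eventId -> resolve callback
  let nextId = 0;
  return {
    invoke(name) {
      // Start the work (the name is ignored in this sketch): register
      // interest in one event and return without blocking.
      const id = nextId++;
      return { id, done: new Promise((res) => waiters.set(id, res)) };
    },
    deliverEvent(id, payload) {
      // Host pushes a completion in; the matching invocation resumes.
      const resolve = waiters.get(id);
      waiters.delete(id);
      resolve(payload);
    },
  };
}

const c = makeComponent();
// Two concurrent invocations; neither blocks the other.
const a = c.invoke("auto-complete");
const b = c.invoke("auto-complete");
// The host completes them in any order, as events actually arrive.
c.deliverEvent(b.id, "b-result");
c.deliverEvent(a.id, "a-result");
Promise.all([a.done, b.done]).then((r) => console.log(r));
// → [ 'a-result', 'b-result' ]
```

Contrast this with the pull-based picture above, where the second `invoke` could not even begin until the first invocation's timer and fetches had fully drained.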
While this might be a small change to the WASI interface (pull-based to push-based), I think it would have quite significant ramifications downstream. Supporting async I/O in some environments (like StarlingMonkey) could be straightforward, but I can see it being much more work in languages whose standard libraries are built on and assume pull-based I/O.
cc @lukewagner