-
-
Notifications
You must be signed in to change notification settings - Fork 22
[API design] Streaming upload API: push-style or pull-style? #199
Comments
I'm in complete agreement with the conclusion. I wonder if there's a way we could include a push-based approach for a lower-level API, something that we've talked about a couple times about two layers of API: one that's basically |
Yeah, it's possible we'll want to make different decisions in a lower-level API. I'm not 100% sure it will make sense though. A low-level API doesn't have to be as comfortable to use or support retries, so those downsides don't apply. But we'd still need to figure out how to deal with the state complexity, and need some special-case mechanism to avoid aborting the request body when we get an early response. We'll also need to double-check what grpc actually needs before committing to anything... my clever idea to make bidi HTTP look like an Upgrade/CONNECT is clever and all, but it still assumes that bidi HTTP users are OK with waiting for the response headers before they switch into bidi mode, and I'm actually not 100% sure whether that's true of grpc or not. (@standy66, do you know?) Also, grpc only cares about bidi support over HTTP/2, and with HTTP/2 we don't need to abort the response body on early responses, so that whole issue goes away... |
The way I thought about this was like how websockets are usually implemented. You write stuff to the request, read responses then. If you want to alternate between the two (extreme case: multipart request and response), you can. The server responding early is not a problem as I see it. If the user doesn't stop sending data after an early response, we can raise an exception when they try to send more. As this implies, I strongly favor push-style over pull-style. |
[This is partially discussed in #124, but that's a huge omnibus issue so I'm pulling it out into a more focused one. This is also capturing some notes from a voice chat @sethmlarson and I had last week.]
Say you're uploading some data as a request body, and you want to generate it on the fly. There are two basic API approaches that hip could offer. In a "push" style API, the user pushes us data as it becomes available:
In a "pull" style API, the user provides an (async) iterable, and hip pulls data out of it:
These are kind of duals of each other. Generally, Trio's style prefers the "push" approach, for two reasons: First, keeping the loop in user code makes the control flow is more explicit and transparent, so I think it's easier to reason about. And second, if you have a push-style API, then it's easy to implement a pull-style API, but going the other direction is much more awkward:
Push-style APIs are also much more natural if we want to support bidirectional HTTP, i.e. letting people send a bit of the request, then read a bit of the response, then send a bit more of the request, etc. (I'm not sure we do want to do this, but still, it's a trade-off to consider.)
However, for hip's purposes, the push-style approach has some significant downsides:
For HTTP uploads specifically, it's very common to already have some kind of object that represents the body (a byte-string, some JSON, an open file, an async generator, etc.), and it's natural to want to just pass this object as a simple kwarg like
body=
. We probably don't want to make people explicitly write afor
loop every time. So, if we supported a push-style API, then we'd also end up adding a pull-style option as sugar on top of it, and then we have two APIs.Push-style APIs expose a lot more states in the API. With a classic
requests.get
API, you call the function, and it doesn't return until at least the whole request body has been delivered and the response headers have been received. With a push-style API, I guess we'd need to... first return an object that represents the request-in-progress, and then the user would push data through it for a while, and then when they mark the request body complete then that would need to return a second object that has the response headers etc.? And what happens if the server starts responding early? This makes the API much more complex to design and use.Push-style APIs are even more awkward if you want to handle redirects and retries, where you might have to rewind the body and re-send it. So now we need some way for hip to tell the user that they've been redirected and need to start over on pushing the body, etc.
Putting this all together, I think the conclusion is that we should follow requests instead of Trio, and expose a pull-style API instead of a push-style API.
If we do decide that bidi HTTP is an important use case (e.g. so grpc clients can use hip), then another possibility would be to make the API look similar to an Upgrade/CONNECT. For Upgrade/CONNECT, I imagine that our API will involve sending a request, getting a response object, and then if it succeeded calling some method on the response object to get a bidi byte stream for the new protocol to use. For bidi HTTP, we could use a similar API pattern, where during the request you set some special flag that says "I want to do bidi HTTP", and then we use some special internal processing that doesn't send any request body immediately, but then lets you use the pull-out-a-byte-stream method to get a byte stream referring to the request and response bodies. I think that will end up being much more natural.
The text was updated successfully, but these errors were encountered: