Basic example of Filter/Service middleware #235

fizx · 2020-06-15T17:50:55Z

Protocol-specific middleware means writing O(protocol*middleware) adapters. By making protocol-agnostic middleware, we enable writing O(protocol+middleware) adapters. Fewer adapters makes making maintenance of existing adapters easier, and makes it easier to add new middlewares or protocols.

For generic middleware, let's choose the Service/Filter model made popular by Finagle [1], and also adopted by Facebook [2] and others. A Service is anything that processes requests into responses, and a Filter is a wrapper around a Service that can transform requests/responses.

Using these primitives, it's possible to model both clients and servers as Services, and any middleware as a Filter.

The downside of this is that when making a new Service, you have to provide the adapter to transform a service implementation into a Service and back. You can see this in the PR as we transform http.Client<->Service. But the upside is that you get Filter reuse for free. As the number of Filters and protocols expands, this reuse becomes incredibly useful and valuable.

This is a reference implementation for discussion. I imagine that we can slowly move existing middleware if we want to pursue this direction.

[1] https://twitter.github.io/finagle/guide/ServicesAndFilters.html
[2] https://github.com/facebook/wangle

pacejackson

I'm not super enthusiastic about this approach, this is basically what go-kit does with it's Endpoint interface and the end result is you just end up writing a bunch of boilerplate to convert your things to and from your interface in every service. We used to do this and moved away from it because it seemed better to just have the annoyance of re-implementing middleware for each kind of middleware rather than pushing all that boilerplate down to our services.

I think the reality is things like this don't work that great in a language without generics. What we do to make this as painless as possible is implement all of the actual "logic" for middlewares in it's own function and the actual Middleware is just a wrapper around calling that.

fizx · 2020-06-15T18:38:03Z

I'm not super enthusiastic about this approach, this is basically what go-kit does with it's Endpoint interface and the end result is you just end up writing a bunch of boilerplate to convert your things to and from your interface in every service

Go-kit gets things almost right, but in a way that introduces lots of boilerplate, which would not be present in this similar interface. Go-kit uses Endpoint to model a single RPC call. Service, which is a very similar interface to endpoint, models the entire service.

In order to have the maintenance gains associated with my O(protocol+middleware) argument, you should not vary on the interface of the service! If you have to vary on the interface of the service, by adding method-specific adapters, then you are now O(protocol*methods+middleware) adapters, which is a huge step backwards, especially with many methods.

So I think go-kit isn't quite the right thing, but I'm not quite ready to damn this approach for similarity, especially considering the success of similar models elsewhere.

I think the reality is things like this don't work that great in a language without generics.

This is true. However, based on past experience with similar libraries, the annoyance can be mild.

Netty, perhaps the most popular server library in Java, currently implements middleware while requiring similar type assertions to up/downcast requests as they move through the pipeline. For example, SSL termination is implemented as a middleware on a TCP stream, and later in the same pipeline you can process HTTP headers. To do so, you are casting the input objects to whatever their current type is. In writing low-level Netty, this lack of generics is mildly annoying, but the ability to write in their pipeline abstraction is essential!

In our specific case, the burden of typecasting goes to the baseplate.go implementer, not the end-user, and it represents less burden than the alternatives. When implementing a new protocol, the burden would be 2 type assertions, which is less than implementing an adapter for every middleware.

So all-in-all, I share your apprehension based upon go-kit, but I think this approach hits a middle ground where we reap much of the benefits of go-kit, with very little of the cost.

pacejackson · 2020-06-15T18:43:51Z

That's a good point about the difference between this and go-kit's approach, I guess the thing that's really missing for me is seeing an actual example.

The current code is really just in a vacuum (and all new), what would the existing baseplate.go servers and clients (and their middlewares) look like with this approach and what would a service actually look like? To that end, it's probably worth at least looking into?

TL;DR: you have a good point about not dismissing this outright due to past experiences but I think we need some more details to make an informed call.

fizx · 2020-06-15T18:57:02Z

The current code is really just in a vacuum (and all new), what would the existing baseplate.go servers and clients (and their middlewares) look like with this approach and what would a service actually look like? To that end, it's probably worth at least looking into?

There is at least an http.Client here, and also you can see some sample filters in the test case to get an idea of what they would look like.

I'm trying not to rewrite all of the things immediately :) Is there some subset that I could translate that would help get the idea across?

konradreiche · 2020-06-15T19:04:13Z

I think this can make a lot of sense, especially with middlewares which clearly should apply across endpoints, i.e. logging middleware.

fishy

None of the examples/inspirations you linked are in go, for a obvious reason: go's lack of generics. I would probably be more supportive for this idea after we actually have generic in go2. Right now, I do try to avoid interface{} as much as possible.

But even after we had generics, there are still issues with this approach. One obvious one is that a lot of the middlewares are not really interchangeable. A majority of our middlewares are around header handling, and we do need different header handling both between thrift and http, and between client and server. You cited code repetition as a reason to go with this approach, but in reality we actually implemented the things that's common between them (e.g. create the server span after we get all the headers) as shared, and the difference between the thriftbp and httpbp server middleware is just the different part (how to actually extract the headers). So I would say we actually already shared the code as much as we can, and this approach will not provide meaningful improvement on that front.

fizx · 2020-06-15T20:42:52Z

A majority of our middlewares are around header handling, and we do need different header handling both between thrift and http, and between client and server.

As we move forward, we will add more low-level middlewares. Circuit-breaking, retries, load-shedding are largely protocol-independent.

You cited code repetition as a reason to go with this approach, but in reality we actually implemented the things that's common between them (e.g. create the server span after we get all the headers) as shared, and the difference between the thriftbp and httpbp server middleware is just the different part (how to actually extract the headers). So I would say we actually already shared the code as much as we can, and this approach will not provide meaningful improvement on that front.

As the number of protocols/clients we would like to have nice network hygiene, spans, retries, etc expands (http client, redis, memcached, postgres, grpc, cassandra, aws client, kafka, rabbitmq, etc), not only will we want code reuse, but we will want to verify that the implementations are consistent across all of the protocols. We could easily end up with subtle differences in the way spans are wrapped across all of our clients, and this would be hard to verify, leading to subtle bugs.

As a practical matter, many of the protocols will be implemented by the trailblazers, and they may not understand the subtleties of the middlewares, or of the baseplate.go implementation itself. As an example, I'd like to implement an HTTP client, and I'd like to implement circuit-breaking, but I don't yet understand how our preferred client span double-wrapping is intended to behave.

If our middlewares were more generically factored, I could contribute without learning much more about the whole baseplate.go codebase. It's easier to create a circuitbreaking filter than it is to apply circuitbreaking to each of the disparate protocol interfaces that exist right now. Similarly, it's relatively easy to take a middleware chain from the thrift client, and apply the same middleware chain, perhaps with minor modifications, to the new http client. It's much harder to gather and port all of the middleware to a new interface.

The way things are factored today makes a ton of sense, especially as all of the contributors are baseplate.go experts, and we haven't had to scale to even a moderate number of protocols or middlewares. We've done an excellent job of keeping things simple!

My argument, much like any O(n) argument, is about scalability as the numbers get larger. We'll have more protocols, middlewares, and contributors in the future. As a new contributor to baseplate.go, I'm starting to feel some pain that is probably not felt by the original authors. And as someone who's worked for years in the environment I'm suggesting, I'm hopeful that it will help the project scale nicely into the region where we have, say, around 10 each of common protocols, contributors and middlewares.

fishy · 2020-06-15T21:23:27Z

but we will want to verify that the implementations are consistent across all of the protocols

I don't see how that works in reality. Blindly apply middlewares from one server (thrift) to another (http) is a very dangerous idea. For example, we extract deadline propagation headers on thrift server, but don't do so in http, because thrift servers has a very different trust model from http servers. If we ended up sending span information and deadline propagation information to all of redis/aws/etc., that's both wasteful (we are sending headers server are not going to read), and dangerous (we are leaking internal info to third party).

Instead, that should be achieved by code reuse, e.g. the core functionality is implemented middleware agnostic, and the middlewares are just a very thin wrapper applying the things can't be shared with other middleware implementations. This is basically where we are today.

As a new contributor to baseplate.go, I'm starting to feel some pain that is probably not felt by the original authors

That will always be true, no matter how "good" our code base is. I would argue that we do have very good code documentations and comments, but you are not looking at them. (for example your "how our preferred client span double-wrapping is intended to behave" question is answered here:

baseplate.go/thriftbp/client_middlewares.go

Lines 75 to 89 in 8630c76

    
           // Currently they are (in order): 
        
           // 
        
           // 1. ForwardEdgeRequestContext 
        
           // 
        
           // 2. MonitorClient with MonitorClientWrappedSlugSuffix - This creates the spans 
        
           // from the view of the client that group all retries into a single, 
        
           // wrapped span. 
        
           // 
        
           // 3. Retry(retryOptions) - If retryOptions is empty/nil, default to only 
        
           // retry.Attempts(1), this will not actually retry any calls but your client is 
        
           // configured to set retry logic per-call using retrybp.WithOptions. 
        
           // 
        
           // 4. MonitorClient - This creates the spans of the raw client calls. 
        
           // 
        
           // 5. SetDeadlineBudget

, or here: https://pkg.go.dev/github.com/reddit/baseplate.go/thriftbp?tab=doc#BaseplateDefaultClientMiddlewares)

konradreiche · 2020-06-15T21:58:56Z

I agree that middlewares shouldn't be applied across protocols unless they make sense but there are middlewares that are agnostic to their protocol, so I wonder if it's possible to support this pattern but also remain flexible by applying the middleware only to certain protocols without an override type of design.

fishy · 2020-06-15T22:04:30Z

@konradreiche sure, but at what price? The current situation is that the price is too high (that we have to deal with a lot of newly added interface{}), while the benefit is too low (those middlewares will be replaced by the new pattern are already very thin wrapper around already shared code).

Also not to mention that the current approach won't work with thrift server middlewares. It likely also don't work with thrift client middlewares. (if you don't believe me, try implement it for thrift).

fizx · 2020-06-15T23:08:03Z

I was hoping to have a little in-person discussion about this before it got into too much of a debate (hence why I draft mode'd and didn't invite reviewers), so I think I'm going to check out of the thread until we get a chance to discuss in-person.

And to be clear, I don't think there's anything wrong today. It's more of a "where do we want to evolve the system" sort of a discussion.

basic implementation of Filter/Service middleware

c32020a

fizx force-pushed the servicefilter branch from 8b07431 to c32020a Compare June 15, 2020 17:53

pacejackson requested changes Jun 15, 2020

View reviewed changes

change interface to include Context

aae91a2

fishy requested changes Jun 15, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Basic example of Filter/Service middleware #235

Basic example of Filter/Service middleware #235

fizx commented Jun 15, 2020

pacejackson left a comment •

edited

Loading

fizx commented Jun 15, 2020 •

edited

Loading

pacejackson commented Jun 15, 2020 •

edited

Loading

fizx commented Jun 15, 2020

konradreiche commented Jun 15, 2020

fishy left a comment

fizx commented Jun 15, 2020

fishy commented Jun 15, 2020 •

edited

Loading

konradreiche commented Jun 15, 2020

fishy commented Jun 15, 2020 •

edited

Loading

fizx commented Jun 15, 2020

Basic example of Filter/Service middleware #235

Are you sure you want to change the base?

Basic example of Filter/Service middleware #235

Conversation

fizx commented Jun 15, 2020

pacejackson left a comment • edited Loading

Choose a reason for hiding this comment

fizx commented Jun 15, 2020 • edited Loading

pacejackson commented Jun 15, 2020 • edited Loading

fizx commented Jun 15, 2020

konradreiche commented Jun 15, 2020

fishy left a comment

Choose a reason for hiding this comment

fizx commented Jun 15, 2020

fishy commented Jun 15, 2020 • edited Loading

konradreiche commented Jun 15, 2020

fishy commented Jun 15, 2020 • edited Loading

fizx commented Jun 15, 2020

pacejackson left a comment •

edited

Loading

fizx commented Jun 15, 2020 •

edited

Loading

pacejackson commented Jun 15, 2020 •

edited

Loading

fishy commented Jun 15, 2020 •

edited

Loading

fishy commented Jun 15, 2020 •

edited

Loading