📡 communication design patterns
Request Response
used by the web, HTTP, DNS, SSH
RPC (remote procedure call)
SQL and database protocols
APIs (REST/SOAP/GraphQL)
client sends a request
- the request structure is defined by both client and server and has a boundary
server parses the request
- the parsing cost is not cheap (e.g., JSON vs. XML vs. protocol buffers)
- for example, for a large image, chunks can be sent, with a request per chunk
server processes the request
server sends a response
client parses the response and consumes it
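a minimal sketch of the five steps above, run in a single process over a local socket pair. The framing (a 4-byte length prefix followed by a JSON body) and the names `send_message`, `recv_message`, and `handle` are assumptions made for illustration; real protocols define their own boundary.

```python
import json
import socket
import struct


def send_message(sock, payload):
    """Serialize payload as JSON and prefix it with a 4-byte big-endian length."""
    body = json.dumps(payload).encode("utf-8")
    sock.sendall(struct.pack("!I", len(body)) + body)


def recv_exactly(sock, n):
    """Read exactly n bytes; a single recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed before the message ended")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Use the length prefix as the message boundary, then parse the JSON body."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return json.loads(recv_exactly(sock, length))


def handle(request):
    """Hypothetical server logic: echo the path in upper case."""
    return {"status": "ok", "echo": request["path"].upper()}


client, server = socket.socketpair()
send_message(client, {"method": "GET", "path": "/hello"})  # client sends a request
request = recv_message(server)                             # server parses the request
send_message(server, handle(request))                      # server processes and responds
response = recv_message(client)                            # client parses the response
client.close()
server.close()
print(response)
```

the length prefix is what gives the request "a boundary": the receiver knows how many bytes to read before it starts parsing.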
an example in your terminal
see how it always gets the headers first (--trace needs an output file; "-" means stdout):
curl -v --trace - souza.xyz
== Info: Trying 76.76.21.21:80...
== Info: Connected to souza.xyz (76.76.21.21) port 80 (#0)
=> Send header, 79 bytes (0x4f)
0000: 47 45 54 20 2f 20 48 54 54 50 2f 31 2e 31 0d 0a GET / HTTP/1.1..
0010: 48 6f 73 74 3a 20 6d 61 72 69 6e 61 73 6f 75 7a Host: souz
0020: 61 2e 78 79 7a 0d 0a 55 73 65 72 2d 41 67 65 6e a.xyz..User-Agen
0030: 74 3a 20 63 75 72 6c 2f 37 2e 38 38 2e 31 0d 0a t: curl/7.88.1..
0040: 41 63 63 65 70 74 3a 20 2a 2f 2a 0d 0a 0d 0a Accept: */*....
== Info: HTTP 1.0, assume close after body
<= Recv header, 33 bytes (0x21)
0000: 48 54 54 50 2f 31 2e 30 20 33 30 38 20 50 65 72 HTTP/1.0 308 Per
0010: 6d 61 6e 65 6e 74 20 52 65 64 69 72 65 63 74 0d manent Redirect.
0020: 0a .
<= Recv header, 26 bytes (0x1a)
0000: 43 6f 6e 74 65 6e 74 2d 54 79 70 65 3a 20 74 65 Content-Type: te
0010: 78 74 2f 70 6c 61 69 6e 0d 0a xt/plain..
<= Recv header, 36 bytes (0x24)
0000: 4c 6f 63 61 74 69 6f 6e 3a 20 68 74 74 70 73 3a Location: https:
0010: 2f 2f 6d 61 72 69 6e 61 73 6f 75 7a 61 2e 78 79 //souza.xy
0020: 7a 2f 0d 0a z/..
<= Recv header, 41 bytes (0x29)
0000: 52 65 66 72 65 73 68 3a 20 30 3b 75 72 6c 3d 68 Refresh: 0; url=h
0010: 74 74 70 73 3a 2f 2f 6d 61 72 69 6e 61 73 6f 75 ttps://sou
0020: 7a 61 2e 78 79 7a 2f 0d 0a za.xyz/..
<= Recv header, 16 bytes (0x10)
0000: 73 65 72 76 65 72 3a 20 56 65 72 63 65 6c 0d 0a server: Vercel..
<= Recv header, 2 bytes (0x2)
0000: 0d 0a ..
<= Recv data, 14 bytes (0xe)
0000: 52 65 64 69 72 65 63 74 69
Synchronous vs. Asynchronous workloads
Synchronous I/O: the basic idea
Caller sends a request and blocks
Caller cannot execute any code meanwhile
Receiver responds, Caller unblocks
Caller and Receiver are in sync
example (note the waste!)
program asks OS to read from disk
the program's main thread is taken off the CPU
the read completes and the program resumes execution (costly)
Asynchronous I/O: the basic idea
caller sends a request
caller can work until it gets a response
caller either:
- checks whether the response is ready (epoll)
- receiver calls back when it's done (io_uring)
- spins up a new thread that blocks
caller and receiver are not in sync
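a small sketch of the "checks whether the response is ready" option, using Python's `selectors` module (on Linux its default selector is typically backed by epoll). The slow receiver thread, the 0.2 s delay, and the `units_of_other_work` counter are assumptions made for the demo.

```python
import selectors
import socket
import threading
import time

caller, receiver = socket.socketpair()
caller.setblocking(False)  # the caller never blocks on this socket


def slow_receiver():
    """Hypothetical receiver: reads the request, takes a while, then replies."""
    receiver.recv(1024)
    time.sleep(0.2)
    receiver.sendall(b"done")


worker = threading.Thread(target=slow_receiver)
worker.start()

sel = selectors.DefaultSelector()
sel.register(caller, selectors.EVENT_READ)

caller.sendall(b"work")  # caller sends a request
units_of_other_work = 0
reply = None
while reply is None:
    if sel.select(timeout=0):      # readiness check that returns immediately
        reply = caller.recv(1024)  # the response is ready, so this will not block
    else:
        units_of_other_work += 1   # caller keeps working in the meantime
        time.sleep(0.01)

worker.join()
sel.close()
caller.close()
receiver.close()
print(reply, units_of_other_work)
```

with a blocking `recv()` the caller would sit idle for the whole delay; here it gets through several units of other work before the reply arrives.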
Sync vs. Async in a Request Response
synchronicity is a client property
most modern client libraries are async
Async workload is everywhere
async programming (promises, futures)
async backend processing
async commits in postgres
async IO in Linux (epoll, io_uring)
async replication
async OS fsync (filesystem cache)
Push
real-time
the client must be online (connected to the server)
the client must be able to handle the load
polling is preferred for light clients.
used by RabbitMQ (clients consume the queues, and the messages are pushed to the clients)
client connects to a server
server sends data to the client
client doesn't have to request anything
protocol must be bidirectional
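a sketch of the push steps above over a local socket pair: once connected, the client only reads, and the server decides when to send. The newline-delimited messages and the `price=...` payloads are made up for illustration.

```python
import socket
import threading

client, server = socket.socketpair()  # stands in for "client connects to a server"


def push_server(conn, messages):
    """Hypothetical server: pushes each message as soon as it has one."""
    for msg in messages:
        conn.sendall(msg.encode("utf-8") + b"\n")
    conn.close()  # closing ends the stream for this demo


pusher = threading.Thread(
    target=push_server, args=(server, ["price=10", "price=11", "price=12"])
)
pusher.start()

received = []
with client.makefile("r", encoding="utf-8") as stream:
    for line in stream:  # the client never sends a request; it just consumes
        received.append(line.rstrip("\n"))

pusher.join()
client.close()
print(received)
```

note that the client has no say in the rate: if it cannot keep up, messages pile up in buffers, which is the "must be able to handle the load" point above.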
Short Polling
used when a request takes a long time to process (e.g., uploading a video); very simple to build
however, it can be too chatty and use too much network bandwidth and backend resources
client sends a request
server responds immediately with a handle
server continues to process the request
client uses that handle to check for status
the polls are multiple short request-response cycles
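an in-process sketch of the handle-and-status flow above. `JobServer`, its `submit` and `status` methods, and the timings are hypothetical; in a real system each call would be its own short request over the network.

```python
import threading
import time
import uuid


class JobServer:
    """Hypothetical backend that processes jobs in the background."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, seconds):
        """Respond immediately with a handle while work continues."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self._jobs[job_id] = {"done": False, "result": None}
        threading.Thread(target=self._work, args=(job_id, seconds), daemon=True).start()
        return job_id

    def _work(self, job_id, seconds):
        time.sleep(seconds)  # stands in for the long-running processing
        with self._lock:
            self._jobs[job_id] = {"done": True, "result": "video processed"}

    def status(self, job_id):
        """One short poll: answers right away, ready or not."""
        with self._lock:
            return dict(self._jobs[job_id])


server = JobServer()
handle = server.submit(0.2)

polls = 0
while True:
    polls += 1
    state = server.status(handle)
    if state["done"]:
        break
    time.sleep(0.05)  # poll interval: shorter means chattier

print(polls, state["result"])
```

most of the polls come back with "not done yet", which is where the wasted bandwidth and backend work comes from.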
Long Polling
a poll request where the server only responds when the job is ready (used when a request takes a long time to process and the result is not needed in real time)
used by Kafka
client sends a request
server responds immediately with a handle
server continues to process the request
client uses that handle to check for status
server does not reply until it has the response (subject to timeouts)
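the same idea as a sketch where the status check is held open until the job is ready or a timeout expires. `LongPollServer`, `poll`, and the timings are assumptions made for the demo; the blocking `Event.wait` stands in for a request the server has not answered yet.

```python
import threading
import time


class LongPollServer:
    """Hypothetical backend whose poll() call waits for the result."""

    def __init__(self):
        self._jobs = {}

    def submit(self, seconds):
        """Respond immediately with a handle while work continues."""
        handle = f"job-{len(self._jobs) + 1}"
        job = {"done": threading.Event(), "result": None}
        self._jobs[handle] = job
        threading.Thread(target=self._work, args=(job, seconds), daemon=True).start()
        return handle

    def _work(self, job, seconds):
        time.sleep(seconds)  # stands in for the long-running processing
        job["result"] = "report ready"
        job["done"].set()

    def poll(self, handle, timeout):
        """Hold the request until the job finishes or the timeout expires."""
        job = self._jobs[handle]
        if job["done"].wait(timeout):
            return job["result"]
        return None  # timed out: the client is expected to poll again


server = LongPollServer()
handle = server.submit(0.3)

polls = 0
result = None
while result is None:
    polls += 1
    result = server.poll(handle, timeout=0.2)

print(polls, result)
```

compared with short polling, the client issues far fewer requests for the same job, at the cost of the server keeping each request open.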
Server-Sent Events
one request with a long response, but the client must be online and able to handle the response
a response has a start and an end
client sends a request
server sends logical events as part of response
server never writes the end of the response
it's still a request but an unending response
client parses the stream data
works with HTTP
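a sketch of the client-side parsing, assuming the `text/event-stream` wire format (fields such as `event:` and `data:`, with a blank line ending each event). The `event_stream` generator is a hypothetical stand-in for the response body, and it is finite only so the demo terminates; chunk boundaries deliberately do not line up with event boundaries.

```python
def event_stream():
    """Hypothetical response body, delivered in arbitrary chunks."""
    yield "event: progress\nda"
    yield "ta: 10\n\nevent: progress\ndata: 55\n\n"
    yield "data: first line\ndata: second line\n\n"


def parse_events(chunks):
    """Buffer incoming chunks and emit one dict per complete event."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n\n" in buffer:  # a blank line marks the end of an event
            raw, buffer = buffer.split("\n\n", 1)
            name, data = "message", []  # "message" is the default event type
            for line in raw.split("\n"):
                if line.startswith(":"):
                    continue  # comment line, often used as a keep-alive
                field, _, value = line.partition(":")
                if value.startswith(" "):
                    value = value[1:]
                if field == "event":
                    name = value
                elif field == "data":
                    data.append(value)
            yield {"event": name, "data": "\n".join(data)}


events = list(parse_events(event_stream()))
for event in events:
    print(event)
```

the response never needs an "end": each blank line is a logical boundary, so the client can act on events while the response is still open.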
Publish Subscribe (Pub/Sub)
one publisher has many readers (and there can be many publishers)
relevant when there are many servers (e.g., upload, compress, format, notification)
great for microservices as it scales with multiple receivers
loose coupling (clients are not connected to each other, and it works even while some clients are not running)
however, you cannot know if the consumer/subscriber got the message or got it twice, etc.
also, it might result in network saturation and extra complexity
used by RabbitMQ and Kafka
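a toy in-memory broker to make the decoupling concrete. `Broker`, the topic name `video.uploaded`, and the subscriber names are hypothetical; this is not how RabbitMQ or Kafka are implemented, and there are no acknowledgements, so the publisher cannot tell whether a subscriber ever consumed a message.

```python
from collections import defaultdict, deque


class Broker:
    """Hypothetical broker: one queue per (topic, subscriber)."""

    def __init__(self):
        self._queues = defaultdict(dict)  # topic -> {subscriber: deque}

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        """The publisher only talks to the broker, never to subscribers."""
        for queue in self._queues[topic].values():
            queue.append(message)

    def consume(self, topic, subscriber):
        """A subscriber drains its own queue whenever it happens to be running."""
        queue = self._queues[topic][subscriber]
        messages = list(queue)
        queue.clear()
        return messages


broker = Broker()
broker.subscribe("video.uploaded", "compress")
broker.subscribe("video.uploaded", "notify")

# Published while neither subscriber is consuming.
broker.publish("video.uploaded", {"id": 42})
broker.publish("video.uploaded", {"id": 43})

compress_msgs = broker.consume("video.uploaded", "compress")
notify_msgs = broker.consume("video.uploaded", "notify")
print(compress_msgs, notify_msgs)
```

each subscriber gets its own copy of every message, so adding a new service means one more `subscribe` call and no change to the publisher.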
Multiplexing vs. Demultiplexing
used by HTTP/2, QUIC, connection pool, MPTCP
connection pooling is a technique where you spin up several backend connections and keep them "hot"
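a sketch of multiplexing several logical streams onto one connection and demultiplexing them on the other side. The frame layout (2-byte stream id, 2-byte length, payload) and the 4-byte chunk size are invented for illustration; HTTP/2 and QUIC also tag frames with a stream id, but their real frame formats differ.

```python
import struct
from collections import defaultdict

HEADER = struct.Struct("!HH")  # hypothetical frame header: stream id, payload length


def multiplex(streams, chunk_size=4):
    """Interleave chunks from every stream, round-robin, into one byte sequence."""
    pending = dict(streams)
    wire = bytearray()
    while pending:
        for sid in list(pending):
            chunk, rest = pending[sid][:chunk_size], pending[sid][chunk_size:]
            wire += HEADER.pack(sid, len(chunk)) + chunk
            if rest:
                pending[sid] = rest
            else:
                del pending[sid]
    return bytes(wire)


def demultiplex(wire):
    """Walk the frames and reassemble each stream by its id."""
    streams = defaultdict(bytearray)
    offset = 0
    while offset < len(wire):
        sid, length = HEADER.unpack_from(wire, offset)
        offset += HEADER.size
        streams[sid] += wire[offset:offset + length]
        offset += length
    return {sid: bytes(data) for sid, data in streams.items()}


requests = {1: b"GET /index.html", 3: b"GET /style.css"}
wire = multiplex(requests)
restored = demultiplex(wire)
print(restored)
```

because frames from different streams are interleaved, a large transfer on one stream does not have to finish before another stream makes progress on the same connection.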
Stateful vs. Stateless backends
a very contentious topic: is state stored in the backend? how do you rely on the state of an application, system, or protocol?
stateful backend: stores state about clients in its memory and depends on that information being there
stateless backend: the client is responsible for "transferring the state" with every request (the backend may store it but can safely lose it).
stateless backends can still store data somewhere else
the backend remains stateless but the system is stateful (can you restart the backend during idle time while the client workflow continues to work?)
the server generates a session, stores it locally, and returns it to the user
on later requests, the server checks whether the session is in memory to authenticate the client and respond
if the backend is restarted, the sessions are gone (it never relied on a database)
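a sketch of the stateful session flow above, with the restart simulated by creating a fresh object. `StatefulBackend`, `login`, and `whoami` are hypothetical names.

```python
import secrets


class StatefulBackend:
    """Hypothetical backend that keeps sessions only in process memory."""

    def __init__(self):
        self._sessions = {}

    def login(self, user):
        """Generate a session, store it locally, and return it to the user."""
        session_id = secrets.token_hex(16)
        self._sessions[session_id] = user
        return session_id

    def whoami(self, session_id):
        """Authenticate by checking whether the session is in memory."""
        return self._sessions.get(session_id)


backend = StatefulBackend()
session_id = backend.login("alice")
before_restart = backend.whoami(session_id)

backend = StatefulBackend()  # simulate a restart: memory starts out empty
after_restart = backend.whoami(session_id)

print(before_restart, after_restart)
```

the client still holds a valid-looking session id after the restart, but the backend no longer recognizes it; moving `_sessions` to a database would make the backend stateless while the system as a whole stays stateful.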
Stateless vs. Stateful protocols
protocols can be designed to store state
TCP is stateful: sequences, connection file descriptor
UDP is stateless: DNS sends a query ID in UDP to identify queries
QUIC is stateful, but because it sends a connection ID to identify the connection, it transfers the state across the protocol
you can build a stateless protocol on top of a stateful one and vice versa (e.g., HTTP on top of TCP, with cookies)
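a sketch of how a query ID lets a DNS client match replies without any connection state in the transport. The 12-byte header layout (id, flags, four counts) follows the DNS message format; the fixed IDs, the fake replies, and the `pending` table are assumptions for the demo (real resolvers pick random IDs), and nothing is sent over the network.

```python
import struct

DNS_HEADER = struct.Struct("!HHHHHH")  # id, flags, qdcount, ancount, nscount, arcount


def build_query(name, query_id):
    """Build a minimal DNS query for an A record."""
    header = DNS_HEADER.pack(query_id, 0x0100, 1, 0, 0, 0)  # RD flag, one question
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def query_id_of(packet):
    """The ID is the first 16 bits of any DNS message."""
    return DNS_HEADER.unpack_from(packet)[0]


# The client, not UDP, remembers which queries are outstanding.
pending = {0x1A2B: "example.com", 0x3C4D: "example.org"}
queries = [build_query(name, qid) for qid, name in pending.items()]

# Fake replies (header only, QR bit set), arriving in the opposite order.
replies = [DNS_HEADER.pack(qid, 0x8180, 1, 0, 0, 0) for qid in (0x3C4D, 0x1A2B)]

matched = [pending.pop(query_id_of(reply)) for reply in replies]
print(matched)
```

the state lives in the message and in the client's own table, which is what lets DNS run over a transport that remembers nothing between datagrams.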
Completely stateless systems
stateless systems are very rare
state is carried with every request
a backend service that relies completely on the input
JWT (JSON Web Token): everything is in the token, and the backend cannot mark it as invalid without storing extra state (e.g., a revocation list)
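a simplified sketch of the idea behind a signed token: the backend verifies the token from its contents and a shared secret, with no lookup. This is not a real JWT (no header, no expiry, no standard claims); `SECRET`, `issue_token`, and `verify_token` are hypothetical, and production code should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; any backend instance holding it can verify


def _sign(payload):
    return base64.urlsafe_b64encode(hmac.new(SECRET, payload, hashlib.sha256).digest())


def issue_token(claims):
    """Encode the claims and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return (payload + b"." + _sign(payload)).decode("ascii")


def verify_token(token):
    """Return the claims if the signature matches, else None. No stored state is consulted."""
    payload, _, signature = token.encode("utf-8").partition(b".")
    if not hmac.compare_digest(signature, _sign(payload)):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user": "alice", "role": "admin"})
claims = verify_token(token)

# Swap in a different payload but keep the original signature.
forged_payload = issue_token({"user": "mallory", "role": "admin"}).split(".")[0]
forged = forged_payload + "." + token.split(".")[1]
forged_claims = verify_token(forged)

print(claims, forged_claims)
```

nothing in `verify_token` can tell that a token was revoked: as long as the signature checks out, it is accepted, which is the trade-off noted above.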
Sidecar pattern
every protocol requires a library, but changing the library is hard because the app is entrenched in it and changes can break backward compatibility
sidecar pattern is the idea of delegating communication through a proxy with a rich library (and the client has a thin library)
in this case, every client has a sidecar proxy
pros: it's language agnostic, provides extra security, service discovery, caching.
cons: complexity, latency
service mesh proxies (Linkerd, Istio, Envoy)
sidecar proxy container (must be layer 7 proxy)