Hedged requests: send the same request to multiple servers, and use whichever response comes back first. To avoid doubling or tripling your computation load, though, don’t send the hedged requests straight away:
defer sending a secondary request until the first request has been outstanding for more than the 95th-percentile expected latency for this class of requests. This approach limits the additional load to approximately 5% while substantially shortening the tail latency.
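A minimal sketch of the deferred hedge described above, using Python's `asyncio`. The names `hedged_request`, `send`, `primary`, `backup` and `hedge_delay` are illustrative assumptions, not part of any existing code base; `hedge_delay` stands in for the 95th-percentile latency of the request class. Error handling is deliberately omitted: if the first request to finish raised an exception, this sketch re-raises it rather than waiting for the other one.

```python
import asyncio


async def hedged_request(send, primary, backup, hedge_delay):
    """Send to `primary`; if no reply within `hedge_delay` seconds, also send to `backup`.

    `send` is a hypothetical coroutine function taking a server and returning a response.
    Returns the result of whichever request finishes first.
    """
    tasks = [asyncio.create_task(send(primary))]
    try:
        # Give the primary request a head start of hedge_delay seconds.
        done, _ = await asyncio.wait(tasks, timeout=hedge_delay)
        if not done:
            # Primary is slower than the threshold: start the hedged request
            # and take whichever of the two completes first.
            tasks.append(asyncio.create_task(send(backup)))
            done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        # Cancel the loser (a no-op for tasks that already finished).
        for task in tasks:
            task.cancel()
```

Cancelling the slower request only stops the client from waiting; whether the server also stops working on it depends on the transport and is outside this sketch.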
Note that different kinds of queries may have different latency histograms (e.g. a plain search, a faceted search, or spelling suggestions). Each kind needs its own latency threshold when deciding whether to start a second request.
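One way to keep the thresholds separate is to track recent latencies per query class and read the 95th percentile from each class's own samples. The sketch below is only an illustration: the class name `HedgeThresholds`, its parameters, the sliding-window size and the fallback delay are all assumptions, and the percentile uses the simple nearest-rank method over a sorted window.

```python
from collections import defaultdict, deque


class HedgeThresholds:
    """Per-query-class hedge delays based on a sliding window of observed latencies."""

    def __init__(self, window=1000, percentile=95, default_delay=0.1):
        self.percentile = percentile        # integer percent, e.g. 95
        self.default_delay = default_delay  # used until a class has samples (assumed value)
        self._samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, query_class, latency):
        """Store one observed latency (in seconds) for the given query class."""
        self._samples[query_class].append(latency)

    def delay_for(self, query_class):
        """Return the nearest-rank percentile latency for this class."""
        samples = self._samples.get(query_class)
        if not samples:
            return self.default_delay
        ordered = sorted(samples)
        # Nearest rank: ceil(n * p / 100), computed with integer arithmetic.
        rank = (len(ordered) * self.percentile + 99) // 100
        return ordered[rank - 1]
```

A caller would record the latency of every completed request under its class (for example `"plain"`, `"faceted"` or `"spelling"`, names chosen here for illustration) and pass `delay_for(query_class)` as the hedge delay for the next request of that class.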
The hedged-requests passage above is quoted from http://blog.acolyer.org/2015/01/15/the-tail-at-scale/.