Thundering herd avoiding LoaderFunc on misses #26

Open

comunidadio opened this issue Dec 24, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@comunidadio

Clarification and motivation

My go-to cache library for Go so far has been bluele/gcache - probably much less performant than many other alternatives, but it has a very handy helper that coordinates loads to avoid the thundering herd problem.
https://pkg.go.dev/github.com/bluele/gcache#readme-loading-cache
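For reference, the gcache loading-cache usage looks roughly like this (sketched from the README linked above; exact details may differ between versions):

package main

import (
	"fmt"

	"github.com/bluele/gcache"
)

// fetchFromBackend is a stand-in for the real data source.
func fetchFromBackend(key string) (interface{}, error) {
	return "value for " + key, nil
}

func main() {
	gc := gcache.New(100).
		LRU().
		LoaderFunc(func(key interface{}) (interface{}, error) {
			// Called once per missing key; concurrent Get calls for
			// the same key wait for this single load to finish.
			return fetchFromBackend(key.(string))
		}).
		Build()

	value, err := gc.Get("user:42") // triggers the loader on a miss
	fmt.Println(value, err)
}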

Acceptance criteria

The builder could have a new WithLoaderFunc(func(K) V).
Calling Get(key_not_loaded_yet) from multiple goroutines concurrently would call the optional LoaderFunc once and then return the loaded value to all goroutines at once.
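For illustration only, a hypothetical sketch of what that could look like (WithLoaderFunc is the proposed addition; the builder shape, User, and loadUserFromDB are assumptions, not otter's current API):

// Hypothetical sketch only: WithLoaderFunc is the proposed addition, and
// User and loadUserFromDB stand in for the application's own types.
cache, _ := otter.MustBuilder[string, User](10_000).
	WithLoaderFunc(func(key string) User {
		// Invoked at most once per missing key; concurrent Get calls
		// for that key block here and then all receive this value.
		return loadUserFromDB(key)
	}).
	Build()

// All ten goroutines together trigger a single load for "alice".
for i := 0; i < 10; i++ {
	go func() { cache.Get("alice") }()
}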

@comunidadio comunidadio added the enhancement New feature or request label Dec 24, 2023
@maypok86
Owner

Lol, looks like I woke up famous. 😅

@comunidadio yes, it's very likely otter will get that feature as well. Right now I'm following the caffeine author's advice: "after reaching 50-60 million qps of throughput you can focus on improving the hit ratio and adding features". Otter already reaches 50 million qps on the caffeine benchmarks, and an upcoming update will push it to 150+ million qps (thanks Ben), which is much closer to caffeine's performance. After that I planned to finally comment my code, because I completely ignored that during the third rewrite of the project. And then the features will come: callbacks, a constant-TTL expiration policy, and loader functions too.

@ben-manes

I believe you meant a cache stampede, which is often confused with the thundering herd problem but is slightly different. It's important to note that if you also use a remote cache, the problem exists there too, across servers instead of virtual threads. That can be solved by using a soft/hard expiration, via probabilistic early expiration or scaled TTLs.
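For the remote-cache case, a common form of probabilistic early expiration is the "XFetch" rule; a generic Go sketch, not tied to any particular cache library:

import (
	"math"
	"math/rand"
	"time"
)

// shouldRefreshEarly is a generic sketch of probabilistic early
// expiration ("XFetch"): each reader may volunteer to recompute a value
// shortly before its hard TTL, so a hot key is refreshed by one caller
// instead of stampeding the backend the moment it expires. delta is the
// observed cost of recomputing the value; beta > 1 refreshes earlier.
func shouldRefreshEarly(expiry time.Time, delta time.Duration, beta float64) bool {
	early := time.Duration(-float64(delta) * beta * math.Log(rand.Float64()))
	return time.Now().Add(early).After(expiry)
}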

For a local cache you might consider supporting both blocking and non-blocking singular and batch loads. The synchronous variant can also be strengthened to support linearizable computations, e.g. computing a new value from the previous one as a blocking write operation. That makes it very flexible to adapt the cache to the application's needs and facilitates features like entry pinning. You can also perform implicit asynchronous loads, like automatically refreshing a popular entry before it expires to avoid the latency spike if it were evicted. The batch asynchronous load can also be made implicit by coalescing independent singular loads over a short duration. The backoff, timeout, retry, and fallback behavior of these loads can be delegated to a resilience library.
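As a rough, hypothetical sketch of such a loader contract in Go (the interface and method names are illustrative, not an existing otter API):

import "context"

// Hypothetical loader contract covering the blocking and non-blocking,
// singular and batch variants described above.
type Loader[K comparable, V any] interface {
	// Load blocks the caller; concurrent callers for the same key wait
	// on a single invocation (the synchronous, linearizable variant).
	Load(ctx context.Context, key K) (V, error)

	// LoadAll serves explicit batch lookups and can also back implicit
	// coalescing of independent singular loads over a short window.
	LoadAll(ctx context.Context, keys []K) (map[K]V, error)

	// Reload supports implicit refresh-ahead: recompute a popular
	// entry before it expires, given the previously cached value.
	Reload(ctx context.Context, key K, oldValue V) (V, error)
}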

@atombender

atombender commented Nov 1, 2024

I think coalescing would make sense to add to this library, as it's needed to write read-through caching code. I'm in the market for a better cache library for Go, but the candidates I've looked at generally don't have read-through support.

You can cobble it together yourself (here using the golang.org/x/sync/singleflight package):

import "golang.org/x/sync/singleflight"

var sfg singleflight.Group

func Get(key string) (any, error) {
	// Every lookup funnels through singleflight, even on a hit.
	val, err, _ := sfg.Do(key, func() (any, error) {
		val, ok := cache.Get(key)
		if ok {
			return val, nil
		}
		// ... fetch ...
		cache.Set(key, val)
		return val, nil
	})
	return val, err
}

However, since this code executes outside the cache, it has to grab the coalescing lock for the key before you even know whether the key is cached. So you really want something like:

var sfg singleflight.Group

func Get(key string) (any, error) {
	// Fast path: return a cached value without touching singleflight.
	val, ok := cache.Get(key)
	if ok {
		return val, nil
	}

	// Miss: coalesce concurrent loads of the same key.
	val, err, _ := sfg.Do(key, func() (any, error) {
		// Re-check: another goroutine may have loaded it meanwhile.
		val, ok := cache.Get(key)
		if ok {
			return val, nil
		}

		// ... fetch ...

		cache.Set(key, val)
		return val, nil
	})
	return val, err
}

…but this probably locks more than necessary. If the cache itself owns all the locking, it can likely be made more efficient.

Serving stale data on error is another feature that is lovely to have, and it dovetails with the above. If the "fetch" function fails, the cache should still be able to serve the previous value for a while, but only up to a certain maximum staleness.
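A minimal sketch of serve-stale-on-error layered on the singleflight example above, assuming the cache's own TTL is at least freshFor+maxStale so stale entries survive long enough to be used as a fallback (staleEntry, freshFor, maxStale, and fetch are illustrative names):

import "time"

type staleEntry struct {
	val      any
	storedAt time.Time
}

const (
	freshFor = 1 * time.Minute  // serve without refetching
	maxStale = 10 * time.Minute // oldest value still served on fetch error
)

func getWithStaleFallback(key string) (any, error) {
	val, err, _ := sfg.Do(key, func() (any, error) {
		prev, ok := cache.Get(key)
		if ok && time.Since(prev.(staleEntry).storedAt) < freshFor {
			return prev.(staleEntry).val, nil
		}

		fresh, err := fetch(key)
		if err != nil {
			// Fetch failed: fall back to the stale value if it is not
			// too old, otherwise surface the error.
			if ok && time.Since(prev.(staleEntry).storedAt) < maxStale {
				return prev.(staleEntry).val, nil
			}
			return nil, err
		}

		cache.Set(key, staleEntry{val: fresh, storedAt: time.Now()})
		return fresh, nil
	})
	return val, err
}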

@douyux

douyux commented Dec 28, 2024

During the execution of fetch and set inside singleflight, a concurrent delete or clear operation may leave dirty values in the cache.
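To make the race concrete, here is an illustrative interleaving against the singleflight sketch above:

goroutine A: cache miss, enters sfg.Do, starts the fetch and reads value v1
goroutine B: writes v2 to the backing store, then deletes (or clears) the key in the cache
goroutine A: the fetch returns v1, cache.Set(key, v1) runs

The cache now holds v1 even though the source of truth is v2; nothing orders the Set inside the loader with respect to the delete.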

@ben-manes

The simplest approach is to use lock striping to guard the write operations and block other writes during the load. This requires some tuning since distinct keys can hash to the same lock. That's typically fine because reads don't need to lock, writes are rarer in a cache with a good hit rate, and the maximum size hints at a reasonable number of locks to allocate. Now that Go supports weak references, a per-key lock using dynamic striping is also possible.

Because a delete must acquire this lock, it maintains linearizability and avoids the problem with the singleflight example above.
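A generic lock-striping sketch under those assumptions (cache, fetch, and Delete are the illustrative names from the sketches above, not otter's internals):

import (
	"hash/fnv"
	"sync"
)

// Writes for a key (loads, sets, deletes) serialize on the stripe the
// key hashes to, while plain reads stay lock-free.
var stripes [64]sync.Mutex

func lockFor(key string) *sync.Mutex {
	h := fnv.New32a()
	h.Write([]byte(key))
	return &stripes[h.Sum32()%uint32(len(stripes))]
}

func loadOrGet(key string) (any, error) {
	if v, ok := cache.Get(key); ok {
		return v, nil
	}
	mu := lockFor(key)
	mu.Lock()
	defer mu.Unlock()
	if v, ok := cache.Get(key); ok { // re-check under the stripe lock
		return v, nil
	}
	v, err := fetch(key)
	if err != nil {
		return nil, err
	}
	cache.Set(key, v)
	return v, nil
}

func deleteKey(key string) {
	// A delete takes the same stripe, so it cannot interleave with the
	// fetch+set of the same key above.
	mu := lockFor(key)
	mu.Lock()
	defer mu.Unlock()
	cache.Delete(key)
}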
