Design Doc: Caching compressed content in PSOL
Joshua Marantz, 2013-01-15
Currently, all content is stored uncompressed in our CacheInterface. The benefits of smartly compressing data in our cache are:
- Better use of available cache space, improving hit rate
- Reduced overhead transferring data from disk or network
- Increased likelihood of L1 hits because data is smaller
- Reduced CPU time when serving output resources to clients that can accept gzip
Drawbacks include:
- Increased CPU time compressing/decompressing some cache entries that must be interpreted in the clear, including input resources, pcache, and metadata
- Increased code complexity
There are several categories of content we store in the cache:
- Textual input resources (CSS, JS) and compressible images (GIF)
- Already-compressed input images (jpg, png)
- Textual output resources (CSS, JS). [We don’t store GIF in our output cache]
- Already-compressed output images (jpg, png, webp)
- Metadata protobufs (some of which have inlined image/css/js data)
- Property Cache data
- Client Cache data
- Device Cache data
This data is all stored uncompressed.
I would like to pursue an incremental strategy. We would start by compressing only OutputResource objects with content-types other than png/jpeg/webp, at the time they are generated, and adding the proper Content-Encoding header in the HTTPValue. Clients that accept gzip could then be served without decompressing anything. We’d also need to verify that Apache serves cleartext to clients that don’t send Accept-Encoding:gzip; if we observe that Apache does not do that automatically, we would inflate the content ourselves while serving.
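The write and serve paths of this strategy can be sketched as follows. This is a minimal illustration in Python, not PSOL code: `CachedResponse`, `store_output_resource`, `accepts_gzip`, and `serve` are hypothetical names, and a plain dataclass stands in for HTTPValue.

```python
import gzip
from dataclasses import dataclass, field

# Content types that are already compressed; gzipping them again wastes CPU.
ALREADY_COMPRESSED = {"image/png", "image/jpeg", "image/webp"}


@dataclass
class CachedResponse:
    """Hypothetical stand-in for an HTTPValue: headers plus body bytes."""
    headers: dict = field(default_factory=dict)
    body: bytes = b""


def store_output_resource(content_type: str, body: bytes) -> CachedResponse:
    """Build the cache entry, gzipping the body unless it is already compressed."""
    headers = {"Content-Type": content_type}
    if content_type not in ALREADY_COMPRESSED:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return CachedResponse(headers=headers, body=body)


def accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check: ignores q-values such as 'gzip;q=0'."""
    tokens = [t.split(";")[0].strip().lower() for t in accept_encoding.split(",")]
    return "gzip" in tokens


def serve(entry: CachedResponse, accept_encoding: str = "") -> CachedResponse:
    """Return the entry untouched when possible; inflate only for non-gzip clients."""
    if entry.headers.get("Content-Encoding") != "gzip" or accepts_gzip(accept_encoding):
        return entry
    headers = {k: v for k, v in entry.headers.items() if k != "Content-Encoding"}
    return CachedResponse(headers=headers, body=gzip.decompress(entry.body))
```

In this sketch the common case (a client that accepts gzip) returns the cached bytes unchanged, so the only decompression cost falls on the fallback path.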
In MPS I think this would be an instant win, and we should be able to observe this in our load-tests as an improved L1 cache hit rate.
We’d want to make the new cache compression flag-controlled so we can take measurements and make judgements.
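One way to picture flag-controlled compression with measurement is the sketch below. It is an assumption-laden illustration in Python, not the PSOL CacheInterface: `MeasuredCache` and its `compress_entries` flag are hypothetical names, and a dict stands in for the real cache backend.

```python
import gzip


class MeasuredCache:
    """Hypothetical dict-backed cache with a compression flag and byte counters."""

    def __init__(self, compress_entries: bool):
        self.compress_entries = compress_entries  # the flag under test
        self.raw_bytes = 0     # total bytes handed to put()
        self.stored_bytes = 0  # total bytes actually written to the cache
        self._entries = {}

    def put(self, key: str, value: bytes) -> None:
        stored = gzip.compress(value) if self.compress_entries else value
        self.raw_bytes += len(value)
        self.stored_bytes += len(stored)
        # Remember whether this entry was compressed so get() stays correct
        # even if the flag is flipped later.
        self._entries[key] = (self.compress_entries, stored)

    def get(self, key: str):
        entry = self._entries.get(key)
        if entry is None:
            return None
        compressed, stored = entry
        return gzip.decompress(stored) if compressed else stored
```

Running the same workload with the flag on and off and comparing `stored_bytes` against `raw_bytes` gives a rough space-savings figure to weigh against the added CPU time.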
mod_pagespeed now integrates compressed caching of metadata, though the capability is not yet advertised to users. Compressed caching of HTTP is a different problem and has not yet been implemented.