Problem statement
Usually model inference is expensive, especially for large models.
Motivation
With the caching feature, we can serve repeated requests from the cache instead of re-running inference, reducing cost and latency.
Proposed Design
Phase 0
We allow users to enable the cache feature for their models.
Enable cache
All cache parameters are optional. By default, the cache is disabled. If the cache is enabled, the system will try to read the cache-related config; if the config is not present, the system will use the default values for the parameters below.
Config parameters
eviction_policy - determines how items are evicted when the capacity limit is reached.
ttl - determines how long an item is held by the cache.
capacity - a soft limit on the cache volume; it is overridden by the hard limit if it is larger than the hard limit.
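For illustration, enabling the cache with an explicit config might look like the request below. The cache_config object name and the parameter values (LRU, seconds, item count) are assumptions for this sketch; only cache_enabled and the eviction_policy/ttl/capacity parameter names come from the design above.

PUT /_plugins/_ml/models/<model_id>
{
  "cache_enabled": true,
  "cache_config": {
    "eviction_policy": "LRU",
    "ttl": 3600,
    "capacity": 10000
  }
}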
Disable cache

PUT /_plugins/_ml/models/<model_id>
{
  "cache_enabled": false
}

Disabling the cache for a model removes all data associated with that model from the cache.
Update cache
Storage
We leverage an OpenSearch index to store the data.
Cache key
Model id + model config + user input
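As a minimal sketch, a cached entry in the index might look like the document below. The field names (cache_key, response) and the hashing of the key components are assumptions for illustration; only the key components (model id, model config, user input) and the create_time field used for ttl calculation in Phase 1 come from this proposal.

{
  "cache_key": "<hash of model_id + model_config + user_input>",
  "model_id": "<model_id>",
  "response": "<cached model output>",
  "create_time": 1700000000000
}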
Cleanup
Security
We leverage the existing model permission controls for cache access control.
Phase 1
We introduce new cache APIs as a standalone cache service; a hypothetical sketch of these APIs follows the list below.
API
Create cache
Get cache meta
Delete cache
Cache set
Set stores a create_time field automatically for ttl calculation.
Cache get
If the ttl has expired, get returns null and removes the key from the cache.
Cache delete
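To make the list above concrete, here is a hypothetical sketch of the cache APIs. The /_plugins/_ml/cache paths, request bodies, and field names are assumptions for illustration; the proposal only specifies the operations themselves and the create_time/ttl behavior.

# Create a cache with the Phase 0 config parameters
POST /_plugins/_ml/cache
{
  "name": "model_response_cache",
  "eviction_policy": "LRU",
  "ttl": 3600,
  "capacity": 10000
}

# Get cache meta
GET /_plugins/_ml/cache/<cache_id>

# Delete the whole cache
DELETE /_plugins/_ml/cache/<cache_id>

# Cache set - a create_time field is stored automatically for ttl calculation
POST /_plugins/_ml/cache/<cache_id>/_set
{
  "key": "<cache_key>",
  "value": "<cached model response>"
}

# Cache get - returns null and removes the key if the ttl has expired
GET /_plugins/_ml/cache/<cache_id>/_get?key=<cache_key>

# Cache delete - remove a single key
DELETE /_plugins/_ml/cache/<cache_id>/_delete?key=<cache_key>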
Cache types
Local cache
We build the cache on top of OpenSearch index functionality. To simplify the design, we don’t introduce a new distributed cache such as Redis or Memcached into the cluster; we use an OpenSearch index as the store for caching.
Remote cache
We leverage the existing connector mechanism to access a remote cache service such as ElastiCache to build a remote cache for customers. This requires a new connector type for caches rather than models: since we don’t need a predict action, the connector needs get and set actions instead.
An example connector
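The following is a minimal sketch modeled on the shape of existing ml-commons connectors. The cache_get/cache_set action types, the endpoint URLs, and the credential fields are assumptions; only the idea of a cache connector with get/set actions comes from the text above.

POST /_plugins/_ml/connectors/_create
{
  "name": "remote-cache-connector",
  "description": "Hypothetical connector to a remote cache service",
  "version": 1,
  "protocol": "http",
  "credential": {
    "access_key": "<access_key>",
    "secret_key": "<secret_key>"
  },
  "actions": [
    {
      "action_type": "cache_get",
      "method": "GET",
      "url": "https://<remote-cache-endpoint>/get?key=${parameters.key}"
    },
    {
      "action_type": "cache_set",
      "method": "POST",
      "url": "https://<remote-cache-endpoint>/set",
      "request_body": "{ \"key\": \"${parameters.key}\", \"value\": \"${parameters.value}\" }"
    }
  ]
}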
Use case example
Cache for models
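As a hypothetical end-to-end example: enable the cache on a model, then send the same predict request twice; the first call runs inference and stores the result keyed by model id + model config + user input, and the identical second call is served from the cache. The _predict body shown here is illustrative and depends on the model’s connector.

# Enable the cache on the model
PUT /_plugins/_ml/models/<model_id>
{
  "cache_enabled": true
}

# First call runs inference and populates the cache;
# an identical second call is answered from the cache
POST /_plugins/_ml/models/<model_id>/_predict
{
  "parameters": {
    "prompt": "What is OpenSearch?"
  }
}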
@jngz-es, this feature looks good. Several questions:
Are we going to support exact match or semantic match? If only exact match is supported, do we have an expected hit rate?
Do we need to support enabling or disabling cache reads on the fly? E.g., I might not want cached data for a question because I’m seeking a different answer.
Do we need to add a user_id (if a user_id is present) to the cache key to avoid leaking private data?