From e462456e5ad063ea193fbde1ae8cef6a312d13ff Mon Sep 17 00:00:00 2001
From: Thibault Charbonnier
Date: Sat, 17 Mar 2018 01:34:15 -0700
Subject: [PATCH] fix(*) use dedicated shm for rate-limiting plugins

Addresses the issue discussed in #3124 and #3241. This is part of a
series of fixes to address those errors.

Context
-------

In the `local` mode of the rate-limiting plugins, storing the
rate-limiting counters in the same shm used by Kong's database cache is
too invasive for the underlying shm, especially when the rate-limiting
plugins are used with a `seconds` precision.

On top of exhausting the database cache slots, this approach also
generates some form of fragmentation in the shm. This is due to the
side-by-side storage of values with sizes of different orders of
magnitude (JSON strings vs. an incremented double) and the LRU eviction
mechanism. When the shm is full and LRU kicks in, it is highly probable
that several rate-limiting counters will be evicted (due to their
proliferation), thus not freeing enough space to store the retrieved
data and causing a `no memory` error to be reported by the shm.

Solution
--------

Declaring shms that are only used by some plugins is not very elegant:
all users (even those not using the rate-limiting plugins) now have to
pay a memory cost, although a small one. Unfortunately, in the absence
of a more dynamic solution to shm configuration, such as a more dynamic
templating engine or a `configure_by_lua` phase, this is the safest
solution.

Size rationale
--------------

Running a script generating similar keys and storing similar values
(doubles) indicates that a 12MB shm should be able to store about
~48,000 of those values at once. It is important to remind ourselves
that one Consumer/IP address might use more than one key (in fact, one
per period configured on the plugin), and might use both the
rate-limiting and response-ratelimiting plugins at once; both plugins
use the same shm.

Even considering the above statements, ~48,000 keys per node seems
reasonable, considering that keys of `second` precision will most
likely fill up the shm and be candidates for LRU eviction. Our concern
lies instead with long-lived limits (and thus, keys) set by the user.

Additionally, a future improvement will be to set the `init_ttl`
argument for the rate-limiting keys, which will help **quite
considerably** in reducing the footprint of the plugins on the shm. As
of today, this feature has been contributed to ngx_lua but not released
yet:
https://github.com/openresty/lua-nginx-module/pull/1226

Again, this limit only applies when using the **local** strategy, which
also likely means that a load balancer is distributing traffic to a
pool of Kong nodes with some sort of consistent load-balancing
technique, thus considerably reducing the number of concurrent
Consumers a given node needs to handle at once.

See also
--------

Another piece of the fixes for the `no memory` errors resides in the
behavior of the database caching module upon a full shm. See:
https://github.com/thibaultcha/lua-resty-mlcache/pull/41

This patch reduces the likelihood of a full shm (by a lot!), but does
not remove it. The above patch ensures somewhat sane behavior should
the shm happen to be full again.
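
For illustration, the capacity estimate above can be reproduced with a
script along these lines (a minimal sketch, not the exact script used
for the measurement; the key shape is a hypothetical stand-in for the
plugins' counter keys):

    -- estimate.lua: rough capacity check for a 12MB shm (sketch).
    -- Run with: resty --shdict 'kong_rate_limiting_counters 12m' estimate.lua
    local dict = ngx.shared.kong_rate_limiting_counters

    local count = 0
    while true do
      -- keys shaped like the plugins' counters: identifier, period, period date
      local key = string.format("ratelimit:%d:%s:%d", count, "second", count)
      -- safe_add never evicts other keys; it fails with "no memory" when full
      local ok, err = dict:safe_add(key, 10.5)
      if not ok then
        ngx.say("shm full after ", count, " keys (", err, ")")
        break
      end
      count = count + 1
    end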
Fix #3124
Fix #3241
---
 kong/constants.lua                                   | 1 +
 kong/plugins/rate-limiting/policies/init.lua         | 2 +-
 kong/plugins/response-ratelimiting/policies/init.lua | 2 +-
 kong/templates/nginx_kong.lua                        | 1 +
 spec/fixtures/custom_nginx.template                  | 1 +
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kong/constants.lua b/kong/constants.lua
index d538dca681b7..9947280f9067 100644
--- a/kong/constants.lua
+++ b/kong/constants.lua
@@ -73,5 +73,6 @@ return {
     "kong_process_events",
     "kong_cluster_events",
     "kong_healthchecks",
+    "kong_rate_limiting_counters",
   },
 }
diff --git a/kong/plugins/rate-limiting/policies/init.lua b/kong/plugins/rate-limiting/policies/init.lua
index 86bbd0dd79b2..2e996e6fbfaf 100644
--- a/kong/plugins/rate-limiting/policies/init.lua
+++ b/kong/plugins/rate-limiting/policies/init.lua
@@ -4,7 +4,7 @@ local redis = require "resty.redis"
 local policy_cluster = require "kong.plugins.rate-limiting.policies.cluster"
 local reports = require "kong.core.reports"
 local ngx_log = ngx.log
-local shm = ngx.shared.kong_cache
+local shm = ngx.shared.kong_rate_limiting_counters
 local pairs = pairs
 local fmt = string.format
 
diff --git a/kong/plugins/response-ratelimiting/policies/init.lua b/kong/plugins/response-ratelimiting/policies/init.lua
index d6ed13e71178..a896b1fef71e 100644
--- a/kong/plugins/response-ratelimiting/policies/init.lua
+++ b/kong/plugins/response-ratelimiting/policies/init.lua
@@ -4,7 +4,7 @@ local redis = require "resty.redis"
 local policy_cluster = require "kong.plugins.response-ratelimiting.policies.cluster"
 local reports = require "kong.core.reports"
 local ngx_log = ngx.log
-local shm = ngx.shared.kong_cache
+local shm = ngx.shared.kong_rate_limiting_counters
 local pairs = pairs
 local fmt = string.format
 
diff --git a/kong/templates/nginx_kong.lua b/kong/templates/nginx_kong.lua
index 5639f3195b44..07a3e35b2fe0 100644
--- a/kong/templates/nginx_kong.lua
+++ b/kong/templates/nginx_kong.lua
@@ -33,6 +33,7 @@ lua_shared_dict kong_cache ${{MEM_CACHE_SIZE}};
 lua_shared_dict kong_process_events 5m;
 lua_shared_dict kong_cluster_events 5m;
 lua_shared_dict kong_healthchecks 5m;
+lua_shared_dict kong_rate_limiting_counters 12m;
 > if database == "cassandra" then
 lua_shared_dict kong_cassandra 5m;
 > end
diff --git a/spec/fixtures/custom_nginx.template b/spec/fixtures/custom_nginx.template
index 24425eb8f306..5dd1c554907f 100644
--- a/spec/fixtures/custom_nginx.template
+++ b/spec/fixtures/custom_nginx.template
@@ -46,6 +46,7 @@ http {
     lua_shared_dict kong_process_events 5m;
     lua_shared_dict kong_cluster_events 5m;
     lua_shared_dict kong_healthchecks 5m;
+    lua_shared_dict kong_rate_limiting_counters 12m;
 > if database == "cassandra" then
     lua_shared_dict kong_cassandra 5m;
 > end
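
As a postscript on the `init_ttl` note in the commit message: once
openresty/lua-nginx-module#1226 is released, the counter increments
could set a TTL on first write, along these lines (a hedged sketch, not
part of this patch; `increment` is a hypothetical helper):

    local shm = ngx.shared.kong_rate_limiting_counters

    -- Hypothetical follow-up: incr(key, value, init?, init_ttl?)
    -- initializes a missing key to `init` and applies `init_ttl` on that
    -- first write, so short-lived counters free their slots on their own
    -- instead of relying on LRU eviction.
    local function increment(key, value, period_in_seconds)
      local newval, err = shm:incr(key, value, 0, period_in_seconds)
      if not newval then
        ngx.log(ngx.ERR, "could not increment counter: ", err)
        return nil, err
      end
      return newval
    end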