Description: Upgrading from redis-py-cluster to the latest redis-py causes a large, imbalanced (roughly 10x) increase in connection count on a single node: shard 1, node 1. The RedisCluster client is created as RedisCluster(host=aws_configuration_endpoint), using a configuration endpoint that redirects to a "random" redis node. The connection count problem happens with redis engine 6.2.6 but not with 5.0.6.
Suspected Root Cause of elevated connection count: We know the initial redis cluster command is issued to an effectively random redis node because of how the configuration endpoint works, so that first call cannot explain the imbalance. This suggests that additional cluster commands are issued during RedisCluster client initialization (or that multiple connections are somehow being opened) to the first node returned in the cluster slots list, which for engine 6.2+ is node 0001. Assuming additional redis calls are needed for client initialization, ideally we would either reuse the node we made the initial cluster slots call against or select a random one.
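To make the proposed behavior concrete, here is a minimal sketch of the selection policy described above: reuse the node that answered the initial cluster slots call if it appears in the reply, otherwise pick a random node. The function name `pick_default_node` and its parameters are hypothetical and are not part of redis-py; this illustrates the idea only, not how the library is implemented.

```python
import random


def pick_default_node(slot_nodes, startup_node=None, rng=random):
    """Hypothetical policy for choosing a default node (not redis-py code).

    slot_nodes:   nodes in the order they appear in the cluster slots reply
    startup_node: the node that answered the initial cluster slots call
    rng:          any object with a choice() method
    """
    if not slot_nodes:
        raise ValueError("cluster slots returned no nodes")
    # Prefer the node we already have a connection to.
    if startup_node in slot_nodes:
        return startup_node
    # Otherwise spread the load instead of always taking slot_nodes[0].
    return rng.choice(slot_nodes)


nodes = ["node-0001", "node-0002", "node-0003"]
print(pick_default_node(nodes, startup_node="node-0002"))
print(pick_default_node(nodes))
```

Either branch avoids the fixed preference for the first entry in the list, which is the bias suspected here.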
Observations:
- Elevated connections only on node 0001 (primary nodes on other shards are normal)
- redis-py get_default_node() behavior
  - redis engine 6.2.6: always returns the same node, 0001
  - redis engine 5.0.6: returns a seemingly random node
- redis-cli cluster slots ordering of the list of slots and nodes
  - redis engine 5.0.6 uses "random" ordering
    - More specifically, the ordering is stable for a given node (calling the same node multiple times returns the same list in the same order), but each node has its own seemingly random ordering. So as long as the cluster slots command is issued to a random node each time (which is the case when using the configuration endpoint), the response is effectively a randomly ordered list
  - redis engine 6.2.6 uses sorted ordering
    - Regardless of which node is called, the first slot in cluster slots is always slot 0, and as a result the first node is always node 0001
- redis-py always sets the default node to the first node returned by cluster slots (code link)
  - This is easy to update; however, calling replace_default_node() after the client has been initialized does not fix the connection count issue. This probably means the root cause lies in initialization, where the client issues additional commands to the default node (unsure whether that happens before or after self.default_node is set)
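The interaction between reply ordering and the first-node rule can be illustrated with a toy simulation that needs no cluster. Everything here is an assumption made for illustration: the node names, the `cluster_slots` and `default_node` helpers, and in particular the 5.0.6 model, where each node's "own ordering" is approximated by a rotation of the list starting at that node. Real servers are only observed to return a stable, seemingly random order per node.

```python
import random

NODES = ["0001", "0002", "0003"]  # hypothetical primaries, one per shard


def cluster_slots(engine, answering_node):
    """Toy model of the node order in a cluster slots reply."""
    if engine == "6.2.6":
        # Sorted by slot: every node returns the same list, 0001 first.
        return list(NODES)
    # 5.0.6 stand-in: stable per node, different between nodes.
    start = NODES.index(answering_node)
    return NODES[start:] + NODES[:start]


def default_node(engine, rng):
    """Model a client that takes the first node of the reply as default."""
    # The configuration endpoint hands us an effectively random node.
    answering_node = rng.choice(NODES)
    return cluster_slots(engine, answering_node)[0]


rng = random.Random(0)
print({default_node("6.2.6", rng) for _ in range(200)})
print({default_node("5.0.6", rng) for _ in range(200)})
```

Under this model the 6.2.6 set contains only node 0001, while the 5.0.6 set depends on which node happened to answer each time.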
Sample Code: This does not reproduce the high connection count, but it does show the cluster slots sorting behavior and the client bias for the first node in the list.
from redis import RedisCluster

r5_nodes = []
r6_nodes = []
redis_5_0_host = ""  # fill in with configuration endpoint for cluster running redis 5.0.6
redis_6_2_host = ""  # fill in with configuration endpoint for cluster running redis 6.2.6

# Create fresh clients repeatedly and record which node each one picks as its default
for i in range(20):
    r5_client = RedisCluster(host=redis_5_0_host, port=6379)
    r6_client = RedisCluster(host=redis_6_2_host, port=6379)
    r5_nodes.append(r5_client.get_default_node().host)
    r6_nodes.append(r6_client.get_default_node().host)

print(set(r5_nodes))  # Prints most/all of the primary nodes in the cluster
print(set(r6_nodes))  # Prints only one node
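To see how strong the bias is, rather than only which nodes appear, the same loop can be written as a small counting helper. This is a sketch: `tally_default_nodes` is a hypothetical helper, and the `FakeNode` and `FakeClient` classes exist only so the block runs without a cluster. The one thing it relies on from a real client is `get_default_node().host`, as used in the sample above.

```python
from collections import Counter


def tally_default_nodes(make_client, attempts=20):
    """Count how often each host is chosen as the default node.

    make_client is any zero-argument callable returning an object with a
    get_default_node() method whose result has a .host attribute.
    """
    counts = Counter()
    for _ in range(attempts):
        client = make_client()
        counts[client.get_default_node().host] += 1
    return counts


class FakeNode:
    """Stand-in for a cluster node, for demonstration only."""

    def __init__(self, host):
        self.host = host


class FakeClient:
    """Stand-in for a cluster client, for demonstration only."""

    def __init__(self, host):
        self._node = FakeNode(host)

    def get_default_node(self):
        return self._node


print(tally_default_nodes(lambda: FakeClient("node-0001"), attempts=5))
```

Against a real cluster, the factory would be something like `lambda: RedisCluster(host=redis_6_2_host, port=6379)`. If the observations above hold, a 6.2.6 cluster should put every count on a single host.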
Version: redis-py 5.2.1 - redis engine 6.2.6
Platform: Python 3.9 on Debian 12 / AWS