Sentinel #39

mkurkov · 2013-04-18T16:33:45Z

Hi, I have added initial support for Redis Sentinel failover.

I decided to add it to eredis instead of a standalone lib; what do you think about it?
For now Sentinel is an experimental feature, but it should become the main approach to monitoring Redis clusters, and many already use it in production.

Also, I see some issues with the current implementation of eredis:

Shouldn't we check the Socket when processing {tcp, XXX} messages and ignore data from an already closed socket? With Sentinel we can, for example, start reconnecting to a new master in the middle of processing a reply.
When cleaning the queue of waiting clients, eredis doesn't send them any notification. Maybe it would be better to send them an error reply in this case.
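
A minimal sketch of both points, not the actual eredis_client code. The module name, the `state` record, and the assumption that the waiting queue holds plain gen_server `From` tuples are all hypothetical and only serve the illustration:

```erlang
-module(stale_socket_sketch).
-export([new/1, handle_tcp/2, fail_waiting/2]).

%% Hypothetical client state; the real eredis_client state has more fields.
-record(state, {socket, buffer = <<>>}).

new(Socket) ->
    #state{socket = Socket}.

%% Data from the socket we currently own: append it to the buffer.
%% The repeated variable Socket only matches when both values are equal.
handle_tcp({tcp, Socket, Data}, #state{socket = Socket, buffer = Buf} = State) ->
    State#state{buffer = <<Buf/binary, Data/binary>>};
%% Data from any other socket (already closed or replaced): ignore it.
handle_tcp({tcp, _StaleSocket, _Data}, State) ->
    State.

%% Reply {error, Reason} to every waiting caller and return an empty queue.
%% Assumes each queue element is a gen_server From tuple.
fail_waiting(Queue, Reason) ->
    lists:foreach(fun(From) -> gen_server:reply(From, {error, Reason}) end,
                  queue:to_list(Queue)),
    queue:new().
```

With this shape, a late `{tcp, OldSocket, Data}` message left in the mailbox after a reconnect falls through to the second clause and cannot corrupt the parser state of the new connection.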

Thanks.

knutin · 2013-04-18T21:07:15Z

Thanks for the patch! I will need to dig into sentinel a bit to fully understand your patch.

As to the points you raise, you are right. I will at some point make those changes.

knutin · 2014-05-19T07:42:33Z

Hi, sorry for leaving this open forever. As we now know that Sentinel doesn't work very well (http://aphyr.com/posts/283-call-me-maybe-redis) I'll close this pull request.

antirez · 2014-05-19T14:50:53Z

Hello Knut,

I believe you should reconsider your position, and here are my arguments for why.

In the article you linked, Aphyr shows that Redis instances + Redis Sentinel failover is not a consistent system, and that a lot of writes are lost during partitions. This is the main argument you use for not merging Sentinel support. However, there are two important points to examine here.

As Aphyr himself can confirm to you, or any other person with basic distributed systems knowledge, you can't build a consistent system with asynchronous replication. Similarly, you can't build a high-performance system with synchronous replication and a majority quorum. So this is an expected result, but it does not make Sentinel useless. Basically, most of the failover-based systems out there have the same semantics.
However, what Aphyr also showed with his analysis is that the Sentinel implementation, even within the theoretical limits of a failover system composed of master nodes and asynchronous replication with slaves that are elected when the master fails, was not good enough, since there is no reason to diverge forever (even if you can't avoid diverging for some time).

So the current Redis Sentinel is a complete reimplementation with new algorithms compared to what was examined by Aphyr, and the changes are mainly the following two:

Sentinel configuration propagation (basically, which instance is the current master) is now handled with more robust and simpler-to-analyze algorithms that have specific safety and liveness properties. They are clearly documented in the Sentinel documentation.
Redis replication was improved so that it is possible to stop accepting writes when the master is (asynchronously) not able to get acknowledgements from N slaves for M seconds (see the example redis.conf in the Redis distribution for more info about the exact configuration).
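
For illustration, in the Redis 2.8 example redis.conf the second change is exposed through the `min-slaves-to-write` and `min-slaves-max-lag` directives. The values below are arbitrary examples, not recommendations:

```
# Refuse writes unless at least 1 slave is connected
# and its last acknowledgement is at most 10 seconds old.
min-slaves-to-write 1
min-slaves-max-lag 10
```

Here N is 1 and M is 10. Setting either directive to 0 disables the check.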

With these changes, you have clear semantics about how Sentinels propagate changes and agree about performing a failover (TL;DR it is an eventually consistent system where in every given partition the higher configuration version wins, and where a majority is required to start a failover). Moreover, because you can configure Redis instances to stop accepting writes unless specific replication conditions are met, you now have an option to limit the write loss to a given window during a network partition, instead of having it unbounded.

For example, using the right configuration, if the master gets partitioned away with clients in a minority partition while the majority partition promotes a new master, the old master will stop accepting writes in the minority partition after some time.

You can drop Sentinel and implement your own Zookeeper-based failover, and you will still have the same semantics, since the limit is what master-slave + failover + asynchronous replication can give you. But at the same time this is what makes Redis fast and able to support complex data structures.

knutin · 2014-05-20T08:06:38Z

Hi Salvatore,

@antirez Thanks for taking the time to participate in this discussion. I was not aware of the changes to Sentinel.

@mkurkov Do you think it is possible to have the Sentinel support as a separate library? We can make some changes to the reconnect logic of eredis to make it work.

sdebnath · 2015-08-12T17:01:59Z

It's been a year since this patch was introduced. Based on Redis documentation, Sentinel is the official high availability solution for Redis. Sentinel2 is now the current release and is available with both redis 2.8 & 3.x. Would love to see this go in either as a separate library or integrated into eredis. More and more production redis deployments rely on sentinel to provide client connectivity to the current redis master.

mmmries · 2016-02-24T05:37:08Z

@mkurkov I'm very interested in this project. Did you ever spin it up as a separate library that depends on eredis? I'd be happy to help get it kicked off in a separate repo and use something like rebar3 to make it easily usable by both Erlang and Elixir projects.

savonarola · 2017-02-13T10:21:26Z

Hello!

As far as I understand the Sentinel logic, things have changed since the time this PR was made.

The logic of Sentinel is now much simpler from the perspective of the client, since the Redis server just drops connections when it is under Sentinel control and its role changes.

Hence, to have Sentinel support we mainly need some sort of "factory" which we use each time we want to connect to Redis. Such a library can be easily implemented as a standalone project; there is also an example of such a library at https://github.com/miros/eredis_sentinel.
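
A minimal sketch of such a factory, assuming a single reachable Sentinel and using only `eredis:start_link/2`, `eredis:q/2` and `eredis:stop/1`. The module and function names are hypothetical, and this is not how eredis_sentinel is implemented:

```erlang
-module(sentinel_factory_sketch).
-export([connect/3]).

%% Ask one Sentinel for the current master of MasterName (a string such as
%% "mymaster"), then open an ordinary eredis connection to that master.
%% Crashes with badmatch if the Sentinel is unreachable or does not know
%% the master; a real factory would try several Sentinels and handle errors.
connect(SentinelHost, SentinelPort, MasterName) ->
    {ok, S} = eredis:start_link(SentinelHost, SentinelPort),
    %% Sentinel replies with a two-element list: [Host, Port] as binaries.
    {ok, [Host, Port]} =
        eredis:q(S, ["SENTINEL", "get-master-addr-by-name", MasterName]),
    eredis:stop(S),
    eredis:start_link(binary_to_list(Host), binary_to_integer(Port)).
```

Usage would then look like `{ok, C} = sentinel_factory_sketch:connect("127.0.0.1", 26379, "mymaster")`, after which `eredis:q(C, ["GET", "foo"])` works as usual. Since the server drops the connection when its role changes, the caller would invoke the factory again on disconnect rather than reconnect to the old address.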

benbro · 2020-09-11T05:19:14Z

@savonarola can you please explain how to use such a library with eredis?
Normally we use eredis like this:

{ok, C} = eredis:start_link().
{ok, <<"bar">>} = eredis:q(C, ["GET", "foo"]).

How will it work with a factory that is using Sentinel to create the connection?

mkurkov added 7 commits April 16, 2013 18:20

Add eredis_sentinel_client/masters auxilary modules

ffe10bb

Add eredis_sentinel main module with test specs

31e71af

Add sentinel support to eredis_client

9a84815

Add check that sentinel is installed on the system or skip sentinel tests

a19ebd7

Add sentinel chapter to README.md

9afd7c2

Fix remove stalled pids from subscriber list of sentinel master

1f36b98

Clean up sentinel part in README.md

80c2ffd

knutin closed this May 19, 2014

knutin reopened this May 20, 2014

benbro mentioned this pull request Aug 22, 2021

Sentinel support Nordix/eredis#43

Closed
