Related issues from the cartridge project: Raft failover, vshard router master=auto.
Why do you need this duplication then? You can subscribe for the … It will be changed only when one changes the config on the node, isn't it? And you propose to check if it is …
It was decided not to patch vshard's … Instead, it is going to be extracted into an external module … The good side is that it will probably improve code re-use between vshard and net.replicaset. Bad sides are: …
The related issue is #313. The discussion starts with a description of how the task looks in my understanding. Then I provide my vision of the API and behaviour, some insights into the internals, frequently asked questions, and alternatives.
## Problems with how it works now
The router needs to find out which replica in each replicaset is the master. It used to be done manually via the router's config, but that is too fragile: in case of any config update issue the router would be stuck with wrong information about who the master is. The least of the problems would be that it couldn't execute write requests on that replicaset.
Then the `master = 'auto'` feature was introduced. The router became able to discover the master automatically. It solves the problem on a high level, but it is polling-based: discovery happens with a certain period by calling a function on the storages. If a change happens between the discovery calls, the router won't see it. To mitigate the problem the router resorts to implicit master discovery hacks. For example, an error about a replica being read-only also contains whom that replica considers the master. The router then might be able to switch to it and retry the original user's request transparently.
Besides, the polling-based solution is simply complicated code-wise. The discovery requests have to be sent from a fiber, because netbox doesn't support async requests with callbacks. And even if netbox supported them, it would be expensive, because such a request would have to be a long-polling request sleeping on the storage while waiting for changes. It would be -1 to `box.cfg.net_msg_max` and +1 fiber doing nothing on every replica, for each router in the cluster.

## How it should work
Since the introduction of the box watchers feature (appears in >= 2.10.0-beta2), Tarantool is able to provide subscription endpoints. An arbitrary key-value pair can be set on a Tarantool instance using the `box.broadcast(key, value)` API. Local users and remote connections can subscribe on the `key` and get updates every time the `value` changes. These APIs are `netbox_conn:watch()` and `box.watch()`. Tarantool will provide some built-in events in the future, but currently that is not done even in master.
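To make the machinery concrete, here is a minimal sketch of these APIs; the key name `my_key` and the URI are placeholders:

```lua
-- Publisher side (requires Tarantool >= 2.10.0-beta2): set a key-value
-- pair visible to all subscribers.
box.broadcast('my_key', {answer = 42})

-- Local subscriber: the callback fires once with the current value
-- (nil if the key was never broadcast) and then on every change.
local handle = box.watch('my_key', function(key, value)
    print(key, value and value.answer)
end)

-- Remote subscriber over netbox works the same way.
local netbox = require('net.box')
local conn = netbox.connect('localhost:3301')
local remote = conn:watch('my_key', function(key, value)
    -- Invoked on subscription and on every broadcast of 'my_key'.
end)

-- Watchers can be cancelled when no longer needed.
handle:unregister()
remote:unregister()
```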
The idea is that vshard should define its own events. They would contain vshard-specific info based on the planned built-in events. The first proposed one is `vshard.storage.election`.

Its fields `term`, `role`, `leader`, and `is_ro` are related to automatic leader election. It is not supported in vshard for now, but it will be in the future, so it is better to expose these fields right away. They repeat the future planned built-in event called `box.election`.

The field `is_master_cfg` is vshard-specific. It is not the same as 'leader', or even as being actually writable. It is just what is specified in the storage's config in its `[replica_uuid] = {master = <bool>, ...}` entry. The router will use this field to detect who the master is when the field is present. With automatic elections the idea is that `is_master_cfg` will just be `nil`.
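A hypothetical sketch of how the storage side could broadcast such an event; the helper name and the exact value layout are assumptions based on the fields listed above, not the final format:

```lua
-- Hypothetical: publish the proposed event from the storage.
local function broadcast_election_state(cfg_is_master)
    box.broadcast('vshard.storage.election', {
        term = box.info.election.term,
        role = box.info.election.state,   -- 'leader' / 'follower' / 'candidate'
        leader = box.info.election.leader,
        is_ro = box.info.ro,
        -- Taken from the [replica_uuid] = {master = <bool>} config entry;
        -- would be nil when elections are automatic.
        is_master_cfg = cfg_is_master,
    })
end
```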
## API and behaviour

### Storage

`vshard.storage.election` is a public storage event available for subscriptions. It will be documented and will be used by the routers. Other connectors are free to use it as well, although some fields might be documented as 'risky' to rely on. For example, `is_master_cfg` might disappear in some version far in the future or become optional.

The event works for versions >= 2.10.0-beta2. It is fired every time any of the event's fields changes.
Before the vshard storage is configured, the event returns `nil`, as documented for `box.broadcast`/`box.watch`. Connectors just need to be ready for that and treat it as if the event has never been fired yet, or as if the storage is simply of an older version and does not have this event at all.

When a subscription is attempted on an instance < 2.10.0-beta2, it fails with an unknown request type error. Netbox is able to handle it. Other connectors should adapt too if they want subscriptions.
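For illustration, this is how a subscriber might treat the `nil` case; the URI is a placeholder and the fallback is only sketched:

```lua
local netbox = require('net.box')

local conn = netbox.connect('storage:3301')
conn:watch('vshard.storage.election', function(key, state)
    if state == nil then
        -- Either vshard.storage.cfg() has not been called yet, or the
        -- storage is too old to define the event. Behave as if the
        -- event does not exist, e.g. keep using polling.
        return
    end
    -- From here on `state` carries the election fields described above.
end)
```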
### Router
The router can't drop polling-based master discovery, because it supports Tarantool versions >= 1.10.1 and vshard storage versions <= 0.1.19. The polling is not going anywhere in the foreseeable future. It means the router will need to support both polling and events. However, it might not be as complicated as it sounds.
Consider the connections to one replicaset. They have `on_connect` triggers installed by the router. In the trigger it is possible to check the `conn.peer_protocol_features.watchers` field; by the time the trigger is called, the protocol features are already revealed.

For each connection, in `on_connect` check whether `peer_protocol_features.watchers` is true, and if so, subscribe on `vshard.storage.election`. If any replica does not support watchers, and until non-nil data is received for this event on every connection, the entire replicaset uses polling for master discovery. Having the events work on only a subset of nodes wouldn't simplify anything.

Now assume the feature is true on all connections and that all replicas have received something for `vshard.storage.election`. Then the last replica switches the replicaset to event-based master discovery. Polling won't be used for it.

In case of a disconnect, the replica is considered to be in the same state until a reconnect happens, except that it probably will stop being considered the master.

When a new event arrives and the replica becomes the master or stops being the master, the router changes the `replicaset.master` field right from the event callback. The same procedure happens for each replicaset, as the sketch below shows.
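A hedged sketch of that per-connection logic; `replica`, `replicaset`, and the trigger name are illustrative stand-ins, not actual vshard internals:

```lua
-- Hypothetical router-side trigger; `replica` and `replicaset` are
-- illustrative tables, not real vshard structures.
local function on_storage_connect(conn, replica, replicaset)
    -- Protocol features are already known when on_connect fires.
    if not conn.peer_protocol_features.watchers then
        -- At least one replica can't send events: keep polling.
        replicaset.is_polling = true
        return
    end
    conn:watch('vshard.storage.election', function(key, state)
        if state == nil then
            -- Storage is not configured yet or is too old.
            return
        end
        replica.election_state = state
        -- Update the master right from the event callback.
        if state.is_master_cfg then
            replicaset.master = replica
        elseif replicaset.master == replica then
            replicaset.master = nil
        end
        -- Once all replicas have reported non-nil state, the router
        -- could switch this replicaset to event-based discovery.
    end)
end
```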
To sum up: the user still just sets `master = 'auto'` in the router's config, like before. The router will choose between polling and events internally.
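For reference, a minimal router config sketch with automatic master discovery; the UUIDs, URIs, and names are placeholders:

```lua
local vshard = require('vshard')

-- Placeholder UUIDs and URIs; only `master = 'auto'` matters here.
vshard.router.cfg({
    bucket_count = 3000,
    sharding = {
        ['cbf06940-0790-498b-948d-042b62cf3d29'] = {
            master = 'auto',
            replicas = {
                ['8a274925-a26d-47fc-9e1b-af88ce939412'] = {
                    uri = 'storage@127.0.0.1:3301',
                    name = 'storage_1_a',
                },
                ['3de2e3e1-9ebe-4d0d-abb1-26d301b84633'] = {
                    uri = 'storage@127.0.0.1:3302',
                    name = 'storage_1_b',
                },
            },
        },
    },
})
```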
## FAQ
Why is the event called `vshard.storage.election` instead of `vshard.election`?

I think the router in the future might want to expose its own events too. One feature of events is that they don't need `box.cfg` to be called, which means the router could potentially have its own. Given that a router and a storage can be hosted in the same process, it seems logical to split their event namespaces: `vshard.storage.*` and `vshard.router.*`.

## Alternatives
### Netbox async call with a callback
There was an idea (it still exists) to make netbox able to take a callback along with the `is_async` option and call it when the request is finished. It would allow sending long-polling requests to the storages, which would wait for changes and do `return` when something happens. On the router the result would be processed in the callback.

There are problems with that when trying to use it for subscriptions:

- each waiting request takes -1 from `box.cfg.net_msg_max` on the storage;
- each waiting request occupies +1 fiber doing nothing on the storage, for every router in the cluster, on every replica.

Subscriptions don't have any of these problems.
### Split vshard versions like Tarantool does

Develop vshard in 2 branches: the first for Tarantool < 2.10.0-beta2, the second for >= that version. The first one wouldn't have event support in its code at all; this is the same as what is in master now. The second one wouldn't have polling at all; everything would be event-based.

This might simplify the code. However, it is a last-resort measure in case of serious complications with making polling and events cooperate. It might simplify each version, but in total it means twice as much code to support, more packages to produce, and more CI work.