Skip to content

Commit

Permalink
fix(storage-manager): incorrectly revive key after deletion
Browse files Browse the repository at this point in the history
This commit fixes a bug where a key would be incorrectly "revived" in a
scenario where a deletion is followed by a Wildcard Put.

Assuming we have two Replicas, R1 and R2, the problematic scenario was:

0. Replicas R1 and R2 are disconnected.
1. `z_put "a" = 1 @t1` on Replica R1.
2. `z_delete "a" @t2` on Replica R2.
3. `z_put "**" = 42 @t3` on Replica R2.
4. Connect Replicas R1 and R2.

After aligning, before this fix, Replicas R1 and R2 would have `"a" = 42
@t3` in their storage whereas there should have been no key associated
to `"a"` because of the delete at `@t2`.

This commit fixes this bug.

* plugins/zenoh-plugin-storage-manager/src/replication/core/aligner_reply.rs:
  instead of eagerly applying a Wildcard Update that should not be
  applied, do nothing. The comments were updated to provide more
  context.

Signed-off-by: Julien Loudet <[email protected]>
  • Loading branch information
J-Loudet committed Jan 6, 2025
1 parent f0bdbea commit daf0b56
Showing 1 changed file with 14 additions and 10 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -505,18 +505,22 @@ impl Replication {
return false;
}
}
// The timestamp of the Wildcard Update is greater than that of the Event, we apply the
// Wildcard Update.
(Some((wildcard_ke, wildcard_update)), Some(log_event))
// The timestamp of the Wildcard Update is greater than that of the Event: either the
// Event is a deletion or we have a bug with a Wildcard Update that did not override
// all the events it should have had.
//
// The rationale is the following: a Wildcard Update (be it `Action::WildcardPut` or
// `Action::WildcardDelete`) should override all `Action::Put`. When so doing, it will
// also update the timestamp of the overridden event, putting its own. We would thus
// have `wildcard_update.timestamp() == log_event.timestamp()`.
//
// Only an event with an `Action::Delete` will not be overridden. It is thus the only
// possibility to have both a newer event and a newer Wildcard Update. In that scenario,
// there is nothing to do: the remote Replica is the one that is not up to date.
(Some((_, wildcard_update)), Some(log_event))
if wildcard_update.timestamp() > log_event.timestamp() =>
{
self.store_event_overridden_by_wildcard_update(
replication_log_guard,
replica_event.clone(),
wildcard_ke,
wildcard_update,
)
.await;
debug_assert!(matches!(log_event.action, Action::Delete));
return false;
}
// The timestamp of the Event in the Replication Log is greater or equal to the Wildcard
Expand Down

0 comments on commit daf0b56

Please sign in to comment.