The semantics of the nuraft::snapshot #509

Open
unixod opened this issue May 20, 2024 · 3 comments

@unixod
Contributor

unixod commented May 20, 2024

Hi,

Is it correct to assume that for each call of read_logical_snp_obj(s, ...) made by NuRaft, the following holds?

  1. The first parameter s contains information about the Raft log entry at which the latest snapshot was made.
  2. The value of the first parameter s is equal to the one passed to the latest call of create_snapshot (i.e. s.get_last_log_idx() is equal to the index of the last log entry included in the latest snapshot).

More generally, I'm wondering what information is passed as the first parameter to these functions:

  • create_snapshot(snapshot&, ...)
  • read_logical_snp_obj(snapshot&, ...)
  • save_logical_snp_obj(snapshot&, ...)
  • apply_snapshot(snapshot&, ...)
@greensky00
Contributor

greensky00 commented May 22, 2024

Hi @unixod

  1. s describes the snapshot that the current NuRaft instance (the leader) is transmitting to its follower. It can be the latest snapshot or an older one.
  2. Same as above: it can be older than the latest snapshot. More precisely, it is the snapshot that was the latest one at the moment the leader began the snapshot transmission. More snapshots may be created after that, but s will remain unchanged unless a new snapshot transmission begins.

In short,

  • create_snapshot: s contains the info of the latest snapshot, which the state machine has to create.
  • read_logical_snp_obj: s contains the snapshot that the current leader is transmitting. s will remain the same for the entire transmission session (which consists of many read_logical_snp_obj calls) until it either succeeds or fails. If it fails, read_logical_snp_obj will be retried with a newer s.
  • save_logical_snp_obj, apply_snapshot: same as read_logical_snp_obj.
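
The behavior described above can be modeled with a small, self-contained C++ sketch. This is not NuRaft code: SnapshotInfo, LeaderModel and all of their member functions are hypothetical names invented for illustration. The sketch only assumes what the answer states, namely that the snapshot is fixed when a transmission session begins and does not change until that session ends.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the metadata carried by nuraft::snapshot.
struct SnapshotInfo {
    std::uint64_t last_log_idx;
};

// Hypothetical model of the leader side (not a NuRaft class). It tracks the
// latest snapshot and the one pinned for the session in progress.
class LeaderModel {
public:
    // Models create_snapshot: the argument describes the latest snapshot.
    void create_snapshot(std::uint64_t last_log_idx) {
        latest_.last_log_idx = last_log_idx;
        has_latest_ = true;
    }

    // Begins a transmission session and pins whatever is latest right now.
    // Returns false if no snapshot exists yet.
    bool begin_transmission() {
        if (!has_latest_) return false;
        in_flight_ = latest_;
        transmitting_ = true;
        return true;
    }

    // Models the `s` handed to every read call within one session.
    // Precondition: a session is in progress.
    const SnapshotInfo& snapshot_for_read() const {
        assert(transmitting_);
        return in_flight_;
    }

    // Ends the session (success or failure). A later session, such as a
    // retry, pins whatever is latest at that time.
    void end_transmission() { transmitting_ = false; }

private:
    SnapshotInfo latest_{0};
    SnapshotInfo in_flight_{0};
    bool has_latest_ = false;
    bool transmitting_ = false;
};
```

In this model, calling create_snapshot(200) after a session has pinned the snapshot at index 100 leaves snapshot_for_read() at 100. Only after end_transmission() and a new begin_transmission() does it report 200.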

@grigoryan-sergey

grigoryan-sergey commented May 29, 2024

Hi @greensky00

Thanks, the answer clarified a lot of things. But I still have several questions.

  1. Looking at the calculator example, I observed that the state machine stores only the last 3 snapshots. How was this number determined? Is it safe to store any number of snapshots, or is there a limit? Will everything work correctly if we store just one snapshot on each node?
  2. In the read_logical_snp_obj function there is an if block. In which concrete cases can that block be entered?
  3. Also, why do we return 0 in that block if it is considered a snapshot read failure (I mean, why don't we return a negative value)?
  4. Where does NuRaft take that first snapshot argument from? Am I correct that it gets it by calling last_snapshot on the leader node?
  5. Because the first snapshot argument is a non-const reference in all of the functions above, I suppose the user can modify it. Am I correct, and if so, in which cases can the first argument be modified?

@greensky00
Contributor

Hi @grigoryan-sergey

  1. You can store any arbitrary number of snapshots if your state machine can support it. Keeping only one (the last one) snapshot should also be fine. But please note that, as I mentioned above, the snapshot currently being transferred should be kept until the end of the transmission.
  2. read_logical_snp_obj should never be entered when no snapshot exists. That code is there to get over such a situation gracefully without corrupting the system.
  3. I realized the example is wrong; it should return a negative number instead of 0. I will fix it; thanks for bringing it up.
  4. That's correct.
  5. There is no case where the state machine has to modify the given snapshot instance. read_logical_snp_obj just inherited the legacy read_snapshot_data from Cornerstone, which is why it is non-const. But basically you should regard it as read-only.
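
Points 1, 3 and 5 can be sketched together in a minimal, self-contained C++ example. This is not NuRaft code: SnapshotStore and its member functions are hypothetical, and how a real state machine learns that a transmission has started or ended is outside the sketch. It assumes only what the answers state: any retention count is fine as long as the snapshot being transferred is kept, a failed read should return a negative number, and the snapshot should be treated as read-only. Returning 0 on success is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// Hypothetical state-machine-side snapshot store (not part of NuRaft).
class SnapshotStore {
public:
    explicit SnapshotStore(std::size_t max_keep) : max_keep_(max_keep) {}

    // Stores a snapshot keyed by its last log index, then evicts old ones.
    void add(std::uint64_t last_log_idx, const std::string& data) {
        snapshots_[last_log_idx] = data;
        evict();
    }

    // Marks the snapshot being transmitted so that eviction skips it.
    void pin(std::uint64_t last_log_idx) {
        pinned_ = last_log_idx;
        has_pin_ = true;
    }

    // Releases the pin once the transmission ends, then evicts leftovers.
    void unpin() {
        has_pin_ = false;
        evict();
    }

    // Read-only lookup: returns 0 on success (assumed convention) and a
    // negative value if the requested snapshot does not exist.
    int read(std::uint64_t last_log_idx, std::string& data_out) const {
        auto it = snapshots_.find(last_log_idx);
        if (it == snapshots_.end()) return -1;
        data_out = it->second;
        return 0;
    }

    std::size_t size() const { return snapshots_.size(); }

private:
    // Drops the oldest unpinned snapshots while over the limit. The newest
    // snapshot is never evicted, even if max_keep_ is 0.
    void evict() {
        auto it = snapshots_.begin();
        while (snapshots_.size() > max_keep_ && it != snapshots_.end() &&
               std::next(it) != snapshots_.end()) {
            if (has_pin_ && it->first == pinned_) {
                ++it;  // still being transmitted: keep it
                continue;
            }
            it = snapshots_.erase(it);
        }
    }

    std::map<std::uint64_t, std::string> snapshots_;
    std::size_t max_keep_;
    std::uint64_t pinned_ = 0;
    bool has_pin_ = false;
};
```

With max_keep set to 1, a snapshot that is pinned when a newer one arrives survives until unpin() is called. After that, only the newest snapshot remains and a read of the old index returns a negative value.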
