Resharding V3 - add a few state sync details #573

Open
wants to merge 2 commits into
base: resharding

marcelo-gonzalez

No description provided.

neps/nep-0568.md Outdated
Comment on lines 140 to 141
When nodes sync state (either because they've fallen far behind the chain, or because they're going to become a chunk producer for a new shard in a future epoch), they first identify a point in the chain they'd like to sync to. This is always the first block of the current epoch, which the node should be aware of once it has synced headers to the current point in the chain. The hash of this first block is referred to as the "sync_hash" in many places in the state sync implementation. Then the node makes a request (currently to centralized storage on GCS, but in the future to other nodes in the network) for a `ShardStateSyncResponseHeader` corresponding to that "sync_hash" and
the Shard ID of the shard it's interested in. Among other things, this header includes the last new chunk before "sync_hash" in the shard, and a `StateRootNode` with hash equal to that chunk's `prev_state_root` field. Then the node downloads (again from GCS, but in the future it'll be from other nodes) the nodes of the trie with that `StateRootNode` as its root. Afterwards, it applies new chunks in the shard until it's caught up.
Contributor

This section may be a bit too technical for the specification. Perhaps you can move the implementation details to the "reference implementation" section below and here keep it at a higher level?

Author

yea good point, done

Contributor

@wacban wacban left a comment

LGTM

After merging into the main PR, please check whether there are any lint errors and fix them. For some reason those only show up on the main PR.


The state sync algorithm defines a `sync_hash` that is used in many parts of the implementation. This is always the first block of the current epoch, which the node should be aware of once it has synced headers to the current point in the chain. A node performing state sync first makes a request (currently to centralized storage on GCS, but in the future to other nodes in the network) for a `ShardStateSyncResponseHeader` corresponding to that `sync_hash` and the Shard ID of the shard it's interested in. Among other things, this header includes the last new chunk before `sync_hash` in the shard, and a `StateRootNode` with hash equal to that chunk's `prev_state_root` field. Then the node downloads (again from GCS, but in the future it'll be from other nodes) the nodes of the trie with that `StateRootNode` as its root. Afterwards, it applies new chunks in the shard until it's caught up.
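
The download step described in this paragraph can be sketched roughly as follows. This is a minimal illustration, not the nearcore implementation: the struct definitions are heavily simplified stand-ins for the real types, and `StateSource`, `InMemorySource`, `fetch_header`, `fetch_trie_nodes`, and `download_state` are hypothetical names introduced only for this example. Applying new chunks after the download is left out.

```rust
// Hypothetical, simplified stand-ins; the real structures carry many more fields.
type CryptoHash = [u8; 32];

#[derive(Clone, Debug, PartialEq)]
struct ChunkHeader {
    prev_state_root: CryptoHash,
}

#[derive(Clone, Debug, PartialEq)]
struct StateRootNode {
    hash: CryptoHash,
}

#[derive(Clone, Debug, PartialEq)]
struct ShardStateSyncResponseHeader {
    // The last new chunk in the shard before `sync_hash`.
    chunk: ChunkHeader,
    state_root_node: StateRootNode,
}

// Assumed abstraction over where state comes from (centralized storage
// today, other nodes in the future).
trait StateSource {
    fn fetch_header(
        &self,
        sync_hash: CryptoHash,
        shard_id: u64,
    ) -> Option<ShardStateSyncResponseHeader>;
    fn fetch_trie_nodes(&self, root: CryptoHash) -> Vec<Vec<u8>>;
}

// Toy source that serves one fixed header and one fixed set of trie nodes.
struct InMemorySource {
    header: Option<ShardStateSyncResponseHeader>,
    nodes: Vec<Vec<u8>>,
}

impl StateSource for InMemorySource {
    fn fetch_header(
        &self,
        _sync_hash: CryptoHash,
        _shard_id: u64,
    ) -> Option<ShardStateSyncResponseHeader> {
        self.header.clone()
    }

    fn fetch_trie_nodes(&self, _root: CryptoHash) -> Vec<Vec<u8>> {
        self.nodes.clone()
    }
}

// Fetch the header for (`sync_hash`, `shard_id`), check that the
// `StateRootNode` hash equals the chunk's `prev_state_root`, then download
// the trie nodes under that root.
fn download_state(
    source: &dyn StateSource,
    sync_hash: CryptoHash,
    shard_id: u64,
) -> Result<(CryptoHash, Vec<Vec<u8>>), String> {
    let header = source
        .fetch_header(sync_hash, shard_id)
        .ok_or_else(|| "no header available for this sync_hash".to_string())?;
    if header.state_root_node.hash != header.chunk.prev_state_root {
        return Err("StateRootNode hash does not match prev_state_root".to_string());
    }
    let root = header.state_root_node.hash;
    Ok((root, source.fetch_trie_nodes(root)))
}
```

Keeping the source behind a trait is just one way to reflect the note that the requests go to GCS today and to other nodes later; the real code may structure this differently.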

As described above, the state we download is the state in the shard after applying the second to last new chunk before `sync_hash`, which belongs to the previous epoch (since `sync_hash` is the first block of the new epoch). To move the point in the chain of the initial state download to the current epoch, we could either move the `sync_hash` forward or we could change the state sync protocol (perhaps changing the meaning of the `sync_hash` and the fields of the `ShardStateSyncResponseHeader`, or somehow changing these structures more significantly). The former is an easier first implementation, since it would not require any changes to the state sync protocol other than to the expected `sync_hash`. We would just need to move the `sync_hash` to a point far enough along in the chain so that the `StateRootNode` in the `ShardStateSyncResponseHeader` refers to state in the current epoch.
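
One way to read "far enough along" is sketched below. It assumes that "before `sync_hash`" means strictly earlier blocks, so a block only qualifies once two new chunks for the shard have already appeared in the current epoch (the downloaded state is the state after the second-to-last of them). `BlockInfo` and `first_candidate_sync_height` are hypothetical names for this illustration, and the rule itself is an interpretation of the paragraph above rather than the agreed design.

```rust
// Hypothetical, simplified view of a block in the current epoch, seen from
// the shard we want to sync.
struct BlockInfo {
    height: u64,
    // Whether this block includes a new chunk for the shard.
    has_new_chunk: bool,
}

// Given the blocks of the current epoch in chain order, return the height of
// the earliest block that has at least two new chunks for the shard strictly
// before it, or None if no such block exists yet.
fn first_candidate_sync_height(epoch_blocks: &[BlockInfo]) -> Option<u64> {
    let mut new_chunks_seen = 0;
    for block in epoch_blocks {
        if new_chunks_seen >= 2 {
            return Some(block.height);
        }
        if block.has_new_chunk {
            new_chunks_seen += 1;
        }
    }
    None
}
```

With missing chunks the candidate simply moves later in the epoch, and a node would have to wait when the function returns `None`.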
Contributor

Can you clarify which one we want to do?
