BEP-283: Segmented History Data Maintenance #283
Open: Mercybudda wants to merge 4 commits into bnb-chain:master from Mercybudda:bep283_history_maintain
Mercybudda force-pushed the bep283_history_maintain branch from 2ab7d45 to 3802eab (September 11, 2023 09:07)
Compare the changes and pull request |
Nouri11190 approved these changes (Oct 6, 2023)
Closed
Aces42020 approved these changes (May 19, 2024)
kiko1842 approved these changes (May 27, 2024)
kiko1842 approved these changes (May 27, 2024)
octavio12345300 approved these changes (Oct 24, 2024)
octavio12345300 approved these changes (Oct 26, 2024)
Magkoooh approved these changes (Nov 25, 2024)
TheManager73 approved these changes (Jan 1, 2025)
octavio12345300 approved these changes (Jan 3, 2025)
octavio12345300 approved these changes (Jan 3, 2025)
octavio12345300 approved these changes (Jan 3, 2025)
ok
BEP-283: Segmented History Block Maintenance
1. Summary
This BEP proposes a practical solution to the growth of history data storage on BSC (BNB Smart Chain) full nodes, so that they only need to maintain a limited range of blocks. With a new synchronization protocol and a new storage protocol, nodes can synchronize from a specific checkpoint and skip historical blocks. This removes the mandatory requirement to keep historical blocks.
2. Motivation
The history block data includes block headers, bodies and receipts; none of it is needed to execute the latest blocks. A storage profile taken on Jul-02-2023 shows that the history block data has reached ~1288 GB, which makes running a BSC full node a heavy burden. As BSC keeps generating new blocks, the history block data keeps growing.
In practice, the history block data can be maintained by archive nodes or by a DA layer; a full node does not need to keep a full copy of it. However, simply deleting the history block data is not acceptable either: it would throw the P2P network into chaos and make syncing hard.
We need a rule so that full nodes only keep a recent period of blocks, such as several months. Ideally, the bounded history block data would stay within 200GB. This could reduce both a node's storage pressure and the network traffic cost.
3. Specification
3.1 General Workflow
The history block data will be divided into segments. The first segment, segment_0, runs from genesis to BoundStartBlock-1; every following segment has the same length, HistorySegmentLength.
A BSC node only needs to maintain the latest 2 segments. To stay safe against block reorgs, the current segment number is calculated from the “finalized” block.
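The segment layout described above can be sketched as a mapping from block number to segment index. This is only an illustration: the function names and the example values used for BoundStartBlock and HistorySegmentLength are assumptions, not values fixed by this BEP.

```python
def segment_index(block_number: int, bound_start_block: int,
                  history_segment_length: int) -> int:
    """Return the index of the segment that contains block_number."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    if block_number < bound_start_block:
        # segment_0 covers genesis .. BoundStartBlock-1
        return 0
    # Later segments all have the same length.
    return 1 + (block_number - bound_start_block) // history_segment_length


def segment_range(index: int, bound_start_block: int,
                  history_segment_length: int) -> tuple:
    """Return the inclusive (first_block, last_block) range of a segment."""
    if index < 0:
        raise ValueError("segment index must be non-negative")
    if index == 0:
        return (0, bound_start_block - 1)
    first = bound_start_block + (index - 1) * history_segment_length
    return (first, first + history_segment_length - 1)


# Hypothetical parameters, chosen only to keep the numbers readable.
BOUND_START_BLOCK = 1_000
HISTORY_SEGMENT_LENGTH = 100

finalized_block = 1_250
current = segment_index(finalized_block, BOUND_START_BLOCK, HISTORY_SEGMENT_LENGTH)
print(current, segment_range(current, BOUND_START_BLOCK, HISTORY_SEGMENT_LENGTH))
```

With these illustrative parameters, a finalized block of 1250 falls in segment 3, which spans blocks 1200 to 1299.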
3.2 Prune
Offline block pruning must be aligned to segment boundaries and must leave the most recent 2 segments. The most recent segment is determined by the finalized block: since FastFinality is to be enabled, we can use it to determine the finalized block and then derive the current segment index.
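A minimal sketch of the boundary-aligned prune point follows. It assumes that "the most recent 2 segments" means the segment containing the finalized block plus the one before it; the function name and the example parameter values are hypothetical.

```python
def first_block_to_keep(finalized_block: int, bound_start_block: int,
                        history_segment_length: int,
                        keep_segments: int = 2) -> int:
    """Return the lowest block number that must survive a prune.

    Everything below the returned block may be pruned. The result is
    always 0 or the first block of a segment, so pruning stays
    boundary aligned.
    """
    # Current segment index, derived from the finalized block.
    if finalized_block < bound_start_block:
        current_segment = 0
    else:
        current_segment = 1 + (finalized_block - bound_start_block) // history_segment_length

    # Oldest segment that is still retained.
    oldest_kept = max(current_segment - (keep_segments - 1), 0)
    if oldest_kept == 0:
        return 0  # segment_0 is still retained, nothing to prune
    return bound_start_block + (oldest_kept - 1) * history_segment_length


# Hypothetical parameters: BoundStartBlock = 1000, HistorySegmentLength = 100.
print(first_block_to_keep(1_250, 1_000, 100))  # segments 2 and 3 are kept
```

In this example the finalized block 1250 lies in segment 3, so segments 2 and 3 are kept and blocks below 1100 become prunable.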
3.2.1 Prune details
With segments defined, the historical block data will be organized segment by segment within the file directory, which makes it easy to prune block data by deleting the corresponding folder.
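Folder-based pruning could look roughly like the sketch below. The directory naming scheme (`segment_<N>` under a data root) is purely hypothetical, since this BEP does not specify the on-disk layout; the example runs against a temporary directory rather than real node data.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Hypothetical naming scheme: one folder per segment, e.g. "segment_3".
SEGMENT_DIR = re.compile(r"^segment_(\d+)$")


def prune_segment_dirs(root: Path, current_segment: int, keep: int = 2) -> list:
    """Delete segment folders older than the `keep` most recent segments.

    Returns the sorted list of segment indices that were removed.
    """
    lowest_kept = current_segment - keep + 1
    removed = []
    # sorted() materializes the listing before anything is deleted.
    for entry in sorted(root.iterdir()):
        match = SEGMENT_DIR.match(entry.name)
        if entry.is_dir() and match and int(match.group(1)) < lowest_kept:
            shutil.rmtree(entry)
            removed.append(int(match.group(1)))
    return sorted(removed)


# Demonstration on a throwaway directory with segments 0..3.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for i in range(4):
        (root / f"segment_{i}").mkdir()
    removed = prune_segment_dirs(root, current_segment=3)
    remaining = sorted(p.name for p in root.iterdir())

print(removed, remaining)
```

Because each segment lives in its own folder in this sketch, pruning never has to touch the data of the segments that remain.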
3.3 Node Sync
Syncing from genesis would become difficult, since most nodes may choose not to preserve these old blocks, but it remains possible as long as some nodes keep the whole history data.
There are 2 possible approaches to sync after this proposal:
- Directly download a snapshot of a recent state from a snapshot service provider or a DA layer like Greenfield.
- Segment-based snap sync: the user can take the boundary blockhash as the new GenesisHash and start from it directly.
3.4 Data Availability
Some nodes, such as archive nodes, would keep maintaining the whole history data.
In addition, a DA layer such as Greenfield could be used to make sure the whole block data remains available.
4. Rationale
4.1 BoundStartBlock & HistorySegmentLength
As the current history block data is already very large, we prefer to enable this proposal soon:
BoundStartBlock: it is still more than 1 month ahead, which could be an acceptable date.
HistorySegmentLength: we profiled the past 6 months (Jan-2023 to July-2023), during which ~1.2GB of history data was generated per day on average. Traffic was relatively low during this period, though. Volume could triple if a bull market starts, that is ~3.6GB per day. To keep the historical data size within 500GB in a bull market and 200GB in a bear market, 90 days could be a reasonable value.
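The sizing argument can be checked with a short calculation. This sketch assumes that a node holds between one and two full 90-day segments at any time (the current segment is only partly filled), and it uses the daily volumes quoted above; the function and constant names are illustrative.

```python
SEGMENT_DAYS = 90        # proposed segment length, in days
KEEP_SEGMENTS = 2        # segments a full node retains
BEAR_GB_PER_DAY = 1.2    # profiled average, Jan-2023 to July-2023
BULL_GB_PER_DAY = 3.6    # assumed 3x traffic in a bull market


def retained_window_gb(gb_per_day: float) -> tuple:
    """Return (min, average, max) retained history size in GB."""
    min_days = (KEEP_SEGMENTS - 1) * SEGMENT_DAYS  # current segment just started
    max_days = KEEP_SEGMENTS * SEGMENT_DAYS        # current segment almost full
    avg_days = (min_days + max_days) / 2
    return tuple(round(gb_per_day * d, 1) for d in (min_days, avg_days, max_days))


bear = retained_window_gb(BEAR_GB_PER_DAY)
bull = retained_window_gb(BULL_GB_PER_DAY)
print("bear market (min, avg, max) GB:", bear)  # (108.0, 162.0, 216.0)
print("bull market (min, avg, max) GB:", bull)  # (324.0, 486.0, 648.0)
```

Under these assumptions the average retained size stays below the 200GB and 500GB targets, while the worst case, just before a segment boundary, is roughly 216GB in a bear market and 648GB in a bull market.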
4.2 Why Segment The History Data
There are no big fundamental differences from the EIP-4444 proposal.
Pros of aligned segments:
- easier to determine the current living blocks?
- easier to provide the snapshot download service?
5. Forward Compatibility
5.1 Portal Network
The Portal Network is a hot topic for relieving storage pressure. Once it is ready, it may replace this proposal if it offers a more applicable solution.
6. Backward Compatibility
6.1 Archive Node
If you run an archive node, you can simply keep all the history block data; there is no impact on its business.
6.2 RPC API
If users query history block data that has been pruned, the node could return a new error code to indicate that the data has expired and been removed.
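One possible shape for such a response is sketched below as a JSON-RPC 2.0 error object. The error code value, the message text, and the `lowestAvailableBlock` field are all hypothetical; this BEP does not define them. The code is merely picked from the -32000 to -32099 range that JSON-RPC 2.0 reserves for implementation-defined server errors.

```python
import json

# Hypothetical error code for pruned history data (not defined by this BEP).
HISTORY_PRUNED_CODE = -32001


def get_block_response(request_id: int, block_number: int,
                       lowest_available_block: int) -> dict:
    """Build a JSON-RPC 2.0 response for a block query on a pruned node."""
    if block_number < lowest_available_block:
        return {
            "jsonrpc": "2.0",
            "id": request_id,
            "error": {
                "code": HISTORY_PRUNED_CODE,
                "message": f"block {block_number} has expired and been pruned",
                # Hypothetical hint telling the client where history begins.
                "data": {"lowestAvailableBlock": lowest_available_block},
            },
        }
    # Placeholder result; a real node would return the full block here.
    return {"jsonrpc": "2.0", "id": request_id,
            "result": {"number": hex(block_number)}}


pruned = get_block_response(1, block_number=500, lowest_available_block=1_100)
available = get_block_response(2, block_number=1_200, lowest_available_block=1_100)
print(json.dumps(pruned, indent=2))
```

A dedicated code lets clients distinguish "this node has pruned the block" from "this block does not exist" and fall back to an archive node or a DA layer.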
License
The content is licensed under the CC0 license.