-
Notifications
You must be signed in to change notification settings - Fork 25
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Calculate and compare CRC when writing and reading ledger snapshots (#…
…1319) Fixes #892 Integration into `cardano-node`: IntersectMBO/cardano-node#6047. This uses the branch [geo2a/issue-892-checksum-snaphot-file-release-ouroboros-consensus-0.21.0.0-backport](https://github.com/IntersectMBO/ouroboros-consensus/tree/geo2a/issue-892-checksum-snaphot-file-release-ouroboros-consensus-0.21.0.0-backport) which is the backport of this PR onto the most resent release of the `ouroboros-consensus` package. In this PR, we change the reading and writing disk snapshots of ledger state. When a snapshot is taken and written to disk, an additional file with the `.checksum` extension is written alongside it. The checksum file contains a string that represent the CRC32 checksum of the snapshot. The checksum is calculated incrementally, alongside writing the snapshot to disk. When a snapshot is read from dist, the checksum is again calculated and compared to the tracked one. If the checksum is different, `readSnaphot` returns the `ReadSnapshotDataCorruption` error, indicating data corruption. The checksum is calculated incrementally, alongside reading a writing the data. On write, we use the [`hPutAllCRC`](https://input-output-hk.github.io/fs-sim/fs-api/src/System.FS.CRC.html#hPutAllCRC) function from `fs-sim`, and on read we modify the [readIncremental](https://github.com/IntersectMBO/ouroboros-consensus/blob/892-checksum-snaphot-file/ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Util/CBOR.hs#L191) function to compute the checksum as data is read. To enable seamless integration into `cardano-node`, we make the check optional. When initialising the ledger state from a snapshot in `initLedgerDB`, we issue a warning in case the checksum file is missing for a snapshot, but do not fail as in case of invalid snapshots. The `db-analyser` tool ignores the checksum files by default when reading the snapshots. We add `--disk-snapshot-checksum` flag to enabled the check. When writing a snapshot to disk, e.g. as a result of the `--store-ledger` analysis, `db-analyser` will always write calculate the checksum and write it into the snapshot's `.checksum` file. **Tests** There state machine test in `Test.Ouroboros.Storage.LedgerDB.OnDisk` is relevant to this feature, and has caught a number of silly mistakes in the process of its implementation, for example forgetting to delete a checksum file when the snapshot is deleted. The model in the test does not track checksums, and I do not think it can (or should) be augmented to do that. Howerver, the `Snap` and `Restore` events are now parameterised by the checksum flag, and the values for the flag are randomised when generating these events. This leads to testing the following properties: - this feature is backwards-compatible, i.e. the `Restore` events will always lead to restoring from a snapshot, even if `Snap` events do not write checksum files (i.e. their flag is `NoDoDiskSnapshotChecksum` ~= `False`). - If the interpretation of the `DoDiskSnapshotChecksum` flag changes in the code base and becomes strict, i.e. hard fail if the checksum file is missing, this test will discover that. **Effects on Performance**: Running `db-analyser` to read a ledger snapshot and store the snapshot of the state at the next slot shows a difference of 2 seconds on my machine. See a comment below for the logs. To precisely evaluate the effects, we need a micro-benchmark of the reading and writing of snapshots with and without the checksum calculation.
- Loading branch information
Showing
19 changed files
with
324 additions
and
104 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
14 changes: 14 additions & 0 deletions
14
...sensus/changelog.d/20241128_084625_georgy.lukyanov_892_checksum_snaphot_file.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
### Breaking | ||
|
||
- When writing a ledger state snapshot to disk, calculate the state's CRC32 checksum and write it to a separate file, which is named the same as the snapshot file, plus the `.checksum` extension. | ||
- When reading a snapshot file in `readSnapshot`, calculate its checksum and compare it to the value in the corresponding `.checksum` file. Return an error if the checksum is different or invalid. Issue a warning if the checksum file does not exist, but still initialise the ledger DB. | ||
- To support the previous item, change the error type of the `readSnapshot` from `ReadIncrementalErr` to the extended `ReadSnaphotErr`. | ||
- Checksumming the snapshots is controlled via the `doChecksum :: Flag "DoDiskSnapshotChecksum"` parameter of `initFromSnapshot`. Ultimately, this parameter comes from the Node's configuration file via the `DiskPolicy` data type. | ||
- Extend the `DiskPolicyArgs` data type to enable the node to pass `Flag "DoDiskSnapshotChecksum"` to Consensus. | ||
|
||
### Non-breaking | ||
|
||
- Make `Ouroboros.Consensus.Util.CBOR.readIncremental` optionally compute the checksum of the data as it is read. | ||
- Introduce an explicit `Ord` instance for `DiskSnapshot` that compares the values on `dsNumber`. | ||
- Introduce a new utility newtype `Flag` to represent type-safe boolean flags. See ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Util.hs. | ||
- Use `Flag "DoDiskSnapshotChecksum"` to control the check of the snapshot checksum file in `takeSnapshot`, `readSnapshot` and `writeSnapshot`. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.