Skip to content

Commit

Permalink
feat: stabilize flat storage for reads (near#8761)
Browse files Browse the repository at this point in the history
The code is literally removing `protocol_feature_flat_state` and moving feature to stable protocol. We also disable `test_state_sync` as this is part of refactor we can do in Q2.

## Feature to stabilize

Here we stabilize Flat Storage for reads, which means that all state reads in the client during block processing will query flat storage instead of Trie. Flat Storage is another index for blockchain state, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2.

This will trigger background creation of flat storage, using 8 threads and finishing in 15h for RPC node and 2d for archival node. After that all non-contract reads will go through flat storage, for which special "chunk views" will be created. When protocol upgrade happens, contracts reads will go through flat storage as well. Also compute costs will change as Option 3 suggests [here](near#8006 (comment)). It is to be merged separately, but we need to ensure that both costs change and flat storage go into next release together.

## Context

Find more details in:
- Overview: https://near.github.io/nearcore/architecture/storage/flat_storage.html
- Approved NEP: https://github.com/near/NEPs/blob/master/neps/nep-0339.md
- Tracking issue: near#7327

## Testing and QA

* Flat storage structs are covered by unit tests;
* Integration tests check that chain behaviour is preserved and costs are changed as expected;
* Flat storage spent ~2 months in betanet with assertion that flat and trie `ValueRef`s are the same;
* We were running testnet/mainnet nodes for ~2 months with the same assertion. We checked that performance is not degraded, see e.g. https://nearinc.grafana.net/d/Vg9SREA4k/flat-storage-test?orgId=1&var-chain_id=mainnet&var-node_id=logunov-mainnet-fs-1&from=1677804289279&to=1678088806154 checking that even with finality lag of 50 blocks performance is not impacted. Small exception is that we updated data layout several times during development, but we checked that results are unchanged.

## Checklist
- [x] Include compute costs after they are merged - near#8924
- [x] https://nayduck.near.org/#/run/2916
- [x] Update CHANGELOG.md to include this protocol feature in the `Unreleased` section.
  • Loading branch information
Longarithm authored Apr 21, 2023
1 parent c9231fa commit e583d43
Show file tree
Hide file tree
Showing 40 changed files with 151 additions and 295 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
* Contract preparation and gas charging for wasm execution also switched to using our own code, as per the finite-wasm specification. Contract execution gas costs will change slightly for expected use cases. This opens up opportunities for further changing the execution gas costs (eg. with different costs per opcode) to lower contract execution cost long-term.
* Compute Costs are implemented and stabilized. Compute usage of the chunk is now limited according to the compute costs. [#8915](https://github.com/near/nearcore/pull/8915), [NEP-455](https://github.com/near/NEPs/blob/master/neps/nep-0455.md).
* Write related storage compute costs are increased which means they fill a chunk sooner but gas costs are unaffected. [#8924](https://github.com/near/nearcore/pull/8924)
* Flat Storage for reads, reducing number of DB accesses for state read from `2 * key.len()` in the worst case to 2. [#8761](https://github.com/near/nearcore/pull/8761), [NEP-399](https://github.com/near/NEPs/pull/399)

### Non-protocol Changes

Expand Down
2 changes: 0 additions & 2 deletions chain/chain/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,6 @@ expensive_tests = []
test_features = []
delay_detector = ["delay-detector/delay_detector"]
no_cache = ["near-store/no_cache"]
protocol_feature_flat_state = ["near-store/protocol_feature_flat_state"]
protocol_feature_reject_blocks_with_outdated_protocol_version = ["near-primitives/protocol_feature_reject_blocks_with_outdated_protocol_version"]

nightly = [
Expand All @@ -58,7 +57,6 @@ nightly_protocol = [
"near-store/nightly_protocol",
"near-primitives/nightly_protocol",
"protocol_feature_reject_blocks_with_outdated_protocol_version",
"protocol_feature_flat_state",
]
mock_node = []
sandbox = ["near-primitives/sandbox"]
118 changes: 57 additions & 61 deletions chain/chain/src/chain.rs
Original file line number Diff line number Diff line change
Expand Up @@ -630,14 +630,12 @@ impl Chain {
// Set the root block of flat state to be the genesis block. Later, when we
// init FlatStorages, we will read the from this column in storage, so it
// must be set here.
if cfg!(feature = "protocol_feature_flat_state") {
let tmp_store_update = runtime_adapter.set_flat_storage_for_genesis(
genesis.hash(),
genesis.header().height(),
genesis.header().epoch_id(),
)?;
store_update.merge(tmp_store_update);
}
let tmp_store_update = runtime_adapter.set_flat_storage_for_genesis(
genesis.hash(),
genesis.header().height(),
genesis.header().epoch_id(),
)?;
store_update.merge(tmp_store_update);

info!(target: "chain", "Init: saved genesis: #{} {} / {:?}", block_head.height, block_head.last_block_hash, state_roots);

Expand Down Expand Up @@ -3197,37 +3195,37 @@ impl Chain {

// We synced shard state on top of _previous_ block for chunk in shard state header and applied state parts to
// flat storage. Now we can set flat head to hash of this block and create flat storage.
// TODO (#7327): ensure that no flat storage work is done for `KeyValueRuntime`.
if cfg!(feature = "protocol_feature_flat_state") {
// If block_hash is equal to default - this means that we're all the way back at genesis.
// So we don't have to add the storage state for shard in such case.
// TODO(8438) - add additional test scenarios for this case.
if *block_hash != CryptoHash::default() {
let block_header = self.get_block_header(block_hash)?;
let epoch_id = block_header.epoch_id();
let shard_uid = self.runtime_adapter.shard_id_to_uid(shard_id, epoch_id)?;
if !matches!(
self.runtime_adapter.get_flat_storage_status(shard_uid),
FlatStorageStatus::Disabled
) {
// Flat storage must not exist at this point because leftover keys corrupt its state.
assert!(self.runtime_adapter.get_flat_storage_for_shard(shard_uid).is_none());
// If block_hash is equal to default - this means that we're all the way back at genesis.
// So we don't have to add the storage state for shard in such case.
// TODO(8438) - add additional test scenarios for this case.
if *block_hash != CryptoHash::default() {
let block_header = self.get_block_header(block_hash)?;
let epoch_id = block_header.epoch_id();
let shard_uid = self.runtime_adapter.shard_id_to_uid(shard_id, epoch_id)?;

// Check if flat storage is disabled, which may be the case when runtime is implemented with
// `KeyValueRuntime`.
if !matches!(
self.runtime_adapter.get_flat_storage_status(shard_uid),
FlatStorageStatus::Disabled
) {
// Flat storage must not exist at this point because leftover keys corrupt its state.
assert!(self.runtime_adapter.get_flat_storage_for_shard(shard_uid).is_none());

let mut store_update = self.runtime_adapter.store().store_update();
store_helper::set_flat_storage_status(
&mut store_update,
shard_uid,
FlatStorageStatus::Ready(FlatStorageReadyStatus {
flat_head: near_store::flat::BlockInfo {
hash: *block_hash,
prev_hash: *block_header.prev_hash(),
height: block_header.height(),
},
}),
);
store_update.commit()?;
self.runtime_adapter.create_flat_storage_for_shard(shard_uid);
}
let mut store_update = self.runtime_adapter.store().store_update();
store_helper::set_flat_storage_status(
&mut store_update,
shard_uid,
FlatStorageStatus::Ready(FlatStorageReadyStatus {
flat_head: near_store::flat::BlockInfo {
hash: *block_hash,
prev_hash: *block_header.prev_hash(),
height: block_header.height(),
},
}),
);
store_update.commit()?;
self.runtime_adapter.create_flat_storage_for_shard(shard_uid);
}
}

Expand Down Expand Up @@ -4942,31 +4940,29 @@ impl<'a> ChainUpdate<'a> {
shard_uid: ShardUId,
trie_changes: &WrappedTrieChanges,
) -> Result<(), Error> {
if cfg!(feature = "protocol_feature_flat_state") {
let delta = FlatStateDelta {
changes: FlatStateChanges::from_state_changes(&trie_changes.state_changes()),
metadata: FlatStateDeltaMetadata {
block: near_store::flat::BlockInfo { hash: block_hash, height, prev_hash },
},
};
let delta = FlatStateDelta {
changes: FlatStateChanges::from_state_changes(&trie_changes.state_changes()),
metadata: FlatStateDeltaMetadata {
block: near_store::flat::BlockInfo { hash: block_hash, height, prev_hash },
},
};

if let Some(chain_flat_storage) =
self.runtime_adapter.get_flat_storage_for_shard(shard_uid)
{
// If flat storage exists, we add a block to it.
let store_update =
chain_flat_storage.add_delta(delta).map_err(|e| StorageError::from(e))?;
self.chain_store_update.merge(store_update);
} else {
let shard_id = shard_uid.shard_id();
// Otherwise, save delta to disk so it will be used for flat storage creation later.
info!(target: "chain", %shard_id, "Add delta for flat storage creation");
let mut store_update = self.chain_store_update.store().store_update();
store_helper::set_delta(&mut store_update, shard_uid, &delta)
.map_err(|e| StorageError::from(e))?;
self.chain_store_update.merge(store_update);
}
if let Some(chain_flat_storage) = self.runtime_adapter.get_flat_storage_for_shard(shard_uid)
{
// If flat storage exists, we add a block to it.
let store_update =
chain_flat_storage.add_delta(delta).map_err(|e| StorageError::from(e))?;
self.chain_store_update.merge(store_update);
} else {
let shard_id = shard_uid.shard_id();
// Otherwise, save delta to disk so it will be used for flat storage creation later.
info!(target: "chain", %shard_id, "Add delta for flat storage creation");
let mut store_update = self.chain_store_update.store().store_update();
store_helper::set_delta(&mut store_update, shard_uid, &delta)
.map_err(|e| StorageError::from(e))?;
self.chain_store_update.merge(store_update);
}

Ok(())
}

Expand Down
7 changes: 2 additions & 5 deletions chain/chain/src/store.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2568,11 +2568,8 @@ impl<'a> ChainStoreUpdate<'a> {
| DBCol::_TransactionRefCount
| DBCol::_TransactionResult
| DBCol::StateChangesForSplitStates
| DBCol::CachedContractCode => {
unreachable!();
}
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatState
| DBCol::CachedContractCode
| DBCol::FlatState
| DBCol::FlatStateChanges
| DBCol::FlatStateDeltaMetadata
| DBCol::FlatStorageStatus => {
Expand Down
2 changes: 0 additions & 2 deletions chain/client/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -68,11 +68,9 @@ delay_detector = [
nightly_protocol = []
nightly = [
"nightly_protocol",
"protocol_feature_flat_state",
"near-chain/nightly",
]
sandbox = [
"near-client-primitives/sandbox",
"near-chain/sandbox",
]
protocol_feature_flat_state = ["near-store/protocol_feature_flat_state", "near-chain/protocol_feature_flat_state"]
2 changes: 1 addition & 1 deletion chain/jsonrpc/jsonrpc-tests/res/genesis_config.json
Original file line number Diff line number Diff line change
Expand Up @@ -69,4 +69,4 @@
],
"use_production_config": false,
"records": []
}
}
2 changes: 0 additions & 2 deletions core/primitives/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -49,13 +49,11 @@ dump_errors_schema = ["near-rpc-error-macro/dump_errors_schema"]
protocol_feature_fix_staking_threshold = []
protocol_feature_fix_contract_loading_cost = []
protocol_feature_reject_blocks_with_outdated_protocol_version = []
protocol_feature_flat_state = []
nightly = [
"nightly_protocol",
"protocol_feature_fix_staking_threshold",
"protocol_feature_fix_contract_loading_cost",
"protocol_feature_reject_blocks_with_outdated_protocol_version",
"protocol_feature_flat_state",
]

nightly_protocol = []
Expand Down
17 changes: 9 additions & 8 deletions core/primitives/src/version.rs
Original file line number Diff line number Diff line change
Expand Up @@ -144,12 +144,17 @@ pub enum ProtocolFeature {
///
/// Meta Transaction NEP-366: https://github.com/near/NEPs/blob/master/neps/nep-0366.md
DelegateAction,

Ed25519Verify,
/// Decouple compute and gas costs of operations to safely limit the compute time it takes to
/// process the chunk.
///
/// Compute Costs NEP-455: https://github.com/near/NEPs/blob/master/neps/nep-0455.md
ComputeCosts,
/// Enable flat storage for reads, reducing number of DB accesses from `2 * key.len()` in
/// the worst case to 2.
///
/// Flat Storage NEP-399: https://github.com/near/NEPs/blob/master/neps/nep-0399.md
FlatStorageReads,

/// In case not all validator seats are occupied our algorithm provide incorrect minimal seat
/// price - it reports as alpha * sum_stake instead of alpha * sum_stake / (1 - alpha), where
Expand All @@ -159,11 +164,8 @@ pub enum ProtocolFeature {
/// Charge for contract loading before it happens.
#[cfg(feature = "protocol_feature_fix_contract_loading_cost")]
FixContractLoadingCost,
Ed25519Verify,
#[cfg(feature = "protocol_feature_reject_blocks_with_outdated_protocol_version")]
RejectBlocksWithOutdatedProtocolVersions,
#[cfg(feature = "protocol_feature_flat_state")]
FlatStorageReads,
}

/// Both, outgoing and incoming tcp connections to peers, will be rejected if `peer's`
Expand Down Expand Up @@ -248,8 +250,9 @@ impl ProtocolFeature {
ProtocolFeature::Ed25519Verify
| ProtocolFeature::ZeroBalanceAccount
| ProtocolFeature::DelegateAction => 59,
ProtocolFeature::ComputeCosts => 61,
ProtocolFeature::NearVm => 61,
ProtocolFeature::ComputeCosts
| ProtocolFeature::NearVm
| ProtocolFeature::FlatStorageReads => 61,

// Nightly features
#[cfg(feature = "protocol_feature_fix_staking_threshold")]
Expand All @@ -258,8 +261,6 @@ impl ProtocolFeature {
ProtocolFeature::FixContractLoadingCost => 129,
#[cfg(feature = "protocol_feature_reject_blocks_with_outdated_protocol_version")]
ProtocolFeature::RejectBlocksWithOutdatedProtocolVersions => 132,
#[cfg(feature = "protocol_feature_flat_state")]
ProtocolFeature::FlatStorageReads => 135,
}
}
}
Expand Down
2 changes: 0 additions & 2 deletions core/store/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -60,11 +60,9 @@ io_trace = []
no_cache = []
single_thread_rocksdb = [] # Deactivate RocksDB IO background threads
test_features = []
protocol_feature_flat_state = ["near-primitives/protocol_feature_flat_state"]
serialize_all_state_changes = []

nightly_protocol = []
nightly = [
"nightly_protocol",
"protocol_feature_flat_state",
]
17 changes: 4 additions & 13 deletions core/store/src/columns.rs
Original file line number Diff line number Diff line change
Expand Up @@ -260,22 +260,18 @@ pub enum DBCol {
/// Flat state contents. Used to get `ValueRef` by trie key faster than doing a trie lookup.
/// - *Rows*: `shard_uid` + trie key (Vec<u8>)
/// - *Column type*: ValueRef
#[cfg(feature = "protocol_feature_flat_state")]
FlatState,
/// Changes for flat state delta. Stores how flat state should be updated for the given shard and block.
/// - *Rows*: `KeyForFlatStateDelta { shard_uid, block_hash }`
/// - *Column type*: `FlatStateChanges`
#[cfg(feature = "protocol_feature_flat_state")]
FlatStateChanges,
/// Metadata for flat state delta.
/// - *Rows*: `KeyForFlatStateDelta { shard_uid, block_hash }`
/// - *Column type*: `FlatStateDeltaMetadata`
#[cfg(feature = "protocol_feature_flat_state")]
FlatStateDeltaMetadata,
/// Flat storage status for the corresponding shard.
/// - *Rows*: `shard_uid`
/// - *Column type*: `FlatStorageStatus`
#[cfg(feature = "protocol_feature_flat_state")]
FlatStorageStatus,
}

Expand Down Expand Up @@ -474,9 +470,8 @@ impl DBCol {
| DBCol::_TransactionRefCount
| DBCol::_TransactionResult
// | DBCol::StateChangesForSplitStates
| DBCol::CachedContractCode => false,
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatState
| DBCol::CachedContractCode
| DBCol::FlatState
| DBCol::FlatStateChanges
| DBCol::FlatStateDeltaMetadata
| DBCol::FlatStorageStatus => false,
Expand Down Expand Up @@ -543,13 +538,9 @@ impl DBCol {
DBCol::HeaderHashesByHeight => &[DBKeyType::BlockHeight],
DBCol::StateChangesForSplitStates => &[DBKeyType::BlockHash, DBKeyType::ShardId],
DBCol::TransactionResultForBlock => &[DBKeyType::OutcomeId, DBKeyType::BlockHash],
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatState => &[DBKeyType::ShardUId, DBKeyType::TrieKey],
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatStateChanges => &[DBKeyType::ShardId, DBKeyType::BlockHash],
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatStateDeltaMetadata => &[DBKeyType::ShardId, DBKeyType::BlockHash],
#[cfg(feature = "protocol_feature_flat_state")]
DBCol::FlatStateChanges => &[DBKeyType::ShardUId, DBKeyType::BlockHash],
DBCol::FlatStateDeltaMetadata => &[DBKeyType::ShardUId, DBKeyType::BlockHash],
DBCol::FlatStorageStatus => &[DBKeyType::ShardUId],
}
}
Expand Down
1 change: 0 additions & 1 deletion core/store/src/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,6 @@ impl StoreConfig {
pub const fn col_cache_size(&self, col: crate::DBCol) -> bytesize::ByteSize {
match col {
crate::DBCol::State => self.col_state_cache_size,
#[cfg(feature = "protocol_feature_flat_state")]
crate::DBCol::FlatState => self.col_state_cache_size,
_ => bytesize::ByteSize::mib(32),
}
Expand Down
9 changes: 1 addition & 8 deletions core/store/src/flat/manager.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,6 @@ impl FlatStorageManager {
}

pub fn test(store: Store, shard_uids: &[ShardUId], flat_head: CryptoHash) -> Self {
if !cfg!(feature = "protocol_feature_flat_state") {
return Self::new(store);
}

let mut flat_storages = HashMap::default();
for shard_uid in shard_uids {
let mut store_update = store.store_update();
Expand Down Expand Up @@ -138,10 +134,7 @@ impl FlatStorageManager {
}
}

// TODO (#7327): change the function signature to Result<FlatStorage, Error> when
// we stabilize feature protocol_feature_flat_state. We use option now to return None when
// the feature is not enabled. Ideally, it should return an error because it is problematic
// if the flat storage state does not exist
// TODO (#7327): consider returning Result<FlatStorage, Error> when we expect flat storage to exist
pub fn get_flat_storage_for_shard(&self, shard_uid: ShardUId) -> Option<FlatStorage> {
let flat_storages = self.0.flat_storages.lock().expect(POISONED_LOCK_ERR);
flat_storages.get(&shard_uid).cloned()
Expand Down
1 change: 0 additions & 1 deletion core/store/src/flat/storage.rs
Original file line number Diff line number Diff line change
Expand Up @@ -326,7 +326,6 @@ impl FlatStorage {
}
}

#[cfg(feature = "protocol_feature_flat_state")]
#[cfg(test)]
mod tests {
use crate::flat::delta::{FlatStateChanges, FlatStateDelta, FlatStateDeltaMetadata};
Expand Down
Loading

0 comments on commit e583d43

Please sign in to comment.