diff --git a/neps/assets/nep-0568/NEP-HybridMemTrie.png b/neps/assets/nep-0568/NEP-HybridMemTrie.png
new file mode 100644
index 000000000..c18836f81
Binary files /dev/null and b/neps/assets/nep-0568/NEP-HybridMemTrie.png differ
diff --git a/neps/nep-0568.md b/neps/nep-0568.md
index ebe02d9eb..9018649dd 100644
--- a/neps/nep-0568.md
+++ b/neps/nep-0568.md
@@ -193,6 +193,24 @@ The section should return to the examples given in the previous section, and exp
 ```
 
 ### State Storage - MemTrie
+The current implementation of MemTrie uses a pool of memory (`STArena`) to allocate and deallocate nodes, and uses internal pointers in this pool to reference child nodes. MemTries, unlike the State representation of Trie, do not reference nodes by hash but use internal memory pointers directly. Additionally, MemTries are not thread-safe, and one MemTrie exists per shard.
+
+As described in the [MemTrie](#state-storage---memtrie) section above, we need an efficient way to split the MemTrie into two child MemTries within a span of 1 block. What makes this challenging is that the current implementation of MemTrie is not thread-safe and cannot be shared across two shards.
+
+The naive way to create two MemTries for the child shards would be to iterate through all the entries of the parent MemTrie and fill these values into the child MemTries. This, however, is prohibitively time consuming.
+
+The solution to this problem was to introduce the concept of a Frozen MemTrie (with a `FrozenArena`), which is a cloneable, read-only, thread-safe snapshot of a MemTrie. We can call the `freeze` method on an existing MemTrie to convert it into a Frozen MemTrie. Note that this process consumes the original MemTrie, so we can no longer allocate or deallocate nodes in it.
+
+Along with `FrozenArena`, we also introduce a `HybridArena`, which is effectively a `FrozenArena` base with a top layer of `STArena`, where we support allocating and deallocating new nodes in the MemTrie. Newly allocated nodes can reference/point to nodes in the `FrozenArena`. We use this Hybrid MemTrie as a temporary MemTrie while the flat storage is being constructed in the background.
+
+During a resharding event, at the boundary of the epoch, when we need to split the parent shard into the two child shards, we do the following steps:
+1. Freeze the parent MemTrie arena to create a read-only frozen arena that represents a snapshot of the state as of the time of freezing, i.e. after postprocessing the last block of the epoch. Note that we no longer require the parent MemTrie at runtime going forward.
+2. We cheaply clone the Frozen MemTrie for both the child MemTries to use. Note that this doesn't clone the parent arena memory, but just increases the refcount.
+3. We then create a new MemTrie with HybridArena for each of the children. The base of the MemTrie is the read-only FrozenArena, while all new node allocations happen on a dedicated STArena memory pool for each child MemTrie. This is the temporary MemTrie that we use while Flat Storage is being built in the background.
+4. Once the Flat Storage is constructed in the postprocessing step of resharding, we use that to load a new MemTrie and discard the Hybrid MemTrie.
+
+![Hybrid MemTrie diagram](assets/nep-0568/NEP-HybridMemTrie.png)
+
 ### State Storage - State mapping
 
 To enable efficient shard state management during resharding, Resharding V3 uses the `DBCol::ShardUIdMapping` column.
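
Reviewer note: the freeze, clone, and hybrid-arena steps in the MemTrie hunk above may be easier to follow with a toy model. The sketch below is only an illustration under loud assumptions: `StArena`, `FrozenArena`, `HybridArena`, `Node`, `split`, and every method on them are hypothetical stand-ins named after the NEP's terms, not the actual nearcore types or API. Nodes live in a `Vec` and are addressed by index, whereas the real arenas manage raw memory.

```rust
use std::sync::Arc;

/// Toy trie node: a value plus the ids of its children (hypothetical layout).
struct Node {
    value: u64,
    children: Vec<usize>,
}

/// Mutable, single-owner pool; a stand-in for `STArena`.
#[derive(Default)]
struct StArena {
    nodes: Vec<Node>,
}

impl StArena {
    /// Allocates a node and returns its id within this pool.
    fn alloc(&mut self, value: u64, children: Vec<usize>) -> usize {
        self.nodes.push(Node { value, children });
        self.nodes.len() - 1
    }

    /// Takes `self` by value, so the original arena cannot be used afterwards.
    fn freeze(self) -> FrozenArena {
        FrozenArena { nodes: Arc::new(self.nodes) }
    }
}

/// Read-only snapshot; cloning only bumps the `Arc` refcount.
#[derive(Clone)]
struct FrozenArena {
    nodes: Arc<Vec<Node>>,
}

/// Frozen base plus a private mutable top layer for new nodes.
struct HybridArena {
    base: FrozenArena,
    top: StArena,
}

impl HybridArena {
    fn new(base: FrozenArena) -> Self {
        HybridArena { base, top: StArena::default() }
    }

    /// New nodes get ids past the end of the base, so they may point into it.
    fn alloc(&mut self, value: u64, children: Vec<usize>) -> usize {
        self.base.nodes.len() + self.top.alloc(value, children)
    }

    /// Resolves an id against the frozen base first, then the top layer.
    fn get(&self, id: usize) -> &Node {
        let base_len = self.base.nodes.len();
        if id < base_len {
            &self.base.nodes[id]
        } else {
            &self.top.nodes[id - base_len]
        }
    }

    /// Number of owners currently sharing the frozen base.
    fn base_refcount(&self) -> usize {
        Arc::strong_count(&self.base.nodes)
    }
}

/// Steps 1 to 3: freeze the parent, share the snapshot, build two hybrids.
fn split(parent: StArena) -> (HybridArena, HybridArena) {
    let frozen = parent.freeze();
    (HybridArena::new(frozen.clone()), HybridArena::new(frozen))
}

fn main() {
    let mut parent = StArena::default();
    let a = parent.alloc(1, vec![]);
    let b = parent.alloc(2, vec![]);

    let (mut left, mut right) = split(parent);
    // Each child writes only to its own top layer.
    let l = left.alloc(10, vec![a]);
    let r = right.alloc(20, vec![b]);

    println!("base owners: {}", left.base_refcount());
    println!("left child of new node: {}", left.get(left.get(l).children[0]).value);
    println!("right new node: {}", right.get(r).value);
}
```

In this model the parent's node storage is never copied: both children hold the same `Arc`, and each child's writes land only in its own top layer. Step 4 of the NEP (loading a fresh MemTrie from Flat Storage and discarding the hybrid) is not modeled here.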