From 77da4df5c11817509f832ea2d89456b0b7a7c277 Mon Sep 17 00:00:00 2001 From: Karl Nilsson Date: Thu, 7 Nov 2024 08:29:18 +0000 Subject: [PATCH] docs --- docs/internals/LOG_V2.md | 85 +++++++++++++--------------------------- src/ra.erl | 2 +- 2 files changed, 28 insertions(+), 59 deletions(-) diff --git a/docs/internals/LOG_V2.md b/docs/internals/LOG_V2.md index 9d78f94f..20bc34ca 100644 --- a/docs/internals/LOG_V2.md +++ b/docs/internals/LOG_V2.md @@ -21,14 +21,19 @@ sequenceDiagram In the Ra log v2 implementation some work previously done by the `ra_log_wal` process has now been either factored out or moved elsewhere. -In v1 the WAL process would be responsible for for both writing a disk and -to mem tables. Each writer (designated by a locally scoped binary "UId") would -have a unique ETS table to cover the lifetime of each WAL file. Once the WAL has -filled and the segment writer process has flushed the mem tables to segmetn files -on disk the whole table would have been deleted. +In Ra log v1 the WAL process would be responsible for both writing to disk and +to memtables (ETS). Each writer (identified by a locally scoped binary "UId") would +have a unique ETS table to cover the lifetime of each WAL file. Once the WAL breaches +its configured `max_wal_size_bytes` limit it closes the file and hands it over to +the segment writer to flush any still live entries to per-server segments. +The segment writer reads each entry from the memtables, not the WAL file. +When all still live entries in the WAL have been flushed to segments the segment +writer deletes the WAL file and notifies all relevant Ra servers of the new +segments. Once each Ra server receives this notification and updates its +"seg-refs" it deletes the whole memtable. In the v2 implementation the WAL no longer writes to memtables during normal -operation (exception being the recovery phase). Instead the mem tables are +operation (exception being the recovery phase). 
Instead the memtables are written to by the Ra servers before the write request is sent to the WAL. The removes the need for a separate ETS table per Ra server "cache" which was required in the v1 implementation. @@ -42,34 +47,34 @@ and have some written but uncommitted entries that another leader in a higher term has overwritten. -## Memory tables (mem-tables) +## In-Memory Tables (memtables) Module: `ra_mt` Mem tables are owned and created by the `ra_log_ets` process. Ra servers call -into the process to create new mem-tables and a registry of current tables is +into the process to create new memtables and a registry of current tables is kept in the `ra_log_open_memtables` table. From v2 the `ra_log_closed_memtables` ETS table is no longer used or created. -Entries can be written or deleted but never overwritten. +Invariant: Entries can be written or deleted but never overwritten. -During normal operation each Ra server only writes to a single ETS mem-table. -Entries that are no longer required to be kept in the mem table due to snapshotting +During normal operation each Ra server only writes to a single ETS memtable. +Entries that are no longer required to be kept in the memtable due to snapshotting or having been written to disk segments are deleted. The actual delete operation is performed by `ra_log_ets` on request by Ra servers. -Mem tables are thus no longer linked to the lifetime of a given WAL file as before. -Apart from recover after a system restart only the Ra servers write to -mem-tables thus lightening the workload of the WAL process. +Memtables are no longer linked to the lifetime of a given WAL file as before. +Apart from recovery after a system restart only the Ra servers write to +memtables which reduces the workload of the WAL process. -New mem-tables are only created when a server needs to overwrite indexes in it's +New memtables are only created when a server needs to overwrite indexes in its log. 
This typically only happens when a leader has been replaced and steps down to follower with uncommitted entries in it's log. Due to the async nature of the -Ra log implementation it is not safe to ever overwrite an entry in a mem-table -(as concurrent reads may be done by the segment writer process).Therefore a new -mem-table needs to be created when this situation occurs. +Ra log implementation it is not safe to ever overwrite an entry in a memtable +(as concurrent reads may be done by the segment writer process). Therefore a new +memtable needs to be created when this situation occurs. -When a new mem-table is created the old ones will not be written to any further +When a new memtable is created the old ones will not be written to any further and will be deleted as soon as they are emptied. ## WAL @@ -78,11 +83,11 @@ Module: `ra_log_wal` The `ra_log_wal` process now has the following responsibilities: -* Write entries to disk and notify the writer processes when their entries +* Write entries to disk and notifies the writer processes when their entries have been synced to the underlying storage. * Track the ranges written by each writer (ra server) for which ETS table and -notify the segment writer when a WAL file has filled up. -* Recover mem-tables from WAL files after a system restart. +notifies the segment writer when a WAL file has filled up. +* Recover memtables from WAL files after a system restart. ## Segment Writer @@ -102,39 +107,3 @@ redundant entries to disk. The latest snapshot index for each Ra server is kept in the `ra_log_snapshot_state` ETS table. 
- -## Diagrams - - -```mermaid -sequenceDiagram - participant ra-server-n - participant wal - participant segment-writer - - loop until wal full - ra-server-n->>+wal: write(Index=1..N, Term=T) - wal->>wal: write-batch([1] - wal->>-ra-server-n: written event: Term=T, Range=(1, N) - end - wal->>+segment-writer: flush-wal-ranges - segment-writer-->segment-writer: flush to segment files - segment-writer->>ra-server: notify flushed segments - ra-server-n-->ra-server-n: update mem-table-ranges - ra-server-n->>ets-server: delete range from mem-table -``` - -```mermaid -flowchart TD - WAL - RaServer1 - RaServer2 - RaServer1 -- Write (55, 57) --> WAL - RaServer2 -- Write (99, 98) --> WAL - WAL -- Written (55, 57) --> RaServer1 - WAL -- Written (99, 98) --> RaServer2 - WAL -- (RaServer1, [(55, 57)].. --> SegmentWriter - SegmentWriter -- New segments --> RaServer1 - SegmentWriter -- New segments --> RaServer2 -``` - diff --git a/src/ra.erl b/src/ra.erl index 7ec85673..2fe35e5a 100644 --- a/src/ra.erl +++ b/src/ra.erl @@ -743,7 +743,7 @@ new_uid(Source) when is_binary(Source) -> %% @doc Returns a map of overview data of the default Ra system on the current Erlang %% node. -%% DEPRECATED: user overview/1 +%% DEPRECATED: use overview/1 %% @end -spec overview() -> map() | system_not_started. overview() ->