docs
kjnilsson committed Nov 7, 2024
1 parent daf0beb commit 77da4df
Showing 2 changed files with 28 additions and 59 deletions.
85 changes: 27 additions & 58 deletions docs/internals/LOG_V2.md
@@ -21,14 +21,19 @@ sequenceDiagram
In the Ra log v2 implementation some work previously done by the `ra_log_wal`
process has now been either factored out or moved elsewhere.

In v1 the WAL process would be responsible for for both writing a disk and
to mem tables. Each writer (designated by a locally scoped binary "UId") would
have a unique ETS table to cover the lifetime of each WAL file. Once the WAL has
filled and the segment writer process has flushed the mem tables to segmetn files
on disk the whole table would have been deleted.
In Ra log v1 the WAL process would be responsible for both writing to disk and
to memtables (ETS). Each writer (identified by a locally scoped binary "UId") would
have a unique ETS table to cover the lifetime of each WAL file. Once the WAL breaches
its configured `max_wal_size_bytes` limit it closes the file and hands it over to
the segment writer to flush any still live entries to per-server segments.
The segment writer reads each entry from the memtables, not the WAL file.
When all still live entries in the WAL have been flushed to segments the segment
writer deletes the WAL file and notifies all relevant ra servers of the new
segments. Once each ra server receives this notification and updates its
"seg-refs" it deletes the whole memtable.

In the v2 implementation the WAL no longer writes to memtables during normal
operation (exception being the recovery phase). Instead the mem tables are
operation (exception being the recovery phase). Instead the memtables are
written to by the Ra servers before the write request is sent to the WAL.
This removes the need for a separate ETS table per Ra server "cache" which was
required in the v1 implementation.
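
The v2 write ordering described above can be sketched as follows. This is a minimal illustration, not the `ra_log` implementation: the module name, the message shapes and the stand-in WAL process are all assumptions made for the example. It only shows that the server inserts into its own memtable before the write request is sent to the WAL.

```erlang
-module(write_path_sketch).
-export([append/3, demo/0]).

%% Hypothetical names throughout; this is not the ra_log API.
%% Step 1: the server process inserts the entry into its own memtable.
%% Step 2: only then is the write request sent to the WAL process.
append(Mt, Wal, {Idx, Term, Cmd} = Entry) ->
    true = ets:insert_new(Mt, Entry),
    Wal ! {append, self(), Idx, Term, Cmd},
    ok.

demo() ->
    Mt = ets:new(mt, [set, public]),
    %% A stand-in WAL (assumption) that acknowledges a single write.
    Wal = spawn(fun() ->
                        receive
                            {append, From, Idx, Term, _Cmd} ->
                                From ! {written, Term, {Idx, Idx}}
                        end
                end),
    ok = append(Mt, Wal, {1, 1, hello}),
    %% The entry is readable from the memtable without waiting for the WAL.
    [{1, 1, hello}] = ets:lookup(Mt, 1),
    receive
        {written, 1, {1, 1}} -> ok
    after 1000 ->
        timeout
    end.
```

Because the entry is already in the server's memtable, no extra per-server cache table is needed to serve reads while the WAL write is in flight.
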
@@ -42,34 +47,34 @@ and have some written but uncommitted entries that another leader in a higher
term has overwritten.


## Memory tables (mem-tables)
## In-Memory Tables (memtables)

Module: `ra_mt`

Mem tables are owned and created by the `ra_log_ets` process. Ra servers call
into the process to create new mem-tables and a registry of current tables is
into the process to create new memtables and a registry of current tables is
kept in the `ra_log_open_memtables` table. From v2 the `ra_log_closed_memtables`
ETS table is no longer used or created.

Entries can be written or deleted but never overwritten.
Invariant: Entries can be written or deleted but never overwritten.

During normal operation each Ra server only writes to a single ETS mem-table.
Entries that are no longer required to be kept in the mem table due to snapshotting
During normal operation each Ra server only writes to a single ETS memtable.
Entries that are no longer required to be kept in the memtable due to snapshotting
or having been written to disk segments are deleted. The actual delete operation
is performed by `ra_log_ets` on request by Ra servers.

Mem tables are thus no longer linked to the lifetime of a given WAL file as before.
Apart from recover after a system restart only the Ra servers write to
mem-tables thus lightening the workload of the WAL process.
Memtables are no longer linked to the lifetime of a given WAL file as before.
Apart from recovery after a system restart only the Ra servers write to
memtables which reduces the workload of the WAL process.

New mem-tables are only created when a server needs to overwrite indexes in it's
New memtables are only created when a server needs to overwrite indexes in its
log. This typically only happens when a leader has been replaced and steps down
to follower with uncommitted entries in its log. Due to the async nature of the
Ra log implementation it is not safe to ever overwrite an entry in a mem-table
(as concurrent reads may be done by the segment writer process).Therefore a new
mem-table needs to be created when this situation occurs.
Ra log implementation it is not safe to ever overwrite an entry in a memtable
(as concurrent reads may be done by the segment writer process). Therefore a new
memtable needs to be created when this situation occurs.

When a new mem-table is created the old ones will not be written to any further
When a new memtable is created the old ones will not be written to any further
and will be deleted as soon as they are emptied.
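
The "never overwrite" invariant and the creation of a new memtable on overwrite can be illustrated with a small sketch. It assumes a simplified state (a map holding the current table and a list of retired tables) and hypothetical function names; it is not the `ra_mt` data structure. `ets:insert_new/2` is used because it refuses to replace an existing key.

```erlang
-module(memtable_sketch).
-export([new/0, insert/2, demo/0]).

%% Hypothetical state: the current table plus older tables that are no
%% longer written to. Not the ra_mt data structure.
new() ->
    #{current => ets:new(mt, [set, public]), old => []}.

insert(#{current := Tid, old := Old} = Mt, {_Idx, _Term, _Cmd} = Entry) ->
    case ets:insert_new(Tid, Entry) of
        true ->
            Mt;
        false ->
            %% The index is already present: never overwrite in place.
            %% Start a new table and retire the current one.
            New = ets:new(mt, [set, public]),
            true = ets:insert_new(New, Entry),
            Mt#{current := New, old := [Tid | Old]}
    end.

demo() ->
    Mt0 = new(),
    Mt1 = insert(Mt0, {1, 1, a}),
    Mt2 = insert(Mt1, {2, 1, b}),
    %% A leader in term 2 overwrites index 2.
    #{current := Cur, old := [Prev]} = insert(Mt2, {2, 2, c}),
    %% A concurrent reader of the old table still sees the old entry.
    [{2, 1, b}] = ets:lookup(Prev, 2),
    [{2, 2, c}] = ets:lookup(Cur, 2),
    ok.
```

In this sketch the retired table is simply kept in a list; deleting it once it has been emptied is left out.
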

## WAL
@@ -78,11 +83,11 @@ Module: `ra_log_wal`

The `ra_log_wal` process now has the following responsibilities:

* Write entries to disk and notify the writer processes when their entries
* Write entries to disk and notifies the writer processes when their entries
have been synced to the underlying storage.
* Track the ranges written by each writer (ra server) for which ETS table and
notify the segment writer when a WAL file has filled up.
* Recover mem-tables from WAL files after a system restart.
notifies the segment writer when a WAL file has filled up.
* Recover memtables from WAL files after a system restart.
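
The range tracking responsibility in the list above can be sketched as a map from writer UId to the range of indexes written in the current WAL file. This is a simplification under stated assumptions: the module and function names are hypothetical, and the sketch ignores which ETS table each range belongs to.

```erlang
-module(wal_ranges_sketch).
-export([record_write/3, demo/0]).

%% Ranges :: #{UId => {From, To}} (hypothetical shape).
%% Extend the writer's range to cover Idx, or start a new range.
record_write(UId, Idx, Ranges) ->
    maps:update_with(UId,
                     fun({From, To}) -> {min(From, Idx), max(To, Idx)} end,
                     {Idx, Idx},
                     Ranges).

demo() ->
    R1 = record_write(<<"uid1">>, 55, #{}),
    R2 = record_write(<<"uid1">>, 56, R1),
    R3 = record_write(<<"uid2">>, 98, R2),
    #{<<"uid1">> := {55, 56}, <<"uid2">> := {98, 98}} = R3,
    ok.
```

When the WAL file fills up, a map like this is the kind of information that would be handed to the segment writer so it knows which ranges to flush for each writer.
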

## Segment Writer

@@ -102,39 +107,3 @@ redundant entries to disk.
The latest snapshot index for each Ra server is kept in the `ra_log_snapshot_state`
ETS table.


## Diagrams


```mermaid
sequenceDiagram
participant ra-server-n
participant wal
participant segment-writer
loop until wal full
ra-server-n->>+wal: write(Index=1..N, Term=T)
wal->>wal: write-batch([1]
wal->>-ra-server-n: written event: Term=T, Range=(1, N)
end
wal->>+segment-writer: flush-wal-ranges
segment-writer-->segment-writer: flush to segment files
segment-writer->>ra-server: notify flushed segments
ra-server-n-->ra-server-n: update mem-table-ranges
ra-server-n->>ets-server: delete range from mem-table
```

```mermaid
flowchart TD
WAL
RaServer1
RaServer2
RaServer1 -- Write (55, 57) --> WAL
RaServer2 -- Write (99, 98) --> WAL
WAL -- Written (55, 57) --> RaServer1
WAL -- Written (99, 98) --> RaServer2
WAL -- (RaServer1, [(55, 57)].. --> SegmentWriter
SegmentWriter -- New segments --> RaServer1
SegmentWriter -- New segments --> RaServer2
```

2 changes: 1 addition & 1 deletion src/ra.erl
@@ -743,7 +743,7 @@ new_uid(Source) when is_binary(Source) ->

%% @doc Returns a map of overview data of the default Ra system on the current Erlang
%% node.
%% DEPRECATED: user overview/1
%% DEPRECATED: use overview/1
%% @end
-spec overview() -> map() | system_not_started.
overview() ->