Fixes for pre-init recovery and handling of gaps in mem tables due to snapshotting #456

kjnilsson · 2024-07-18T12:37:09Z

The first thing a Ra system does when it starts is run a pre-init
phase for each registered Ra server, mostly to recover the
ra_log_snapshot_state table. This appears to have been broken
for a long time, and the ra_log_snapshot_state table has not been
populated ahead of WAL / segment writer recovery. This was fine because
the table was just an optimisation and never affected the workings of
the Ra infrastructure.

However, since 5b7a265 the table needs to be populated
in order to avoid the segment writer crashing when it detects
a gap (caused by the WAL dropping entries lower than the current
snapshot).

This commit mostly fixes the pre-init process but also addresses
a potential race condition which could still cause the segment
writer to crash for the same reason.

See: rabbitmq/rabbitmq-server#11712
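
To illustrate the gap problem described above, here is a minimal Python sketch. It is not Ra's Erlang implementation: the names `trim_to_snapshot`, `entries_to_flush` and `GapError`, and the dict-based mem table, are all hypothetical. It assumes a flush request covers an inclusive index range and that entries at or below the snapshot index may legitimately be absent.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first_index, last_index); hypothetical shape


class GapError(Exception):
    """Raised when an index above the snapshot is missing from the mem table."""


def trim_to_snapshot(rng: Range, snapshot_index: int) -> Optional[Range]:
    """Drop the part of rng that the snapshot already covers.

    Returns None when the snapshot covers the whole range.
    """
    first, last = rng
    first = max(first, snapshot_index + 1)
    return (first, last) if first <= last else None


def entries_to_flush(mem_table: Dict[int, bytes], rng: Range,
                     snapshot_index: int) -> List[Tuple[int, bytes]]:
    """Collect the entries to write, tolerating gaps below the snapshot."""
    trimmed = trim_to_snapshot(rng, snapshot_index)
    if trimmed is None:
        return []
    first, last = trimmed
    out = []
    for idx in range(first, last + 1):
        if idx not in mem_table:
            # A gap above the snapshot is a genuine error.
            raise GapError(f"missing index {idx} above snapshot {snapshot_index}")
        out.append((idx, mem_table[idx]))
    return out
```

In this sketch, if the snapshot index is known (say 3) and the mem table only holds indexes 4 to 6, a flush request for 1 to 6 succeeds. If the snapshot index has not been recovered and is taken to be 0, the same request raises `GapError`, which mirrors why the snapshot state needs to be populated before recovery.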

src/ra_log_segment_writer.erl

the-mikedavis · 2024-07-18T16:15:58Z

tiny typo in the commit message & description: apperas => appears

kjnilsson · 2024-07-19T10:09:26Z

> tiny typo in the commit message & description: apperas => appears

fixed

kjnilsson changed the title ~~fixes~~ fixed to pre-init recovery and handling of gaps in mem tables due to snapshotting Jul 18, 2024

kjnilsson force-pushed the pre-init-bug-fixes branch from 4062585 to 8e2bbf8 Compare July 18, 2024 13:05

kjnilsson changed the title ~~fixed to pre-init recovery and handling of gaps in mem tables due to snapshotting~~ Fixes for pre-init recovery and handling of gaps in mem tables due to snapshotting Jul 18, 2024

kjnilsson requested a review from the-mikedavis July 18, 2024 16:00

the-mikedavis reviewed Jul 18, 2024

src/ra_log_segment_writer.erl (outdated, resolved)

kjnilsson force-pushed the pre-init-bug-fixes branch from 82c4cbc to a7ed36e Compare July 19, 2024 09:47

kjnilsson marked this pull request as ready for review July 19, 2024 10:03

kjnilsson merged commit 94cb3d2 into main Jul 19, 2024
9 checks passed

michaelklishin added this to the 2.13.2 milestone Jul 19, 2024
