[server] ReadWriteLock on LeaderFollowerState #1251
+217 −71
Background
TODO:
Changes
- Added a ReadWriteLock to the LeaderFollowerState in PartitionConsumptionState.
- In produceToStoreBufferServiceOrKafka() on the consumer's code path, the read lock must be acquired for the duration of the message's processing in order for the condition of shouldProcessMessage() to hold. This also applies to produceToStoreBufferServiceOrKafkaInBatch().
- Any thread that modifies the LeaderFollowerState as part of a state transition would need to wait for this consumer thread to finish processing the message and release the read lock.
- Added testShouldProcessRecord(), which simulates the scenario where the consumer thread processes a batch of polled messages while another thread modifies the leader-follower state in the PCS. It specifically tests a follower-to-leader transition and verifies that the leader-follower state in the PCS can't be modified while the consumer thread is processing messages.

Correctness
TODO:
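The locking scheme described under Changes can be sketched roughly as follows. This is a simplified illustration using java.util.concurrent.locks; the class, enum, and method names mirror the identifiers mentioned above but are stand-ins, not Venice's actual implementation:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified stand-ins for the real Venice classes; names are illustrative only.
enum LeaderFollowerState {
  FOLLOWER, IN_TRANSITION, LEADER
}

class PartitionConsumptionState {
  private LeaderFollowerState state = LeaderFollowerState.FOLLOWER;
  private final ReadWriteLock stateLock = new ReentrantReadWriteLock();

  LeaderFollowerState getState() {
    stateLock.readLock().lock();
    try {
      return state;
    } finally {
      stateLock.readLock().unlock();
    }
  }

  // Writer side: a state transition only flips the enum, so the write
  // critical section is effectively instantaneous once the lock is acquired.
  void setState(LeaderFollowerState newState) {
    stateLock.writeLock().lock();
    try {
      this.state = newState;
    } finally {
      stateLock.writeLock().unlock();
    }
  }

  // Reader side: hold the read lock for the whole message-processing span, so
  // the condition checked by shouldProcessMessage() cannot change mid-message.
  void processMessage(String message) {
    stateLock.readLock().lock();
    try {
      if (shouldProcessMessage()) {
        // ... produce to the store buffer service or Kafka ...
      }
    } finally {
      stateLock.readLock().unlock();
    }
  }

  private boolean shouldProcessMessage() {
    return state != LeaderFollowerState.IN_TRANSITION; // illustrative condition
  }
}
```

Because the read lock spans the entire processing of a message, a concurrent setState() call cannot invalidate the shouldProcessMessage() check partway through.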
Performance Impact
Writers could be bottlenecked by produceToStoreBufferServiceOrKafka(), because that is the only location where the lock is held for a long period of time. If a writer is waiting on the write lock, it must wait for all readers to release their locks. All future readers will also need to wait until the writer is done, because the writer has priority, so they are also bottlenecked by produceToStoreBufferServiceOrKafka(). Since the writer only updates an enum value, the critical section should be instantaneous once it manages to acquire the lock, and any slowdown from acquiring the write lock should be minimal.

How was this PR tested?
CI
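Beyond CI, the scenario exercised by testShouldProcessRecord() can be reproduced with a small standalone sketch (hypothetical code, not the PR's actual test): a consumer thread holds the read lock while it drains a batch, and a concurrent follower-to-leader transition blocks on the write lock until the batch finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LeaderFollowerLockDemo {
  // Returns true when the state transition had to wait for the batch to finish.
  static boolean run() throws InterruptedException {
    ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    AtomicBoolean batchDone = new AtomicBoolean(false);
    AtomicBoolean transitionSawBatchDone = new AtomicBoolean(false);
    CountDownLatch readLockHeld = new CountDownLatch(1);

    // Consumer thread: holds the read lock while "processing" a batch.
    Thread consumer = new Thread(() -> {
      lock.readLock().lock();
      try {
        readLockHeld.countDown();
        try {
          Thread.sleep(200); // simulate processing a batch of polled messages
        } catch (InterruptedException ignored) {
        }
        batchDone.set(true);
      } finally {
        lock.readLock().unlock();
      }
    });

    // Transition thread: attempts a follower-to-leader transition mid-batch.
    Thread transition = new Thread(() -> {
      try {
        readLockHeld.await(); // ensure the consumer already holds the read lock
      } catch (InterruptedException ignored) {
      }
      lock.writeLock().lock(); // blocks until the consumer releases the read lock
      try {
        transitionSawBatchDone.set(batchDone.get());
      } finally {
        lock.writeLock().unlock();
      }
    });

    consumer.start();
    transition.start();
    consumer.join();
    transition.join();
    return transitionSawBatchDone.get();
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("transition waited for batch: " + run());
  }
}
```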
Does this PR introduce any user-facing changes?