Rocksdb manual flush code changes #11849
base: main
Conversation
Result of foundationdb-pr-clang-ide on Linux CentOS 7
Result of foundationdb-pr on Linux CentOS 7
Result of foundationdb-pr-clang on Linux CentOS 7
Result of foundationdb-pr-clang-arm on Linux CentOS 7
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
// lastFlushTime is used by two threads: one thread reads the value and the
// other updates it. If the reader thread gets a stale value due to the race,
// that is still fine in this case (at worst an extra flush, or a skipped
// flush). Considering the cost of an atomic, we avoided one here.
Generally an intentional race condition is tricky to reason about, especially because tools like TSAN would still flag it, so we'd have to find a way to suppress the report. We don't use TSAN today, but it could be useful in the future, especially as RocksDB does background work off the main thread, we're trying out gRPC, etc.

Curious about the performance cost of an atomic here: a good first-order approximation is around ~10ns (the upper bound could be around 100ns, depending on various factors, e.g. if the compiler uses an atomic CAS and there is contention, there will be retries). So unless we're writing/reading lastFlushTime a lot of times, and in the critical path, it should be ok to use. As an example, if you have a tight loop in the critical path which uses an atomic, and that loop runs 1M iterations/sec, the wall-clock overhead could be up to 10-100ms per second, which is very high. But my understanding is that here lastFlushTime would be written by the RocksDB event listener callback (amortized over 1 sec, the number of flushes should be well below 1), and read roughly every ROCKSDB_MANUAL_FLUSH_TIME_INTERVAL (10 seconds?), so overall we're looking at a few ns of overhead amortized over a second.

Thought to share in case it helps evaluate the tradeoff (perf cost vs. reasoning about the race).
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch (or main, if this is the youngest branch)