Rocksdb manual flush code changes #11849
base: main
Conversation
Result of foundationdb-pr-clang-ide on Linux CentOS 7
Result of foundationdb-pr on Linux CentOS 7
Result of foundationdb-pr-clang on Linux CentOS 7
Result of foundationdb-pr-clang-arm on Linux CentOS 7
Result of foundationdb-pr-cluster-tests on Linux CentOS 7
// lastFlushTime is used by two threads: one thread reads the value and the
// other updates it. If the reader thread gets a stale value due to the race,
// that is still fine in this case (at worst an extra flush, or a skipped
// flush). Considering the cost of an atomic, we avoided one here.
Generally an intentional race condition is tricky to reason about, especially because tools like TSAN would still flag it, so we'd have to find a way to suppress the report. We don't use TSAN today, but it could be useful in the future, especially as RocksDB does background work off the main thread, we're trying out gRPC, etc.

Curious about the performance cost of an atomic here: a good first-order approximation is around ~10ns (the upper bound could be around 100ns, depending on various factors, e.g. if the compiler uses an atomic CAS and there is contention, there will be retries). So unless we're writing/reading lastFlushTime a lot of times, and in the critical path, it should be ok to use. As an example, if you have a tight loop in the critical path which uses an atomic, and that loop runs 1M iterations/sec, the wall-clock overhead could be up to 10-100ms per second, which is very high. But my understanding is that here lastFlushTime would be written by the RocksDB event listener callback (amortized over 1 sec, the number of flushes should be well below 1), and read roughly every ROCKSDB_MANUAL_FLUSH_TIME_INTERVAL (10 seconds?), so overall we're looking at a few ns of overhead amortized over a second.

Thought to share in case it helps evaluate the tradeoff (perf cost vs. reasoning about the race).
Code-Reviewer Section
The general pull request guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch (or main, if this is the youngest branch)