
Update Benchmark setup and persist benchmarks #180

Open

mimischi wants to merge 7 commits into main

Conversation

mimischi (Member)

We have recently updated our toolchain version to 5.9, but have never updated the benchmark suite to use 5.9 or newer. Also, it does not look like we ever committed a baseline to this repository.

While at it, this wires up GitHub Actions to run the benchmarks on every pull request, as well as on merges into `main`. It re-uses a lot of the heavy lifting already defined in `apple/swift-nio`.

Note that `nightly-main` appears broken at the moment. The Benchmark tool gets stuck, and initial debugging points to it never finishing `_sendAndAcknowledgeMessages` when producing a set of messages before every benchmark.

@mimischi mimischi requested review from FranzBusch and rnro November 25, 2024 13:09
@mimischi mimischi added the semver/none No version bump required. label Nov 25, 2024
@mimischi (Member, Author)

Right. So the benchmark performs worse on GitHub Actions than on my local machine. What's the blessed approach to introducing a baseline, then?

@Lukasa commented Nov 25, 2024

Extract the values from GH actions and use those as the baseline, sadly.

@rnro (Contributor) left a comment


Overall this looks really good, it'll be great to have the benchmarks up and running in CI. Just some small comments.

Unfortunately, as Cory pointed out, we don't have a great story for updating the thresholds right now, but it's something I'm hoping to improve.

Outdated review comments (resolved) on:
- .github/workflows/unit_tests.yml
- .github/workflows/pull_request.yml
- .github/workflows/cxx_interop.yml
- .github/workflows/main.yml
- .github/workflows/benchmarks.yml
Comment on lines 1 to 10:

```json
{
  "allocatedResidentMemory" : 77266944,
  "cpuTotal" : 200000000,
  "objectAllocCount" : 5549,
  "releaseCount" : 15168,
  "retainCount" : 7108,
  "retainReleaseDelta" : 2511,
  "throughput" : 2,
  "wallClock" : 695307500
}
```

In most of our repositories we only add benchmark thresholds for `mallocCountTotal`, because it is a stable measure. We don't yet have the dedicated runners that would be required for time-based benchmarks.

see e.g. https://github.com/apple/swift-nio/blob/main/Benchmarks/Thresholds/5.9/NIOCoreBenchmarks.NIOAsyncChannel.init.p90.json
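For illustration, a thresholds file restricted to `mallocCountTotal` would follow the same shape as the JSON above, just with a single metric (the value below is a placeholder, not an actual baseline):

```json
{
  "mallocCountTotal" : 1000
}
```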

Changes the benchmark to iterate over the consume loop a total of 1000
times. We now produce 1000*1000 (1e6) messages into the topic, and on
every benchmark iteration consume 1000 messages each.

I had to comment out the `librdkafka_with_offset_commit_messages_*`
benchmark. For some reason, the benchmark suite keeps re-running it, and
I have seen occasional failures when attempting to commit offsets. It's
unclear to me why that happens right now, but I decided it's not worth
investigating at the moment.
`swift build` was happy before, for whatever reason.
Keeps `scalingFactor: .kilo`, but only measures `.mallocCountTotal`, as
that should be a reproducible value across different systems.
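As a rough sketch of what such a declaration looks like with the package-benchmark library used by swift-nio (the benchmark name and loop body below are illustrative, not the actual suite code):

```swift
import Benchmark

let benchmarks = {
    Benchmark(
        "SwiftKafkaConsumer_basic_consumer_messages_1000",
        configuration: .init(
            // Only track malloc counts: stable and reproducible across machines,
            // unlike wall-clock or CPU metrics on shared CI runners.
            metrics: [.mallocCountTotal],
            // One scaled iteration corresponds to 1000 inner loop passes.
            scalingFactor: .kilo
        )
    ) { benchmark in
        for _ in benchmark.scaledIterations {
            // Consume one message from the pre-produced topic here.
        }
    }
}
```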
3 participants