
trim the performance measurement data to remove artifacts of "startup" and "cooldown" while reporting and evaluating #243

Open
jtluka opened this issue May 19, 2022 · 6 comments

Comments

@jtluka
Collaborator

jtluka commented May 19, 2022

When an ENRT recipe measures performance, for example Recipes.ENRT.SimpleNetworkRecipe, there are two artifacts, "startup" and "cooldown", that can impact both the reporting and, eventually, the evaluation of the CPUStatMeasurement and IperfFlowMeasurement results.

For example, the following graph on the left side shows CPUStatMeasurement samples (1 sample/second) from 5 iterations. Each iteration differs at the beginning and end of the measurement; this is a source of variance when the recipe is run and makes comparison of these runs complicated.

[graph: CPUStatMeasurement samples from 5 iterations]

A possible solution is to add a specified amount of time to both the beginning and end of the measurement and then trim the measured samples from both ends when reporting and evaluating.

For example, the added time could be specified by a perf_duration_warmup parameter. The total iperf duration would then be perf_duration + perf_duration_warmup*2. Once the measurement completes, the samples would be trimmed to keep only those between the perf_duration_warmup and perf_duration_warmup + perf_duration timestamps.
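A minimal sketch of the proposed trimming, assuming samples are (timestamp, value) pairs with timestamps relative to measurement start; `trim_samples` is a hypothetical helper, only the `perf_duration_warmup` parameter name comes from the proposal above:

```python
def trim_samples(samples, warmup, duration):
    """Keep only samples inside the measured window.

    samples: list of (timestamp, value) pairs, timestamps in seconds
    relative to measurement start; the total run is expected to last
    warmup + duration + warmup seconds.
    """
    start = warmup
    end = warmup + duration
    return [(ts, v) for ts, v in samples if start <= ts < end]

# example: 1 sample/second, perf_duration_warmup=2, perf_duration=5,
# so the run produces 2 + 5 + 2 = 9 samples
samples = [(t, 100 + t) for t in range(9)]
trimmed = trim_samples(samples, warmup=2, duration=5)
# trimmed covers timestamps 2..6
```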

@jtluka
Collaborator Author

jtluka commented May 19, 2022

One more note: for IperfFlowMeasurement this is not possible on the reported CPU usage samples, because those are not sampled per second. Only the throughput samples can be trimmed that way.

@enhaut
Member

enhaut commented May 25, 2022

Do we want to cut just a constant number of samples? There is also a way to detect the warmup and cut just those samples.
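One possible detection heuristic (purely a sketch, not part of LNST, and `detect_warmup` is a hypothetical name): treat the median of the second half of the run as the steady state and find the first sample that falls within a relative tolerance of it.

```python
import statistics

def detect_warmup(values, tolerance=0.1):
    """Return the index of the first sample within `tolerance`
    (relative) of the steady-state value, estimated as the median
    of the second half of the run. Heuristic sketch only."""
    steady = statistics.median(values[len(values) // 2:])
    for i, v in enumerate(values):
        if abs(v - steady) <= tolerance * abs(steady):
            return i
    return 0

# throughput ramping up over the first few seconds
throughput = [10, 40, 80, 98, 100, 101, 99, 100, 97, 60]
start = detect_warmup(throughput)
```

As discussed below, such heuristics can misfire on tests with unexpected behaviour, which is why a fixed cut is the safer first step.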

@olichtne
Collaborator

Detection could be difficult, so I think a constant cut is probably going to be fine. However, if you have a good proposal for how to detect this, I'd be interested in that version as well.

@jtluka
Collaborator Author

jtluka commented May 25, 2022

I'd like to have a deterministic way first, so cutting a specific number of samples is OK for now. Automated detection could be prone to unexpected behaviour of a test, so I'm a bit skeptical about it. Still, if you have a proposal, create a separate issue; you can work on that once the simple cut mechanism is implemented and merged.

@jtluka
Collaborator Author

jtluka commented May 25, 2022

From our experience, cutting 2-3 samples from the start/end of a test is a good starting point.
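Cutting a fixed number of samples from each end is then a simple slice; `cut_edges` is an illustrative helper name, and the default of 3 reflects the 2-3 samples suggested above:

```python
def cut_edges(samples, n=3):
    """Drop the first and last n samples (e.g. per-second CPU or
    throughput samples) to remove startup and cooldown artifacts."""
    if n <= 0:
        return list(samples)
    return list(samples)[n:-n]

per_second = list(range(10))
assert cut_edges(per_second, n=2) == [2, 3, 4, 5, 6, 7]
```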

@enhaut
Member

enhaut commented Jul 26, 2022

Implemented in #248
