
trim the performance measurement data to remove artifacts of "startup" and "cooldown" while reporting and evaluating #243

Open
jtluka opened this issue May 19, 2022 · 6 comments

Comments

@jtluka
Collaborator

jtluka commented May 19, 2022

When an ENRT recipe measures performance, for example Recipes.ENRT.SimpleNetworkRecipe, there are two artifacts, "startup" and "cooldown", that can impact both the reporting and, eventually, the evaluation of the CPUStatMeasurement and IperfFlowMeasurement results.

For example, the following graph on the left side shows CPUStatMeasurement samples (1 sample/second) from 5 iterations. Each iteration differs at the beginning and end of the measurement; this is a source of variance when the recipe is run and makes comparison of these runs complicated.

[graph: CPUStatMeasurement samples from 5 iterations]

A possible solution is to add a specified amount of time to both the beginning and end of the measurement and then trim the measured samples from both ends when reporting and evaluating.

For example, the added time could be specified by a perf_duration_warmup parameter. The total iperf duration would then be perf_duration + perf_duration_warmup*2. Once the measurement completes, the samples would be trimmed to keep only those between the perf_duration_warmup and perf_duration_warmup + perf_duration timestamps.
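A minimal sketch of the proposed trimming, assuming samples are (timestamp, value) pairs with timestamps relative to measurement start; `trim_samples` is a hypothetical helper, only the `perf_duration_warmup` parameter name comes from the proposal above:

```python
def trim_samples(samples, warmup, duration):
    """Keep only samples inside the measured window.

    samples: list of (timestamp, value) pairs, timestamps in seconds
    relative to measurement start; the total run is expected to last
    warmup + duration + warmup seconds.
    """
    start = warmup
    end = warmup + duration
    return [(ts, v) for ts, v in samples if start <= ts < end]

# example: 1 sample/second, perf_duration_warmup=2, perf_duration=5,
# so the run produces 2 + 5 + 2 = 9 samples
samples = [(t, 100 + t) for t in range(9)]
trimmed = trim_samples(samples, warmup=2, duration=5)
# trimmed covers timestamps 2..6
```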

@jtluka
Collaborator Author

jtluka commented May 19, 2022

One more note: for IperfFlowMeasurement this is not possible on the reported CPU usage samples, because those are not sampled per second. Only the throughput samples can be trimmed that way.

@enhaut
Member

enhaut commented May 25, 2022

Do we want to cut just a constant number of samples? There is also a way to detect the warmup and cut just those samples.
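One possible detection heuristic (purely a sketch, not part of LNST, and `detect_warmup` is a hypothetical name): treat the median of the second half of the run as the steady state and find the first sample that falls within a relative tolerance of it.

```python
import statistics

def detect_warmup(values, tolerance=0.1):
    """Return the index of the first sample within `tolerance`
    (relative) of the steady-state value, estimated as the median
    of the second half of the run. Heuristic sketch only."""
    steady = statistics.median(values[len(values) // 2:])
    for i, v in enumerate(values):
        if abs(v - steady) <= tolerance * abs(steady):
            return i
    return 0

# throughput ramping up over the first few seconds
throughput = [10, 40, 80, 98, 100, 101, 99, 100, 97, 60]
start = detect_warmup(throughput)
```

As discussed below, such heuristics can misfire on tests with unexpected behaviour, which is why a fixed cut is the safer first step.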

@olichtne
Collaborator

Detection could be difficult, so I think a constant cut is probably going to be fine. However, if you have a good proposal for how to detect this, I'd be interested in that version as well.

@jtluka
Collaborator Author

jtluka commented May 25, 2022

I'd like to have a deterministic way first, so cutting a specific number of samples is OK for now. Automated detection could be prone to unexpected behaviour of a test, so I'm a bit skeptical about it. Still, if you have a proposal, create a separate issue; you can work on that once the simple cut mechanism is implemented and merged.

@jtluka
Collaborator Author

jtluka commented May 25, 2022

From our experience, cutting 2-3 samples from the start/end of a test is a good starting point.
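Cutting a fixed number of samples from each end is then a simple slice; `cut_edges` is an illustrative helper name, and the default of 3 reflects the 2-3 samples suggested above:

```python
def cut_edges(samples, n=3):
    """Drop the first and last n samples (e.g. per-second CPU or
    throughput samples) to remove startup and cooldown artifacts."""
    if n <= 0:
        return list(samples)
    return list(samples)[n:-n]

per_second = list(range(10))
assert cut_edges(per_second, n=2) == [2, 3, 4, 5, 6, 7]
```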

@enhaut
Member

enhaut commented Jul 26, 2022

Implemented in #248
