High overhead of power measurements for CPU inference measurements #236

psyhtest · 2021-05-05T14:16:06Z

During the MLPerf Inference v1.0 round, I noticed that the power workflow when used with CPU inference occasionally seemed to incur a rather high overhead (~10%), for example:

Xavier with power measurements:
- ArmNN v21.02: 68.5 ms
- TFLite v2.4.1: 78.0 ms
Xavier without power measurements (compliance):
- ArmNN v21.02: 60.2 ms
- TFLite v2.4.1: 78.5 ms

Here, ArmNN is faster than TFLite but takes a big hit under the power workflow. TFLite, however, is not affected.

psyhtest · 2021-05-05T15:36:52Z

The observed behaviour is consistent across several runs:

Raspberry Pi 4 with power measurements:
- TFLite v2.4.1: 606.3 ms (power/testing), 602.4 ms (power/ranging).
Raspberry Pi 4 without power measurements:
- TFLite v2.4.1: 552.3 ms (compliance/TEST05), 550.4 ms (compliance/TEST01).

s-idgunji · 2021-05-05T16:08:16Z

Hi Anton - On Jetson AGX, we can check our runs with and without power to see if we see an impact of the power related SW overheads in communicating to the director PC. I will comment back on this issue.

psyhtest · 2021-05-15T17:41:20Z

@s-idgunji When running inference on the GPU, the CPU is typically not fully utilised, so you might not notice any difference. When running inference on the CPU in full-blast, the effect may be more pronounced.

s-idgunji · 2021-05-19T04:38:43Z

@psyhtest - Thanks, we were not able to notice a difference when we had workload going on GPU on the Xavier config that we submitted, but I agree that when you are using CPU only, then the effect may show up. We can independently try and replicate the behavior you observed. But we'll need to close on this soon and propose a fix that can go in so that we can validate prior the freeze date.

TheKanter · 2021-06-02T16:39:20Z

Anton - RP4 is pretty low-end CPU. I don't see any way around some sort of overhead. Is the overhead consistent or are you saying it's unpredictable?

psyhtest · 2021-06-02T19:15:59Z

The overhead seems to be consistent. What's worrying is that it shows up with ArmNN on Xavier which is like 8x more powerful than RPi4.

s-idgunji · 2021-06-09T06:33:47Z

Anton - Can you try on another system , perhaps Intel Celeron or Core based. Also , what can we suggest as a fix ? If we want to repro on our system at NVIDIA what specific config , benchmark/scenario would likely give the highest difference. We could test it out to see if it repros.

araghun · 2021-06-16T16:49:15Z

Power WG Thoights:

For v1.1: Try and solve it opportunistically. It is not a gating items.
For further revisions: We could take a look into it.

This is mainly due to the resources available to solve this problem. Main issue is there is no minimum bar for the software flow.

araghun · 2021-07-14T16:25:31Z

MLPerf Power WG: This will be punted to after v1.1. issue will remain open.

s-idgunji · 2022-08-05T02:20:07Z

@psyhtest - perhaps you want to close this if the issue is not valid anymore, or we need to revisit in the WG. Quite an old issue.

arjunsuresh · 2022-12-25T18:56:05Z

Is this issue still happening?

psyhtest self-assigned this May 5, 2021

psyhtest added the Investigate Investigate/Research label May 5, 2021

psyhtest added this to the v1.1 milestone May 5, 2021

psyhtest mentioned this issue Aug 3, 2021

Yokogawa WT310E ranging failures at under 0.2 A #253

Closed

arjunsuresh mentioned this issue May 4, 2023

Fixes #288, #289, inference_repo issue number 1335 #298

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

High overhead of power measurements for CPU inference measurements #236

High overhead of power measurements for CPU inference measurements #236

psyhtest commented May 5, 2021

psyhtest commented May 5, 2021 •

edited

Loading

s-idgunji commented May 5, 2021

psyhtest commented May 15, 2021

s-idgunji commented May 19, 2021

TheKanter commented Jun 2, 2021

psyhtest commented Jun 2, 2021

s-idgunji commented Jun 9, 2021

araghun commented Jun 16, 2021

araghun commented Jul 14, 2021

s-idgunji commented Aug 5, 2022

arjunsuresh commented Dec 25, 2022

High overhead of power measurements for CPU inference measurements #236

High overhead of power measurements for CPU inference measurements #236

Comments

psyhtest commented May 5, 2021

psyhtest commented May 5, 2021 • edited Loading

s-idgunji commented May 5, 2021

psyhtest commented May 15, 2021

s-idgunji commented May 19, 2021

TheKanter commented Jun 2, 2021

psyhtest commented Jun 2, 2021

s-idgunji commented Jun 9, 2021

araghun commented Jun 16, 2021

araghun commented Jul 14, 2021

s-idgunji commented Aug 5, 2022

arjunsuresh commented Dec 25, 2022

psyhtest commented May 5, 2021 •

edited

Loading