Unable to reproduce iOS benchmark values #243

kanaukou-google · 2023-08-24T16:15:27Z

I am unable to reproduce claimed benchmark values for running Stable Diffusion model on iOS devices.

Configuration

MBP M1 Max, 64 GB RAM, macOS 13.5
Xcode 15.0.0 beta 5
iPhone 14 Pro iOS 17 beta 4

Steps to reproduce:

Build and run the example app from the sources with the said Xcode version on the said device.
Within the app, press 'Generate'. On completion, observe the processing time in the label below.
Whether it's a cold run or a consecutive run, the latency never falls below 11s, which is ~20% slower than the claimed benchmarks for iPhone 14 Pro Max.

Could anyone clarify whether there is anything specific I need to adjust in my setup in order to reproduce the results?

Should I use a different example Swift app?
Should I use the specific Xcode 15 / iOS 17 beta versions that were used for the benchmarks?
Is there a difference between iPhone 14 Pro and iPhone 14 Pro Max that results in such a discrepancy?

atiorh · 2023-08-24T19:11:11Z

Hello! Your setup sounds accurate to me. Which version of Stable Diffusion are you benchmarking?

kanaukou-google · 2023-08-24T22:04:29Z

I made no changes to the example app, and it seems to use stable-diffusion-2-1-base-palettized for my setup while benchmarks refer to stable-diffusion-2-1-base.

atiorh · 2023-08-25T06:59:26Z

That is also correct. A few things:

We benchmarked with iOS 17 Seed 1 and 7f9c58a (the commit on main with the 1.0.0 release), so if any performance number is off with Seed 5 and the current main commit, it would be a regression we need to investigate.
@pcuenca Do you mind testing with Seed 5 to see if your previous numbers on 14 Pro regress too?
@kanaukou-google Are you observing the 2.3 iter/sec or is that also lower? It will give us a sense of whether ML perf degraded or some non-ML code is slower for some yet unknown reason.

kanaukou-google · 2023-08-25T16:50:59Z

@atiorh I added a log statement above L117 to print the supposed iter/sec value, and got values around 2.72 for several consecutive runs.

atiorh · 2023-08-27T06:50:21Z

Hmm, that sounds even faster than what we published (2.72 vs 2.3 iter/sec) and it should have finished in ~8 seconds with that throughput. I will wait for Pedro to repro his measurements from June and also rerun our measurements on Seed 5 this week.
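A rough way to relate throughput to wall-clock time is to divide the number of denoising steps by the it/s value. The sketch below is only that arithmetic, written in Python for illustration. It assumes the it/s figure covers the denoising loop alone (text encoding, image decoding, and resource loading are not modeled) and uses the step counts of 20 and 25 that come up later in this thread. The function name is illustrative and not part of the app.

```python
def denoising_seconds(steps: int, iters_per_sec: float) -> float:
    """Estimate the time spent in the denoising loop alone."""
    if iters_per_sec <= 0:
        raise ValueError("iters_per_sec must be positive")
    return steps / iters_per_sec


# Published throughput (2.3 it/s) vs. the value logged in this thread (2.72 it/s).
for its in (2.3, 2.72):
    for steps in (20, 25):
        print(f"{its} it/s, {steps} steps: {denoising_seconds(steps, its):.1f} s")
```

At 2.72 it/s this gives roughly 7.4 s for 20 steps and 9.2 s for 25 steps, so an observed 11 s would suggest that a noticeable share of the time is spent outside the denoising loop.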

atiorh · 2023-08-27T06:54:22Z

@kanaukou-google Oh one more thing, could you please verify that reduceMemory is not enabled? It will add 1-2 seconds for loading/unloading resources during generation.

pcuenca · 2023-08-28T10:05:21Z

I no longer have access to the iPhone 14 Pro I used for the tests, but I repeated them on my iPhone 13 Pro running iOS 17 beta 7 (21A5319a). Some observations:

reduceMemory was indeed defaulting to true because of this test. This beta of iOS 17 reports 5917753344 bytes of physicalMemory. I knew this varies among devices, but this number is lower than the lower threshold I had set before.
The number of scheduling steps we used to benchmark is 20, whereas the app's default is 25.
The app now uses in-progress previews that we need to disable to replicate the original benchmark testing conditions.
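The first observation hinges on comparing the reported physicalMemory against a fixed threshold. Below is a minimal sketch of that kind of check, in Python for illustration only. The app itself is Swift, and the function name and both threshold values are hypothetical, since the thread does not state the app's actual threshold.

```python
GIB = 1024 ** 3


def should_reduce_memory(physical_bytes: int, threshold_bytes: int) -> bool:
    """Return True when the device reports less memory than the threshold."""
    return physical_bytes < threshold_bytes


# Value reported by iOS 17 beta 7 on iPhone 13 Pro, as quoted above.
reported = 5_917_753_344
print(f"reported: {reported / GIB:.2f} GiB")

# Hypothetical thresholds, chosen only to show how sensitive the check is.
nominal_6gb = 6_000_000_000   # "6 GB" in decimal bytes (assumption)
lower_bound = int(5.5 * GIB)  # a more lenient cut-off (assumption)

print(should_reduce_memory(reported, nominal_6gb))  # reported value is below this
print(should_reduce_memory(reported, lower_bound))  # reported value is above this
```

The point of the sketch is that a nominally 6 GB device can report noticeably less than 6 GB, so a threshold set too close to the nominal size can flip the default.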

To reduce ambiguity, I pushed this branch to replicate the benchmark conditions (to the best of my recollection) using the latest code.

Using that branch, I got the following results on 5 consecutive runs using Xcode 15.0 beta 7 on iPhone 13 Pro running iOS 17 beta 7 (21A5319a):

time	9.8	9.5	9.0	9.1	9.6
it/s	2.31	2.36	2.49	2.48	2.24

9.5s is faster than the original 12s observed for the same device back in June.

Also note that I ran the tests after a reboot, after waiting for the device to cool, and detached from Xcode.
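For reference, the five runs above can be summarized with a few lines of Python. This is only a sketch of the arithmetic: the variable names are illustrative, and the 12s June figure is taken from the comment above.

```python
from statistics import mean, median

# iPhone 13 Pro, iOS 17 beta 7, five consecutive runs (values from the table above).
times_s = [9.8, 9.5, 9.0, 9.1, 9.6]
iters_per_sec = [2.31, 2.36, 2.49, 2.48, 2.24]

june_time_s = 12.0  # earlier measurement on the same device, per this thread

print(f"mean time:   {mean(times_s):.2f} s")
print(f"median time: {median(times_s):.2f} s")
print(f"mean it/s:   {mean(iters_per_sec):.2f}")

# Relative reduction in latency compared with the June measurement.
reduction = (june_time_s - median(times_s)) / june_time_s
print(f"reduction vs. June: {reduction:.0%}")
```

The median of the five times is 9.5s and the mean is 9.4s, which works out to roughly a 21% reduction relative to 12s.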

pcuenca · 2023-08-28T10:36:30Z

I can also repeat the tests on seed 5 of Xcode if that's useful.

kanaukou-google · 2023-08-28T16:22:50Z

I checked out @pcuenca's PR and tried reproducing it on a cool, freshly rebooted iPhone 14 Pro (iOS 17 beta 4), detached from Xcode.

Below are results for 5 consecutive runs (meaning 5 times pressed 'Generate' button after processing is complete without restarting the app or changing the prompt).

time	7.9	7.9	7.9	7.9	8.0
it/s	2.69	2.69	2.68	2.69	2.69

The results look even better than the benchmark values! I wonder, though, whether such consistency across 5 runs (within 0.1s for time and 0.01 it/s for throughput) is expected?
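To put the consistency question in numbers, here is a small Python sketch comparing the run-to-run spread of the two sets of results posted in this thread. The helper name and data layout are illustrative. It does not say whether the tighter spread is expected, only how large the difference is.

```python
def spread(values: list[float]) -> float:
    """Range of a series: largest value minus smallest value."""
    return max(values) - min(values)


# Values copied from the two tables in this thread.
runs = {
    "iPhone 13 Pro": {
        "time_s": [9.8, 9.5, 9.0, 9.1, 9.6],
        "it_per_s": [2.31, 2.36, 2.49, 2.48, 2.24],
    },
    "iPhone 14 Pro": {
        "time_s": [7.9, 7.9, 7.9, 7.9, 8.0],
        "it_per_s": [2.69, 2.69, 2.68, 2.69, 2.69],
    },
}

for device, data in runs.items():
    print(
        f"{device}: time spread {spread(data['time_s']):.1f} s, "
        f"throughput spread {spread(data['it_per_s']):.2f} it/s"
    )
```

The iPhone 13 Pro runs vary by 0.8s and 0.25 it/s, while the iPhone 14 Pro runs vary by 0.1s and 0.01 it/s.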

atiorh · 2023-08-28T16:42:17Z

Thank you both for the time spent on this @pcuenca @kanaukou-google! Our inference stack is consistently improving! We will rerun our internal benchmarks with the latest public seed of iOS 17 and update our numbers.

kanaukou-google · 2023-08-28T16:53:36Z

No problem, glad we figured this out! Just one quick question: what would be the best way to approach benchmarking in the future? It looks like some changes from @pcuenca's PR need to be applied to get proper results.

pcuenca · 2023-08-28T18:38:46Z

That's a good point @kanaukou-google! I'll add a BENCHMARK constant to control the configuration, add an entry to the README and merge the PR. We'll still need to pay attention when we introduce features that may impact the benchmark code.

atiorh · 2023-08-30T21:05:04Z

Updated the benchmarks with the latest public seed using @pcuenca's benchmarking branch. Thanks @TBPer for the benchmarking runs!

atiorh closed this as completed in a56e102 Aug 30, 2023
