Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't update the lights UBO if it hasn't changed #5972

Closed

Conversation

fluffyfreak
Copy link
Contributor

@fluffyfreak fluffyfreak commented Nov 17, 2024

After doing some profiling on the RPi:

  • SetLights cost 65% (34.6 MCycles out of 53.1) of a frame (new game -> Mars)
  • calculate and store the hash_64_fnv1a to see if lights have changed per-call
  • reduces cost to 0.0003 MCycles. Crude measure but fps went from 18 -> 25

This doesn't really remove the cost but it avoids updating the UBO needlessly if the data hasn't changed between calls to SetLights.

Real Results

In the best case this performs as above with a massive reduction in the cost of SetLights, for example on the pad at Mars start.
Really though this simply performs no worse most of the time and occassionally makes a large difference.

Tagging @Web-eWorks as this is an area you've been working on and I just tried this on a whim to see what would happen 🤷‍♂️ no worries if it's not suitable but I thought it worth trying.

Would you mind double checking the results when you have time?

Testing

Started on Mars, flew to Torvalds (Earth) station and docked, left and hyperspaced to Barnards Star, flew to SysAdminResting and docked. Everything seemed lit correctly at all times. Menus all worked.
Test on Windows and RPi5

- SetLights cost 65% (34.6 MCycles out of 53.1) of a frame
- calculate and store the hash_64_fnv1a to see if lights have changed per-call
- reduces cost to 0.0003 MCycles
Crude measure but fps went from 18 -> 25
@sturnclaw
Copy link
Member

@fluffyfreak can you check #5970 and see if you see a similar performance improvement? I strongly suspect the improvement you're seeing is because the GL driver has to sync with all of the prior GPU work before updating the buffer (and conditionally not updating the buffer will only solve the problem partially) - 5970 handles this in a different manner by removing the dependency against the previous frame entirely.

@sturnclaw
Copy link
Member

I'd appreciate a mid-frame zone trace from the RPi on #5970 btw, the PR contains some additional profiling points. Also note that on the RPi (since it's using the USE_CHRONO profiler backend) 1 MCycle == 1.0ms (and 1 cycle = 1ns).

@fluffyfreak
Copy link
Contributor Author

#5970 gives the same or better performance improvement, closing

@fluffyfreak
Copy link
Contributor Author

pioneer-RPi5.zip
@Web-eWorks here's a couple of profiles from the RPi5, Mars start, sat on the pad as always

@fluffyfreak fluffyfreak deleted the dont-update-lights branch November 17, 2024 20:55
@sturnclaw
Copy link
Member

Any differences between the two profiles? I'm seeing the later one is half the frame time - did you change graphics settings or view orientation?

@sturnclaw
Copy link
Member

Ah, I don't see a CityOnPlanet::Render in the second profile. Seems like we're close to hitting 60FPS on RPi5 in that profile though, which is something I'll happily take!

@fluffyfreak
Copy link
Contributor Author

2nd one I think I'd taken off actually 🤦‍♂️ so yeah probably a bit different!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants