
[FIRE 35011] Weird patterned extreme CPU usage when using more than 6gb vram on 10g card - WIP Test #66

Closed
wants to merge 5 commits into from

Conversation

minerjr
Contributor

@minerjr minerjr commented Jan 16, 2025

Hi, so I have been looking into this issue, [FIRE-35011](https://jira.firestormviewer.org/browse/FIRE-35011), which I think is related to other users' performance issues that have been submitted.

I created a couple of videos walking through what the issue is and my solution. This is a work in progress, but I wanted to get some feedback on the approach.

[FIRE 35011 Progress - Part 1](https://youtu.be/ulHrdk4Wc8A)
[FIRE 35011 Progress - Part 2](https://youtu.be/NCqOwPDU-9Q)

Showing the CPU not spiking when switching from high memory usage to lowered:
(Screenshot: FIRE-35011-Progress)

The issue stems from a low "VRAM" state, where the bias starts to increase. The system has an initial spike at a bias of 1.5, which is designed to purge the off-screen textures, and another at 2.0, where in
void LLViewerTextureList::updateImageDecodePriority(LLViewerFetchedTexture* imagep, bool flush_images)
the system then forces on-screen textures to downscale as well.

Also, the system uses the bias to determine how many objects to work on: the higher the value, the more textures it works on. Up to 80% of the entire texture pool can be selected for processing in one frame. This is what adds all the fetches when the bias is greater than 2; even textures in camera get downscaled.
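To make that concrete, here is a rough sketch of how a rising bias could translate into a larger slice of the texture pool being touched each frame. This is not the actual viewer code; the constants and the function name are made up for illustration, and gDiscardBias stands in for the viewer's discard bias value.

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative only: a rising discard bias selects a larger fraction of the
// texture pool to be reprocessed each frame, capped at roughly 80%.
float gDiscardBias = 1.0f;  // 1.0 = no memory pressure

std::size_t texturesToProcessThisFrame(std::size_t totalTextures)
{
    float fraction = 0.05f;                          // baseline slice per frame
    if (gDiscardBias > 1.0f)
    {
        fraction += 0.375f * (gDiscardBias - 1.0f);  // grows as the bias climbs
    }
    fraction = std::min(fraction, 0.80f);            // up to ~80% of the whole pool
    return static_cast<std::size_t>(fraction * static_cast<float>(totalTextures));
}
```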

Then the cycle begins. Now that the textures have been deleted, the bias decreases and there is free memory again, and, as in
F32 LLViewerTextureList::updateImagesFetchTextures(F32 max_time),
it does another update with up to 40% of the textures, and they all at once request a higher texture memory size (mDesiredDiscard, where 0 is the maximum resolution and 5 (MAX_DISCARD) is the upper limit, although in OpenGL the numbers run in reverse).

They all suddenly start creating new textures and trying to load from the various caches, which causes an out-of-memory condition; once again the system then flushes them all back down, rinse and repeat.
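For reference, the discard-level convention works roughly like this; a standalone sketch of the convention described above, where each discard level halves each texture dimension (the constant name here is illustrative).

```cpp
#include <algorithm>

// Sketch of the discard-level convention described above: 0 is full
// resolution, 5 is the smallest level kept, and each step halves each
// texture dimension (OpenGL mip numbering runs the other way).
constexpr int MAX_DISCARD_LEVEL = 5;

int dimensionAtDiscard(int fullDim, int discard)
{
    discard = std::clamp(discard, 0, MAX_DISCARD_LEVEL);
    return std::max(fullDim >> discard, 1);
}
// e.g. a 1024x1024 texture requested at discard 2 decodes at 256x256.
```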

I have been trying a few different things, and I came up with using a memory pool for the deleted objects. The biggest performance killer is newing/deleting memory and re-setting up the objects and connections. Instead of deleting like the code does currently, I move the texture to a new mUUIDDeleteMap object, similar to the mUUIDMap already used by the LLViewerTextureList. I then updated the find methods so that any call which requests a texture searches the normal map first and, if the texture is not found there, checks the delete pool; if it is found there, it is re-added to the main texture pool. This in itself had a tremendous performance uplift for my viewer.
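Roughly, the lookup path now works like this. This is a simplified, standalone sketch with stand-in types; the real code uses LLUUID, LLViewerFetchedTexture, and the existing mUUIDMap/mImageList structures, and mUUIDDeleteMap is the new map this change adds.

```cpp
#include <cstdint>
#include <memory>
#include <unordered_map>

// Stand-ins for the viewer's LLUUID / LLViewerFetchedTexture types.
struct Texture { /* pixel data, discard level, callbacks, ... */ };
using TextureId  = std::uint64_t;
using TextureMap = std::unordered_map<TextureId, std::shared_ptr<Texture>>;

struct TextureList
{
    TextureMap mUUIDMap;        // live textures (as today)
    TextureMap mUUIDDeleteMap;  // "deleted" textures parked for possible reuse

    std::shared_ptr<Texture> findImage(TextureId id)
    {
        // 1. Normal path: the texture is still live.
        if (auto it = mUUIDMap.find(id); it != mUUIDMap.end())
        {
            return it->second;
        }
        // 2. New path: the texture was removed under memory pressure; restore
        //    it instead of re-creating it and re-fetching from cache/network.
        if (auto it = mUUIDDeleteMap.find(id); it != mUUIDDeleteMap.end())
        {
            auto tex = it->second;
            mUUIDDeleteMap.erase(it);
            mUUIDMap.emplace(id, tex);  // move back into the live pool
            return tex;
        }
        return nullptr;  // genuinely unknown texture
    }
};
```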

I also made it so that the increased number of checks only happens while the bias is increasing, not all the time.

I also changed it so that textures now have a mini state machine which tracks what happened to them in their lifetime. This can be further used to refine the behavior for deleted memory, handle edge cases, and tell whether a texture was deleted just from regular use or from a memory overage event. I also added a delay before a texture can be updated again, to try to spread out the load and quell the spikes.

When a texture is downscaled, or deleted and brought back, it is delayed in how quickly it will try to up-res, to help with the surge of requests.
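A minimal sketch of the state tracking and the up-res delay; the state names and the cool-down value below are placeholders, not the exact ones in the branch.

```cpp
#include <cstdint>

// Placeholder states for what happened to a texture over its lifetime.
enum class ETextureMemState : std::uint8_t
{
    ACTIVE,           // normal, resident texture
    DOWNSCALED,       // reduced under memory pressure
    DELETED_LOW_MEM,  // parked in the delete pool due to a memory overage
    RESTORED          // pulled back out of the delete pool
};

struct TextureMemRecord
{
    ETextureMemState mState = ETextureMemState::ACTIVE;
    double mLastChangeTime = 0.0;  // seconds, e.g. from the frame timer

    // Only let a texture ask for a higher resolution again after a cool-down,
    // so restored/downscaled textures do not all up-res in the same frame.
    bool canUpRes(double now, double coolDownSeconds = 5.0) const
    {
        if (mState == ETextureMemState::ACTIVE)
        {
            return true;
        }
        return (now - mLastChangeTime) >= coolDownSeconds;
    }
};
```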

I also did a test of faking the amount of RAM I had, doubling the reported available RAM, and it ran well.

- Created a shadow mUUIDMap called mUUIDDeleteMap which contains a key/image pair for any texture that is deleted. When a texture is deleted, it now moves over to this list and is removed from the normal mUUIDMap and mImageList.
- Currently the callbacks are still there, but they are not being called because the texture is no longer in the main lists.
- Added the reduction of calls as the bias falls, and added a flag to turn the new feature on/off.
- Fixed an issue of not disconnecting the textures on shutdown.
- Added a new handler for when memory runs low.
- Stopped deleting textures; instead, deleted textures are put on a separate map which can later be referenced and restored when the same texture request comes back again.
- Added the sFreeVRAMMegabytes = llmax(target - used, 0.001f); fix that LL implemented (see the sketch after this list).
- Added a state object for the LLViewerTexture memory (Fetched and LOD).
- Fixed up the comments to include the full JIRA issue.
- Added some comments on possible further investigations and tests if needed.
- Updated to have more checks around the new code for enabling/disabling the feature.
- Re-added checks on updateFetch to limit the deleted and scaled textures desired.
- Added commented-out code to automatically delete data in the delete UUIDMap, but I want to test out the current system as-is first.
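The free-VRAM clamp mentioned above, as a standalone sketch; the surrounding function and parameter names are stand-ins, and the viewer uses its own llmax macro rather than std::max.

```cpp
#include <algorithm>

static float sFreeVRAMMegabytes = 0.f;

// LL's fix: never let the reported free VRAM go to zero or negative,
// which would otherwise drive the discard bias straight up.
void updateFreeVRAM(float targetMB, float usedMB)
{
    sFreeVRAMMegabytes = std::max(targetMB - usedMB, 0.001f);
}
```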
@beqjanus
Contributor

beqjanus commented Jan 16, 2025

Thanks, I've pulled the change in to my local repo and will test it in the morning.
As mentioned in discussions, we need to think about how this works, as we've escalated this issue to LL, who will doubtless make their own changes (there is one change already in ForeverFPS). I think that having it logically separated and configurable gives us a good opportunity to test them side by side.

@beqjanus
Contributor

I'm not sure that it is working right for me, but it is hard to know, as I never really saw the issues before (not reliably, at least).
Limiting the texture memory never seems to work for me. I've not explored why, or whether that is just me and some rogue setting.

Legacy algorithm (screenshot: FIRE-35011)

This was at Warehouse 21, busy high texture load.

I TP'd home, and the texture memory never dropped, despite it being very low texture and mesh (my home is predominantly a 2008 build with a handful of my newer things around).
(Screenshot: back home - comparatively low load)

(Screenshot: after a restart)

On the current beta (so existing behaviour), the texture usage does revert back to that stable number for that region after a TP; the new version seems to hold on to things.

(Screenshot: current beta version at busy venue)

(Screenshot: current beta version after returning)

As you'd expect, there is more texture fetch activity with the old method/beta, so there is definite improvement in that respect. The memory profile, though, is concerning (or I am misreading it).

Some change, but not working well currently.
@Ansariel
Collaborator

LL is working in the same area in ForeverFPS right now, so it doesn't seem to make much sense to fiddle with this issue in the current master branch. I might even have to revert the entire change outright when merging master into the ForeverFPS merge repo.

@minerjr
Contributor Author

minerjr commented Jan 16, 2025

Yeah, I am going to pull the test; it seems to be not working quite right anyway. Thanks for the feedback.

@minerjr
Contributor Author

minerjr commented Jan 16, 2025

Closing this pull request as it is not performing how I was expecting and needs more work.

@minerjr minerjr closed this Jan 16, 2025