-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Engine: refactoring BitmapToVideoMem #2190
Engine: refactoring BitmapToVideoMem #2190
Conversation
I think it's best to start with turning these function into template, without the commit that splits into 3 functions, because otherwise this pr first does one change, and then completely different change overriding first. EDIT: commit called "adjust videomemimpl pointer conventions" is wrong, because it edits the code to opposite of our convention (our code style convention is to have Making "linear filtering" a template arg seems like a good idea. EDIT: there's also another function nearby called BitmapToVideoMemOpaque (i think), its code may be turned into template too probably. |
Thanks, I will adjust and also do templates on the opaque variant too. Did you get some speed up with in the test game too? I was more looking into speed up things but it's nice we get shorter code. The |
fb6ae63
to
7afc34a
Compare
Adjusted the PR commits! (sorry for the few push-forces, but I think it's done now!)
I also included that too this time! |
The "benchmark" test shows an increase of only about 1 fps for me, from 12.6 to 13.5 (it's rather stable), running with provided config (d3d9, linear filter). This seem practically negligible improvement, much less then the stats posted in this ticket (22 to 26 fps). This might depend on CPU and other factors maybe.
Majority of games made before 2010 were 16-bit, they may be found in AGS game database.
I have vague memories of seeing the problems caused by this, but I cannot remember where, and running random game without this variable does not show any immediate problems. Some investigation may be necessary. Maybe it's not just this variable, but whole In regards to optimizing this, if your aim is to avoid
This means that if factor is set to 0, then the [x-1] pixel keeps full value from itself, and 0 value from [x], and if factor is set to 0xFFFFFFFF, then [x-1] gets 0 from itself and full value from [x]. Unfortunately, it seems that this cannot be applied directly in this code, as you must to test that x > 0 (otherwise this may cause array underflow). Overall, checking for surface bounds result in most conditions there (for the linear filter case), so if that may be restructured somehow maybe it will result in higher speed. But idk if it's worth it. |
The black border issue can be evidenced by making a game in some low resolution (say 320x180) and then having it set to use linear filtering, and run it at a bigger resolution. Then one needs to import assets that are 32-bit, with parts that are fully transparent and parts that aren't fully transparent with some color different than black. Say a color like 0xF000F0FF (pink) now have the fully transparent pixels be 0x00000000. What happens is the linear interpolation from Directx/OpenGL will make the interpolation between all parts (R,G,B,A), meaning R and B in this case will approximate to 0, as the alpha also approaches, resulting in a black border. I will make a project that showcases this issue. (Ah, this could be fixed in the Editor at import time too Edit: actually it can’t because we use magic pink color for fully transparent pixels, I don't remember if this is replaced at runtime or editor import.) I have a lot of difficulty comprehending these RGBA maths (or ARGB, I don't remember what video uses), but from what I read online premultiplied alpha offers some fix to this somehow, but I couldn't understand. I found it weird you did not get as much improvement as I with what is there already. :/ I don't plan to add anything to this PR right now. |
I tested 16-bit and 8-bit games, and things seem to work as before (in terms of visuals). So I will merge this, as this may have a benefit of having a single universal algorithm. |
Hi, tried a bit of moving code around in BitmapToVideoMem. In my tests, this benchmark below that clears fullscreen bitmap and pushes to memory using an Overlay went from 22.6 FPS using what we currently have on master to 26.9 FPS after the modifications from this PR. It's nothing ground breaking, but still somewhat an improvement. Please verify this improvement.
Test Game: ClearBenchMark.zip
Project: ClearBenchmarkProject.zip
I am not sure about the many commits and how to organize them or if I should merge them in a single one, and also if the code that I got to in the end is acceptable.
Ah, I don't know any 16-bit game, so I didn't test the 16-bit path, it would be nice to get more testing from this branch.
I also noticed that if original source bitmap had a 1 pixel transparent margin, the linear interpolation would not require all checks it has.