-
Notifications
You must be signed in to change notification settings - Fork 257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
word-wise dma copy #175
base: master
Are you sure you want to change the base?
word-wise dma copy #175
Conversation
src/pi/pi_controller.c
Outdated
} | ||
if (((dram_address | rom_address | longueur) & 3) != 0) | ||
{ | ||
for(i = 0; i < longueur; ++i) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I suggest to add a tab in from of the for loop to keep code readable?
From the manual:
So, to keep code clean, I suggest to remove old code and assume DMA are always 16 bits aligned. What do you think? |
Depends on what the console actually does. Manuals are known to be "best practice for stability/speed" instead of "the hardware requires this". I would keep the former behavior until someone pops in with a hardware test. |
Thinking about it, you're surely right, if it's technically possible we can't assume it will not happen... |
I'm a little hesitant to merge this because it increases the code complexity a fair amount, for a performance gain on the raspberry pi / arm architectures. Do you have any quantitative data on the performance improvement from using this aligned memcpy in real world scenarios? If it's just moving a few hundred or thousand bytes per VI then it's probably not worth it. But if these loops are run for millions of iterations per VI then it would be. |
The performance increase is 1 to 2 % faster (timing to frame 501), depending on ROM on a Pi B. It would be interesting to know if any ROMS require byte-wise copy as if none do then the word-wise only implementation would be simpler than the current byte-wise copy. |
There is merit to this. RSP DMA transfers must be 8-byte aligned. The N64 ignores the lower 3 bits of the source and destination address. It also ignores the lower 3 bits of the length (or rather, it assumes the lower 3 bits are all 1's, then it adds 1). As long as we properly mask the source and destination addresses, the transfers should always be 8-byte aligned. @ricrpi I don't know if you're still around. I'm far less confident in the alignment of the PI DMA's, but I am very confident in the 8-byte alignment of the RSP DMA's. If you update this PR to just to 8-byte aligned RSP DMA's (with no fallback) I'd be willing to do a fair bit of testing on it |
I am trying to keep up with all the commits/issues but have very limited time to do any coding at the moment :-( I will try to take a look at this though as it shouldn't take too long to patch in. |
If DMA transfers are word aligned then a more efficient copy can be done. I have not seen a dma transfer that is not word aligned however I have not extensively checked all roms so have left the safer byte-wise implementation in.