Consider splitting the disassembly into multiple files in a broader and more systematic way #53
Comments
This is a very good point and a constant debate whenever disassemblies are put together. The easiest way I can explain it is: our disassemblies by and large are meant to be bit-perfect, and achieving that while splitting files often means the files are split in a really dumb way, because random routines will be inside random objects. This disassembly handles this fairly poorly, though it could be far worse. A single file may be split in two because a common library routine is in the middle. We would either have to figure out a workaround for this that would leave the disassembly bit-perfect, or ditch bit-perfectness and possibly introduce more bugs into the games. Furthermore, this would also lead to people complaining that it's hard to find anything (as people often do with this disassembly), and no amount of structuring disassemblies well will make people fully happy. The real answer is: nobody can agree exactly on what a disassembly should be like, and we're still debating things instead of trying to make better ones, whether alone or in smaller groups that agree. We would need people who are interested and mostly agree on what exactly to do. This has been proposed several times by many community figures, but so far things have fallen through. |
The S3K disassembly definitely needs more splitting, though. 90% of the game code is in the main file, making locating specific things a huge pain. |
It's worth noting that the Sonic 2 disassembly could be a good reference for seeing how the source code was actually split up. From that debug mode code leak, other things found inside the Nick Arcade proto, and what's known about REV02, we know that a lot of those JmpTos were generated by the assembler. They were appended at the end of a file's code, and as such, those JmpTos can be used to identify where an original source file ends. With that, educated guesses can be applied to both Sonic 1 and 3. I know that this was discussed in s2disasm, but it's worth mentioning here. The Sonic object, for instance, can be assumed to be one file, starting from the top of the object code and running to where the next object's code starts. The collision functions were actually in a separate file holding general "floor collision" routines (I think it was called FCOL.ASM or something like that). In my opinion, some of the splitting choices in the current version are a bit ridiculous. I don't think single functions need their own file, nor do I think the Sonic object needs like 10. |
And there are the ELF files from the Gems Collection version of SCD, at least for the main engine itself. |
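To make the JmpTo idea concrete, here is a minimal Python sketch of the heuristic described above: walk the labels in address order and treat each run of consecutive JmpTo stubs as the tail of one original source file. The label list and the `JmpTo<N>_<target>` naming pattern are assumptions made for illustration; a real pass would read the labels from the disassembly itself, and the result is only a guess at the original layout.

```python
import re

# Assumed naming convention for the assembler-generated jump stubs.
JMPTO = re.compile(r"^JmpTo\d*_")

def split_at_jmpto_blocks(labels):
    """Group address-ordered labels into guessed original source files.

    A run of consecutive JmpTo stubs is treated as the end of the current
    file; the first non-stub label after such a run starts a new file.
    """
    files, current, in_stubs = [], [], False
    for label in labels:
        is_stub = bool(JMPTO.match(label))
        if in_stubs and not is_stub:
            # The stub run just ended, so close the current file.
            files.append(current)
            current = []
        current.append(label)
        in_stubs = is_stub
    if current:
        files.append(current)
    return files

# Hypothetical labels, in the order they would appear in the ROM.
labels = [
    "Obj01", "Obj01_Control", "JmpTo_KillCharacter",
    "Obj02", "Obj02_Main", "JmpTo2_KillCharacter", "JmpTo_DisplaySprite",
    "FindFloor",
]
guessed_files = split_at_jmpto_blocks(labels)
for index, group in enumerate(guessed_files):
    print(index, group)
```

Code that has no JmpTo stubs at its end simply gets merged with whatever follows it, so the output is a lower bound on the number of files rather than a definitive split.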
Could actual code be assigned to whatever filenames were left in those ELF files? Otherwise, no, not really, you just get the symbol data, and I'm not quite sure if that's really within the scope for the disassembly (besides historical value). |
I once decompiled a Linux game that had leftover debug data in its ELF file which did assign symbols to filenames. Unfortunately, I don't recall which 'objdump' command I used to extract it, and I don't know if SCD's ELFs contain that data as well. |
I'm fairly sure that if paths are present, debug data is present too |
I know that; I mean that the debug data might not contain symbol-path associations. The ELFs of Sonic CD and the ELF of the game I decompiled were made almost ten years apart. |
ah |
"-l" displays filenames and line numbers. I ran it on R11A.ELF and it indeed has the information. For example:
|
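As a rough sketch of how that information could be put to use, the Python below groups symbols by the source file that `objdump -d -l` reports for them. The sample listing is invented for illustration and is not output from R11A.ELF; the symbol and path names in it are hypothetical. It assumes the common layout where a `<symbol>:` header is followed by a `path:line` line, and a real run would need an objdump build that understands the ELF's target architecture.

```python
import re
from collections import defaultdict

# "00000010 <name>:" header that objdump prints before each symbol.
SYMBOL = re.compile(r"^[0-9a-fA-F]+ <([^>]+)>:$")
# "path:line" annotation added by -l (optionally with a discriminator).
SOURCE = re.compile(r"^(\S.*):(\d+)(?: \(discriminator \d+\))?$")

def symbols_by_file(listing):
    """Map each source path to the symbols whose first line lies in it."""
    result = defaultdict(list)
    pending = None  # symbol still waiting for its first path:line
    for line in listing.splitlines():
        line = line.rstrip()
        match = SYMBOL.match(line)
        if match:
            pending = match.group(1)
            continue
        match = SOURCE.match(line)
        if match and pending is not None:
            result[match.group(1)].append(pending)
            pending = None
    return dict(result)

# Hypothetical listing in the assumed layout, not real game data.
sample = """\
00000000 <player_init>:
player_init():
/src/player.c:12
   0: 00 00 00 00   nop
/src/player.c:13
   4: 00 00 00 00   nop
00000010 <floor_check>:
floor_check():
/src/fcol.c:40
  10: 00 00 00 00   nop
"""
grouped = symbols_by_file(sample)
for path, names in grouped.items():
    print(path, names)
```

Only the first `path:line` after each symbol header is used, so inlined code from other files does not reassign a symbol.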
Heck yeah! |
Epic. At this point we can just completely decompile the GEMS version of Sonic CD down to the line number level. |
Not exactly to the line number, but it gives a generally good guide to how the code was set up. A Sonic CD C decomp would be interesting, but that's for a different place. Regardless, it can help out with figuring out how the original source files from Sonic CD, and to an extent, Sonic 1, were set up. |
I would like to add something regarding ROM sections. I've been taking a look at Sonic Jam, and I realized that each game has been split up into different files. In Sonic 1's case, there's "AC.SN1", "ACTTBL.SN1" (also "ACTTBL_E.SN1" and "ACTTBL_N.SN1", because Jam has different difficulty settings), "DATA.SN1", and "TBL.SN1". "AC" is the game code. "DATA" holds compressed graphics, tilemaps, and stage blocks/chunks. "TBL" holds collision data, uncompressed graphics, and stage layouts (including special stages), and "ACTTBL" holds object layouts. I then took a look at the disassembly, and I noticed that the padding between sections corresponds to how the game was split up in Jam. "DATA" starts off with the Sega Logo graphics, and in the original ROM, you can see the padding placed before said graphics.
The last piece of data in the "DATA" file is the graphics for the logo in the ending, and look at that, another piece of padding right after it in the original ROM:
After this bit of padding is the stage collision data, which so happens to be the "TBL" section
The last bit of data in the "TBL" file is the graphics for the special stage ring, and in the original ROM, you can see another bit of padding placed after it:
And after that are the stage object layouts, aka "ACTTBL":
The "ACTTBL" file ends with the last object layout, and, of course, in the original ROM, after that is some more padding:
That padding is then followed by the sound driver. So, based on this info, you can see that the original game had multiple ROM sections: the code ("AC"); compressed graphics, tilemaps, and stage blocks and chunks ("DATA"); stage collision, layouts (both regular and special stages), and uncompressed graphics ("TBL"); stage object layouts ("ACTTBL"); and then finally the sound driver and data. |
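For anyone who wants to look for these section boundaries themselves, here is a small Python sketch that lists long runs of a padding byte in a ROM image. The padding value (0xFF here) and the minimum run length are assumptions, not something established in this thread, so adjust both to whatever the actual ROM uses. The `rom` bytes below are synthetic stand-in data.

```python
def find_padding_runs(data, pad_byte=0xFF, min_len=16):
    """Return (offset, length) for each run of pad_byte at least min_len long."""
    runs = []
    start = None  # offset where the current run began, if any
    for offset, value in enumerate(data):
        if value == pad_byte:
            if start is None:
                start = offset
        elif start is not None:
            if offset - start >= min_len:
                runs.append((start, offset - start))
            start = None
    # Handle a run that reaches the end of the data.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs

# Synthetic example: 16 bytes of "code", 32 bytes of padding, a little data,
# a short 4-byte run that should be ignored, then 20 bytes of trailing padding.
rom = (
    b"\x4e\x75" * 8
    + b"\xff" * 32
    + b"\x01\x02\x03"
    + b"\xff" * 4
    + b"\x10" * 5
    + b"\xff" * 20
)
runs = find_padding_runs(rom)
for offset, length in runs:
    print(hex(offset), length)
```

On a real ROM you would read the file with `open(path, "rb").read()` and compare the reported offsets against where the "DATA", "TBL", and "ACTTBL" contents begin. Note that graphics data can legitimately contain long runs of a single byte, so each hit still needs checking by hand.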
I'll grant that when Nemesis introduced the concept of the split disassembly it was a big step forward, but between seeing things like the work pret has done and knowing from the Nick Arcade prototype that Sonic Team's code made liberal use of cross-references and symbol exports, I'm forced to wonder why Sonic ROM hacking persists in working with a single giant text file. Look at pokecrystal, for example - the main file main.asm is only 20.3 kilobytes and rarely needs to be touched because all the code is separated out into other files for ease of reference and cross-reference. By contrast, sonic.asm is 237 kilobytes, and it gets worse the more featureful the games get - s2.asm is 2.54 megabytes, sonic3k.asm is 4.25 megabytes.