Match results show 100% equal for functions with differences #313
Comments
Digging into this a bit more I got the "best" matches to run by changing In the diaphora.py file I found the So it seems like as long as two functions have the same name, address, size, and control flow they get reported as 100% equal even though the instructions in the functions could have changed? Is this intended behavior or a bug?
This is intended behaviour. But according to the very detailed report you made, it might be wrong. I'm going to add the patch you did (adding
I have attached the sample files and the IDC scripts used to reproduce the IDA databases I had set up. You can load each .bin file as "PowerPC big-endian", use the default memory layout settings, analyze as 32-bit if asked, and use the default settings for IO ports, etc. Please let me know if you have any other questions or issues loading the samples.
Bug fixed locally, waiting for all the tests to pass. Thanks a lot!
ML: Dropped support for training local models. They were not working properly at all.
BUG: HEUR: Added field 'bytes_hash' to the '100% equal' heuristic, as it was ignoring some minimal changes (issue #313).
BUG: HEUR: Always check if there are differences even for structurally 100% equal databases (issue #313).
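To illustrate why adding a byte hash to the '100% equal' check matters, here is a minimal sketch. It is not Diaphora's actual code: `FunctionRecord`, `structural_key`, `strict_key`, the `cfg_signature` string, and the sample bytes are all hypothetical stand-ins. It only shows that two functions can agree on name, address, size, and control flow while their raw bytes differ.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionRecord:
    """Hypothetical per-function export record (not Diaphora's schema)."""
    name: str
    address: int
    size: int
    cfg_signature: str  # stand-in for control-flow properties
    raw_bytes: bytes    # the function's machine code


def structural_key(f: FunctionRecord) -> tuple:
    # Fields that stay the same when one instruction is swapped for
    # another of the same length.
    return (f.name, f.address, f.size, f.cfg_signature)


def strict_key(f: FunctionRecord) -> tuple:
    # Same fields plus an MD5 of the raw bytes, so any byte change
    # produces a different key.
    return structural_key(f) + (hashlib.md5(f.raw_bytes).hexdigest(),)


# Placeholder bytes, not real PowerPC encodings: the two versions differ
# only in the last byte.
old = FunctionRecord("sub_1000", 0x1000, 8, "nodes=1;edges=0",
                     bytes.fromhex("0011223344556677"))
new = FunctionRecord("sub_1000", 0x1000, 8, "nodes=1;edges=0",
                     bytes.fromhex("00112233445566ff"))

structural_match = structural_key(old) == structural_key(new)
strict_match = strict_key(old) == strict_key(new)

print("structural match:", structural_match)
print("strict match:", strict_match)
```

With only the structural fields, the two records compare equal. Once the byte hash is part of the key, they no longer do, which is the behaviour requested in this issue.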
I'm trying to use Diaphora to diff different versions of the same binary and detect which functions differ, down to single-instruction changes. When diffing two versions of this binary that differ by only one instruction, the affected function is reported as "100% equal" with a ratio of 1.0, even though it contains that instruction difference.
If I diff the assembly for the functions I can see the single instruction change:
I understand the "lwz" line is a false positive because I changed the immediate display type in one of the databases before exporting, but I would still expect the slwi/sldi instruction change to be detected. Are there settings I can change to make the comparisons stricter? I thought some of the heuristics used the MD5 hash of the function data, which I would expect to change between these two functions.
For additional confirmation I diff'd the two binaries in a hex editor and can clearly see the 4 byte change for the different instructions:
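The same check can be scripted instead of done in a hex editor. The sketch below is only an illustration: `diff_ranges` is a hypothetical helper and the two byte strings are placeholders, not the attached samples. It reports the half-open offset ranges where two images differ.

```python
def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ."""
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a differing run begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        # Bytes present in only one of the inputs count as a difference.
        ranges.append((common, max(len(a), len(b))))
    return ranges


# Placeholder images: identical except for 4 bytes at offset 8.
old_image = bytes(16)
new_image = bytes(8) + b"\x01\x02\x03\x04" + bytes(4)

changed = diff_ranges(old_image, new_image)
for start, end in changed:
    print(f"bytes differ at 0x{start:x}..0x{end:x} ({end - start} bytes)")
```

For real files, the inputs would come from something like `open(path, "rb").read()` on each .bin file.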