-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch function traversal from "breadth first" to "depth first" #14
Comments
Related to function boundary detection, some inspiration may be sought from rev.ng Talk at https://youtu.be/5CbuU4KwBCE From slides at http://llvm.org/devmtg/2016-11/Slides/DiFederico-rev.ng.pdf The function detection process
Skipping jump instruction in this case refers to a jump instruction (think tail call), which jumps over the basic blocks of another function. |
@mewmew , Thanks, I noticed rev.ng, though was surprised (not really) that instead of being "suite of tools for static binary analysis", it's currently just a binary translator. Didn't see those slides, looked thru. Well... I saw a similar paper recently (https://syssec.mistakenot.net/papers/eurosp-2017.pdf), that one caught my attention with a title "Compiler-Agnostic Function Detection in Binaries". "Compiler-agnostic, really? How else could it be?" So well, both that paper and rev.ng's try to deal with detection of all functions in a binary. Well, good (luck). That's something completely different than what ScratchABit does - it's purposed is to be a dumb tool, and detect only what guaranteedly can be detected. Beyond that, everything is offloaded to a human (which can do further analysis manually, or using heuristic plugins). The topic of this ticket is how to do that automatic detection in a better way. Depth-first was implemented btw, I don't close this ticket because I'd like to review if there're some TODOs still left. On related note, I recently did switch basic blocks in ScratchABlocks to end on a call (pfalcon/ScratchABlock@1446a83), following the idea that in machine code, a call is much closer to a jump, than to a mathematical definition of function. That's of course how even Cifuentes suggested to do it, I just still find that other people like the idea that call == function too, e.g. http://x64dbg.com/blog/2016/07/27/Control-flow-graph.html |
pfalcon/ida-xtensa2#1 represents an issue with the current code traversal process of SAB: it tries to follow all code paths of the current function, before considering calls the current function makes. Unfortunately, some calls are effectively jumps and never return. Trying to decode bytes after such a call may lead to conflicts in byte types, which may affect decoding of other functions following such instructions. So, instead, should try to emulate how a real CPU does - do a depth-first traversal of as many functions as possible first, and consider return addresses from calls only after that. This should allow to detect natural stop-gaps for unconstrained and incorrect code propagation. E.g. if following a call instruction there's a literal data belonging to another function, then it doesn't make sense to try to interpret this data as an instruction.
This approach will however have problems with function boundary detection (such detection was the reason to adopt breadth-first scheme). So, it will need to be reworked.
The text was updated successfully, but these errors were encountered: