AngrTracerError: Could not step to the first address of the trace - state split #81

TheBlueMatt · 2019-12-11T03:10:04Z

When testing against a real Rust target (albeit an incredibly simple one: just a tiny message deserialization test, though the same happens on much more complicated targets too), angr raises AngrTracerError immediately upon calling simgr.use_technique(t), after cle loads the binary and does a run (see stack trace below).

WARNING | 2019-12-11 03:05:52,461 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.
WARNING | 2019-12-11 03:05:54,099 | cle.loader | The main binary is a position-independent executable. It is being loaded with a base address of 0x400000.
Traceback (most recent call last):
  File "run_driller.py", line 68, in <module>
    main()
  File "run_driller.py", line 55, in main
    for _, new_input in Driller(binary, seed).drill_generator():
  File "/root/driller/venv/lib/python3.7/site-packages/driller/driller_main.py", line 101, in drill_generator
    for i in self._drill_input():
  File "/root/driller/venv/lib/python3.7/site-packages/driller/driller_main.py", line 131, in _drill_input
    simgr.use_technique(t)
  File "/root/driller/venv/lib/python3.7/site-packages/angr/sim_manager.py", line 188, in use_technique
    tech.setup(self)
  File "/root/driller/venv/lib/python3.7/site-packages/angr/exploration_techniques/tracer.py", line 192, in setup
    raise AngrTracerError("Could not step to the first address of the trace - state split")
angr.errors.AngrTracerError: Could not step to the first address of the trace - state split

rhelmot · 2019-12-11T03:12:18Z

This is generally because the binary or one of its shared libraries uses input which is not concretized before reaching the entry point. Can you post a testcase to reproduce this, including all the dependent shared objects?

TheBlueMatt · 2019-12-11T03:14:26Z

Sure! ldd output is below but give me a sec and I can push something that you can easily cargo build.

root@fuzzer:~/driller# ldd ./rust-lightning/fuzz/target/release/msg_ping_target
	linux-vdso.so.1 (0x00007fff58bef000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fa05dd2d000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa05dd0c000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fa05dcf2000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fa05db31000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa05dd66000)

rhelmot · 2019-12-11T03:15:06Z

I don't have a Rust compiler installed and don't want to. Can you just zip and upload the binaries?

TheBlueMatt · 2019-12-11T03:17:37Z

Sure. The simplest binary is at http://web.bluematt.me/msg_ping_fuzz_target_for_driller

rhelmot · 2019-12-11T03:19:18Z

You need to provide a full testcase, that is, your script and all the inputs, including all the shared libraries, since one of the inputs will be a trace including shared object addresses.

TheBlueMatt · 2019-12-11T03:25:14Z

Simplified input and simplified crash-demonstrating script at bug81.tar.gz

Running against dependencies as installed with

pip install cle angr archinfo
pip install git+https://github.com/angr/tracer.git#egg=tracer
pip install git+https://github.com/shellphish/driller

All the dependencies are Debian buster system dependencies (sysdeps.tar.gz if you don't have them handy).

rhelmot · 2019-12-11T03:46:19Z

The program is attempting to use sigaction (among other things) in the initializers before the entry point. Because we can't correctly identify sections of the trace which correspond to initializers, we have to run through these blind, without knowing how to resolve branches. Our simulated sigaction syscall is just a stub, meaning it provides a symbolic return value, so any attempt to branch on its return value will split the state.

To fix this, we will either need to a) implement the sigaction syscall in our environment model or b) implement trace following for shared library initializers.

TheBlueMatt · 2019-12-11T04:24:14Z

Right, looks like Rust has some insanity to hook signals and try to print crap if you hit the stack guard page: https://github.com/rust-lang/rust/blob/d8bdb3fdcbd88eb16e1a6669236122c41ed2aed3/src/libstd/sys/unix/stack_overflow.rs#L64 I'm happy to take a crack at whichever approach you think is best (obviously, just turning this into a "success"-returning no-op is fine in this case) if you point me in the right direction.

rhelmot · 2019-12-11T05:10:06Z

For the sigaction route: assuming you know how to manipulate basic angr objects, the docs pages you want are:

https://docs.angr.io/extending-angr/simprocedures

https://docs.angr.io/extending-angr/environment

Sigaction will most certainly not be the last piece you run into which causes issues. In order to tell what the problem causing a split is, you should get a postmortem pdb shell at the crash site and examine state.solver.constraints[-1] for one of the split states. It will include a variable whose name is hopefully descriptive enough of where it came from.

For the initializer tracing route: the tracer code is here: https://github.com/angr/angr/blob/master/angr/exploration_techniques/tracer.py

What needs to happen: in setup, the part commented as "step to entry point" needs to be removed and replaced with something more like the part commented as "calc ASLR slide for main binary and find the entry point", applied to each initializer (project.loader.initializers). Then, we need to store a list that maps each initializer to the index in the trace where it starts. Finally, in _update_state_tracking, we need to add a clause that checks whether we just ran the LinuxLoader SimProcedure and, if so, figures out which initializer (or the entry point) we're about to jump into and adjusts the current trace index accordingly.
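The last step above (adjusting the trace index once the loader hands control to an initializer) can be sketched in plain Python, without angr. This is only an illustration of the bookkeeping: `TraceCursor`, `on_loader_return`, `advance` and `start_indices` are hypothetical names, not part of angr's tracer, and the addresses are made up for a non-PIE binary where trace and angr addresses coincide.

```python
class TraceCursor:
    """Tracks which index of the recorded trace the symbolic run is at.

    Hypothetical helper; it mirrors the idea described for
    _update_state_tracking, not its actual implementation.
    """

    def __init__(self, trace, start_indices):
        # trace: basic-block addresses recorded during the concrete run
        # start_indices: {address: trace index} for every initializer and
        # for the entry point, computed once during setup
        self.trace = list(trace)
        self.start_indices = dict(start_indices)
        self.idx = 0

    def on_loader_return(self, next_addr):
        """Call when the loader stub is about to jump to next_addr."""
        try:
            self.idx = self.start_indices[next_addr]
        except KeyError:
            raise ValueError(f"no trace index recorded for {next_addr:#x}") from None
        return self.idx

    def advance(self):
        """Move to the next block of the trace and return its address."""
        if self.idx + 1 >= len(self.trace):
            raise IndexError("ran off the end of the trace")
        self.idx += 1
        return self.trace[self.idx]


# Made-up trace: 0x7f... blocks belong to the dynamic loader,
# 0x4005xx is an initializer, 0x4006xx is the entry point.
trace = [0x7F0000001000, 0x7F0000001040, 0x400500, 0x400520,
         0x7F0000002000, 0x400600, 0x400640]
start_indices = {0x400500: 2, 0x400600: 5}

cursor = TraceCursor(trace, start_indices)
first = cursor.on_loader_return(0x400500)   # jump into the initializer
next_block = cursor.advance()               # follow the trace inside it
second = cursor.on_loader_return(0x400600)  # later: jump to the entry point
```

The loader's own blocks in the trace are skipped entirely, which matches the idea that angr models the loader with a single SimProcedure rather than executing it.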

TheBlueMatt · 2019-12-11T22:02:20Z

Well, I went the "easy" route and instead exported the Rust code in a static library and called it from a C wrapper, which got past it. Sorry for the lack of useful contribution. Am now getting what appears to be #80.

clampz · 2020-07-10T23:31:04Z

For the sigaction route: assuming you know how to manipulate basic angr objects the docs pages you want are:

https://docs.angr.io/extending-angr/simprocedures

https://docs.angr.io/extending-angr/environment

Hey @rhelmot, is there more you can say about the loop that should replace the part commented as "step to entry point"? Sorry! I'm trying to figure out how I can use driller for my project and ran into this same error. I would like to write the fix you're talking about and see if it works for me. I'm just a bit confused about what we're looking for with this loop: is it to create the list you mentioned, or something else? Any more info or thoughts would likely help me. Thanks!

rhelmot · 2020-07-10T23:41:58Z

Yes, the goal is to create the list - a list which indicates that for the nth initializer, its presence in the trace starts at the specified index.

So for example the trace looks like this:

---------------------------------------------------------------------
   ^initializer 1         ^initializer 2                     ^entry point

What we want to find out is which trace index corresponds to each of those points, so we can keep track of where in the trace we are while angr executes its simplified model of running initializers (the LinuxLoader SimProcedure).

We already use heuristics to determine the entry point trace index, we just need to do the same thing to figure out where the initializers are, too. The result of that computation will be a list of trace indices, one corresponding to each initializer. Then, we need to store that list and do the thing I described earlier (only 7 months ago? yikes) where we use it to update the current index while executing.
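A minimal sketch of that computation, in plain Python rather than inside the tracer: for each target (the initializers in execution order, then the entry point), scan forward for the first trace block that has the same offset within its page and is reached by a long jump. The names `looks_like_arrival` and `find_start_indices`, the `FAR_JUMP` threshold, and all addresses are assumptions for illustration, not angr's code.

```python
PAGE_MASK = 0xFFF
FAR_JUMP = 0x100000  # assumed "different mapped image" distance


def looks_like_arrival(trace, idx, static_addr):
    """Heuristic: does trace[idx] look like the first block at static_addr?"""
    addr = trace[idx]
    # Images are mapped at page-aligned bases, so the low 12 bits survive ASLR.
    same_page_offset = ((addr - static_addr) & PAGE_MASK) == 0
    # Control arrives from another image (the loader), so the jump is long.
    far_from_prev = idx == 0 or abs(trace[idx - 1] - addr) > FAR_JUMP
    return same_page_offset and far_from_prev


def find_start_indices(trace, static_targets):
    """Return one trace index per target, searching in execution order."""
    indices = []
    start = 0
    for target in static_targets:
        for idx in range(start, len(trace)):
            if looks_like_arrival(trace, idx, target):
                indices.append(idx)
                start = idx + 1
                break
        else:
            raise ValueError(f"target {target:#x} not found in trace")
    return indices


# Made-up example: the analysis side sees the binary at 0x400000, the
# traced process had it at 0x555555554000, the loader lives at 0x7ffff7fd....
static_targets = [0x401130, 0x401250, 0x401060]  # init 1, init 2, entry point
trace = [
    0x7FFFF7FD0100, 0x7FFFF7FD0140,  # loader
    0x555555555130, 0x555555555150,  # initializer 1
    0x7FFFF7FD0200,                  # back in the loader
    0x555555555250,                  # initializer 2
    0x7FFFF7FD0300,                  # loader again
    0x555555555060, 0x555555555080,  # entry point
]

start_indices = find_start_indices(trace, static_targets)
```

Being a heuristic, this can produce false matches (for instance a loader block that happens to share the page offset), so a real implementation would likely need extra checks.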

clampz · 2020-07-14T00:07:09Z

Thanks for the guidance @rhelmot! I've gotten started, and I think I get what you mean. However, I'm not finding any of the project.loader.initializers addresses in _trace. When I look through the symbols in my binary for anything labeled init, I can find those addresses in _trace, but they aren't in the list of initializers mentioned previously. Not sure what I'm missing here; it doesn't seem right to me. Maybe I'm over-simplifying this programming problem, or using a poor example binary. I was just using the binary I am currently fuzzing, which is a wrapper program around some poppler API calls.

I've attached my files so you can see what I'm working on. Lemme know if I missed something or if you need more information to see what's going on for me. Thanks for all your help

driller-testcase-clampz.pt1.tar.gz
driller-testcase-clampz.pt2.tar.gz
driller-testcase-clampz.pt3.tar.gz
driller-testcase-clampz.pt4.tar.gz
driller-testcase-clampz.pt5.tar.gz

[edit] - my archive was too big, so I split it into pieces; you should be able to just cat them together
[edit2] - for context, my binary was compiled with -no-pie -g
[edit3] - I guess my question is: addresses from the qemu trace and from cle are not matching up, which seems like it may be expected, but how might you go about resolving this so we can match up our initializer addresses and trace addresses? My thought is that I need a memory map for the process loaded in qemu, and then I need to rebase all the addresses in the list of initializers based on that memory map. I tried using a qemu option to set the base address to the same one used in cle, but it remained the same.

rhelmot · 2020-07-14T04:55:05Z

The point of the algorithm I referenced, which identifies the entry point (the part commented as "calc ASLR slide for main binary and find the entry point"), is that it works regardless of the base address used by qemu. It assumes that the page alignment must be the same, and that the block just before the entry point in the trace will be very far away, since control arrives by a jump from a different mapped image.
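Once one such match is known, no memory map is needed to rebase: the difference between the matched trace address and the address on the analysis side is the slide for that image. A small sketch, with hypothetical helper names and made-up addresses:

```python
PAGE_SIZE = 0x1000


def aslr_slide(trace_addr, static_addr):
    """Slide between the traced process and the analysis-side load address."""
    slide = trace_addr - static_addr
    # Both bases are page-aligned, so a genuine match differs by whole pages.
    if slide % PAGE_SIZE != 0:
        raise ValueError("addresses do not share a page offset")
    return slide


def to_trace_addr(static_addr, slide):
    """Where an analysis-side address should show up in the trace."""
    return static_addr + slide


def to_static_addr(trace_addr, slide):
    """Translate a trace address back into the analysis-side address space."""
    return trace_addr - slide


# Made-up numbers: entry point at 0x401060 on the analysis side,
# observed at 0x555555555060 in the trace.
slide = aslr_slide(0x555555555060, 0x401060)
init_in_trace = to_trace_addr(0x401130, slide)
back = to_static_addr(init_in_trace, slide)
```

Note that a slide computed this way is only valid for addresses in the same mapped image; each shared object is mapped at its own base and would need its own match.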

TheBlueMatt mentioned this issue Dec 11, 2019

More Fully Test On-Chain Failure lightningdevkit/rust-lightning#385

Open

rhelmot added the help wanted label Dec 11, 2019
