crash file minimizing #305

Closed
Byter09 opened this issue Mar 2, 2020 · 9 comments

@Byter09

Byter09 commented Mar 2, 2020

I read through #195 and was very happy when https://github.com/rust-fuzz/honggfuzz-rs was updated to allow me to minimize the input files. However, something very essential to a lot of people (including me) is to also minimize the crash files. The easiest method would be the following:

  1. if a new crash file is generated, check whether one already exists for that crash
  2. if the new input is smaller than the contents of that file, replace the contents
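
A minimal sketch of the two steps above, written in Python rather than honggfuzz's own C, and assuming each distinct crash maps to one fixed file path. The name `save_if_smaller` and the path layout are hypothetical and not part of honggfuzz:

```python
import os
import tempfile


def save_if_smaller(crash_path: str, data: bytes) -> bool:
    """Write data to crash_path unless an equal or smaller file already exists.

    Returns True if the file was written. Hypothetical helper, not honggfuzz code.
    """
    # Step 1: check whether a crash file already exists for this crash.
    if os.path.exists(crash_path):
        # Step 2: only replace it when the new input is strictly smaller.
        if len(data) >= os.path.getsize(crash_path):
            return False
    # Write to a temporary file first, then rename it into place, so a
    # reader never sees a half-written crash file.
    directory = os.path.dirname(crash_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, crash_path)
    return True
```

The extra cost per crash is one size lookup, which is why the overhead should stay small compared with saving every crashing input.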

This would also resolve rust-fuzz/honggfuzz-rs#26.

I hope my view is not too naive and that this can be realized without much overhead.

Thanks for taking the time to read this. :)

@evanrichter

An easy way I've found to do this is to copy the crash files over to your input folder and then run the minimization, or just specify your crashes folder as the input folder when minimizing.

@Byter09
Author

Byter09 commented Mar 2, 2020

Yes, that's what I was doing too, but it's really not how continuous minimization should be done in CI.

@evanrichter

Just wondering, why would you want continuous minimization? I see it as a tool to pare down the corpus when triaging, but not necessarily useful while fuzzing.

@Byter09
Author

Byter09 commented Mar 3, 2020

The reason for continuous minimization is pretty simple: I don't want to minimize inputs myself and would rather work with less noise. Of course I could just take the crash files, put them into gdb and see what's crashing. But that's not the point.

The thing is that I have honggfuzz running continuously on a server for several projects to build up a huge corpus, and corpus minimization was already a big(!) help there. But when the example it finds is 800 bytes instead of 12, I would rather see the problem immediately, or write a unit test, than minimize it myself, which can take a lot of time. You can imagine how bad this becomes once you work with payloads whose minimum size is larger.

Also, I'm mostly interested in a feature like this:
https://altsysrq.github.io/proptest-book/proptest/tutorial/shrinking-basics.html

Basically, as soon as it finds something, it'll shrink it down to something that can easily be tested for in unit tests.

So to summarize, I'm fine with honggfuzz being what it is today, but the two steps I proposed are so simple that one has to ask: why is this not already implemented? AFL and libFuzzer both support this (well, at least the Rust crates that use these tools do; I'm not sure about the tools themselves).

If I have the wrong impression of what honggfuzz can/should do and all my points above are invalid, then my last argument is this: I'm lazy af.

@evanrichter

I see your point! But personally I think it would best be implemented in your own workflow scripting, perhaps with a cronjob or with something like watchexec that watches the crashes folder.

The way I see it, honggfuzz is primarily a fuzzer. It should focus on finding as many unique crashes as it can. That said, it does have some helper utilities, like the minimization pass. How we use that utility, though, is up to us.

It's my belief (and I may be wrong here!) that minimizing crashes before feeding them back into the seed pool would likely be detrimental to finding more unique crashes. Therefore, if I wanted to implement your workflow, I would have a script that:

  1. runs every so often (or is triggered by the wonderful watchexec tool)
  2. copies new crashes over to the minimized folder
  3. minimizes those crashes in that folder

If I understand your desired workflow, that should satisfy the requirements without needing to add features to honggfuzz tailored to a specific use case. That said, the idea may be welcome to the maintainers, and I'd be okay with that too.
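
A rough Python sketch of steps 2 and 3 of that script. Step 1 (running it periodically) is left to cron or watchexec. The function name `sync_and_minimize` is hypothetical, and `minimize_cmd` is a placeholder for whatever command minimizes a directory in your setup; no particular honggfuzz invocation is assumed here:

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Sequence


def sync_and_minimize(crashes: Path, minimized: Path,
                      minimize_cmd: Sequence[str]) -> List[str]:
    """Copy crash files not yet present in `minimized`, then run the minimizer.

    `minimize_cmd` is an assumption of this sketch: the caller supplies the
    command line. Returns the names of the newly copied files.
    """
    minimized.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(crashes.iterdir()):
        dst = minimized / src.name
        # Step 2: copy only crashes that are not in the minimized folder yet.
        if src.is_file() and not dst.exists():
            shutil.copy2(src, dst)
            copied.append(src.name)
    # Step 3: run the minimizer, but only when something new arrived.
    if copied:
        subprocess.run(list(minimize_cmd), check=True)
    return copied
```

One caveat: if the minimizer deletes or renames files in the minimized folder, the `dst.exists()` check would copy them again on the next run, so a real script would need to track processed file names separately.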

@Byter09
Author

Byter09 commented Mar 3, 2020

The problem with this is that you'd also have to use --save_all to save every crash.

Imagine this:
Honggfuzz finds a new input, blargh, which raises the coverage, so it gets saved in the corpus dir. This input also causes a crash, which results in a new crash file.
Now imagine a new input is generated: bla. It does not raise coverage but causes the same crash, so it basically gets thrown away.

To still save it, you'd have to apply --save_all. But as you can imagine, that also saves bigger inputs that cause the same crash. So it's only a partial solution to the problem, with a much bigger IO overhead overall.

So to allow finding smaller inputs of the same crash you'd have two solutions:

  1. a --save_all equivalent that only saves inputs smaller than the ones already found (this would then be coupled with the manual "move all to corpus and run minimizer" task), or
  2. apply the initial idea of this issue and simply rewrite the crash file.

Number 2 is way less "weird" and can run in the CI with a final folder of automatically minimized crash files without having to run the corpus minimizer.

(What I'm trying to say is that I don't mind the "hack" of moving crashes to the corpus and running the minimizer. That's not my problem. The problem is that even if I wanted to do that, I'd have to enable --save_all, which will produce thousands of files: some potentially smaller, yes, but most of them probably equal in size or bigger.)
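
To make the difference concrete, here is a small in-memory Python simulation of the three behaviours discussed in this comment. It is only an illustration: the crash signature string stands in for whatever honggfuzz uses to decide that two crashes are the same, and the function names are hypothetical:

```python
from typing import Dict, List, Tuple

# Each finding is (crash_signature, input_bytes).
Finding = Tuple[str, bytes]


def keep_first(findings: List[Finding]) -> Dict[str, bytes]:
    """Behaviour as described above: later inputs for a known crash are dropped."""
    kept: Dict[str, bytes] = {}
    for sig, data in findings:
        kept.setdefault(sig, data)
    return kept


def keep_all(findings: List[Finding]) -> List[Finding]:
    """--save_all-like behaviour: every crashing input is stored."""
    return list(findings)


def keep_smallest(findings: List[Finding]) -> Dict[str, bytes]:
    """Proposed behaviour: one file per crash, overwritten only by smaller inputs."""
    kept: Dict[str, bytes] = {}
    for sig, data in findings:
        if sig not in kept or len(data) < len(kept[sig]):
            kept[sig] = data
    return kept


findings = [
    ("crash-A", b"blargh"),
    ("crash-A", b"bla"),
    ("crash-A", b"blarghblargh"),
]
```

With these three findings, `keep_first` retains only `blargh`, `keep_all` stores all three inputs, and `keep_smallest` ends up with the single input `bla`.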

douglasbagnall added a commit to catalyst/honggfuzz that referenced this issue Feb 5, 2021
While --save_all can be used to save all crashes, often people only
really want to find the smallest possible crash file.

This patch adds a --save_smaller option that overwrites the crash file
when it would shrink the file size.

See google#305
douglasbagnall added a commit to catalyst/honggfuzz that referenced this issue Feb 9, 2021
While --save_all can be used to save all crashes, often people only
really want to find the smallest possible crash file.

This patch adds a --save_smaller option that overwrites the crash file
when it would shrink the file size.

See google#305
@douglasbagnall
Contributor

Not quite fixed in PR #379; see also PR #380.

@Byter09
Author

Byter09 commented Feb 10, 2021

Nice. Can't wait to see that in cargo-hfuzz :D

@robertswiecki
Collaborator

Spring cleaning, please re-open if needed
