-
Notifications
You must be signed in to change notification settings - Fork 281
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shut down if all initial inputs crash #276
base: master
Are you sure you want to change the base?
Shut down if all initial inputs crash #276
Conversation
If a Fuzz function crashes on every initial input provided, the fuzzer still executes and posts stats to stdout. However, since no inputs are provided to the workers, nothing meaningful happens. The behaviour is confusing because of two points: 1. No proper logging is done. The coordinator continues to execute and posts stats. 2. Crashing inputs in general are allowed. The problem arises only if all initial inputs crash. The problem in more detail: The coordinator maps the initial corpus into memory. If no corpus is provided the coordinator creates a default input (the empty input). Workers then access the corpus over the hub, not over the coordinator. Before this happens they are given the initial corpus files and are testing them. If the Fuzz function crashes on these inputs, they will never be passed to the hub. Therefore they are not available to the workers after this initial stage. The proposed solution is as follows: After the initial triage stage we wait a short amount of time for synchronization purposes. Then we check if the corpus is still empty. If it is we conclude that the target crashed on every input and shut down the fuzzer with the respective error message.
I tried out your patch, and things still stops processing, but quite randomly. However the more workers I run, the more likely it seems to be that there is a lockup. Weird. Restarting, does sometimes allow it to progress. No critique, but I am not this this fixes all cases of #262 3 executions:
In some of the cases where execution stops one worker is running at full speed, but nothing appears to happen. |
Thank you for the feedback. Can you share the fuzz function you used? I will try to reproduce this tomorrow and look further into it. |
Using verbose output it ends up with something like this:
Adding more workers (I have 32 threads) seems to increase the chance of a deadlock. And yes, the code does panic fairly often... The tested package is commit 56fc5c4ff6bb831811b4640aecdb998edebe545a (currently master) |
The break statement accidentally slipped into the wrong block.
Looking into the CI results, the errors do not stem from these changes, do they? |
// If the corpus is empty at this point, we have nothing to feed to our workers. | ||
if x == 1 { | ||
// Give the hub time to store the initial inputs | ||
time.Sleep(100 * time.Millisecond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am reading this correctly, we can fail spuriously if hub is delayed a bit (overloaded machine, virtualization, etc).
Also we don't generally ask user to add anything to corpus, fine to start with an empty one. Also if just part of inputs crash, we will continue working, this radical difference in behavior between part crash and all crash feels wrong. Also the corpus may contain, say, just 1 input and it happened to crash after code update. Then fuzzer won't start.
I agree the current behavior is bad. But I think a more useful way to fix this would be to let workers work even with empty corpus. They could generate random completely random inputs, it does not seem we can do anything better in absence of any seeds.
Then if just part of inputs crash (and we were unlucky with corpus, I think I saw cases where the initial empty input crash, maybe due to a bug in the fuzz function itself), the fuzzer will gracefully recover. Otherwise, the crash count will go up infinitely, which is, well, the expected behavior in such case.
It may be a bit tricky to preserve this part then:
if len(ro.corpus) == 0 {
// Some other worker triages corpus inputs.
time.Sleep(100 * time.Millisecond)
continue
}
because we want to wait for hub for initial inputs, but not if there are really 0 inputs. I think this bit has to do with corpus inflation problem mentioned above:
// Other workers are still triaging initial inputs.
// Wait until they finish, otherwise we can generate
// as if new interesting inputs that are not actually new
// and thus unnecessary inflate corpus on every run.
time.Sleep(100 * time.Millisecond)
But I hope there is some way to preserve it.
Yes, the CI failure looks unrelated. There seems to be some change in darwin setup on travis. |
This should be fixed now: be3528f |
Regarding #262
The problem:
If a Fuzz function crashes on every initial input provided, the fuzzer
still executes and posts stats to stdout. However, since no inputs
are provided to the workers, nothing meaningful happens.
The behavior is confusing because of two points:
posts stats.
all initial inputs crash.
The problem in more detail:
The coordinator maps the initial corpus into memory. If no corpus is
provided the coordinator creates a default input (the empty input).
Workers then access the corpus over the hub, not over the coordinator.
Before this happens they are given the initial corpus files and are
testing them. If the Fuzz function crashes on these inputs, they will
never be passed to the hub. Therefore they are not available to the
workers after this initial stage.
The proposed solution:
After the initial triage stage we wait a short amount of time for
synchronization purposes. Then we check if the corpus is still empty.
If it is we conclude that the target crashed on every input and
shut down the fuzzer with the respective error message.
Please comment if the solution lacks in anything.