Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up on fuzzing reports #3800

Closed
terriko opened this issue Feb 8, 2024 · 14 comments
Closed

Follow up on fuzzing reports #3800

terriko opened this issue Feb 8, 2024 · 14 comments
Labels
security public security-related issues.

Comments

@terriko
Copy link
Contributor

terriko commented Feb 8, 2024

@crazytrain328 has been working on a bunch of new fuzzers for cve-bin-tool's language parsers, and @mastersans set up a regular fuzzing job that runs them and sees what it finds.

And it's definitely finding some things:
https://github.com/intel/cve-bin-tool/actions/workflows/fuzzing.yml

It's a bit hard to tell because the fuzzing jobs always fail -- if they don't find something then the timeout eventually marks the job as a failure. But if you look, the ones that didn't find anything all quit after an hour, and the ones that did find something tend to have much shorter runtimes. Here's one:

https://github.com/intel/cve-bin-tool/actions/runs/7780736331/job/21213930431

What I'd like:

  1. Someone to read through and see which of these look "interesting" and try to reproduce them. (The fuzzing job has only run 15 times so there's only a few things likely to be interesting right now.)
  2. File issues for any findings that aren't network timeouts or other unrelated errors.
  3. File issues or make fixes to the automated fuzzing job itself if you can think of ways to make it easier to reproduce and/or file issues related to fuzzing findings.
  4. Fix any issues you can fix.
  5. Try running the fuzzers in your own local environment to see what else you can find.

I don't know that this is a good first issue but it could be a good follow-up issue for those of you who've already got a few PRs done and want something more complicated to work on that'll give you an excuse to look at different pieces of the code.

@terriko terriko added the security public security-related issues. label Feb 8, 2024
@inosmeet
Copy link
Contributor

inosmeet commented Feb 9, 2024

I'll try this one.

@joydeep049
Copy link
Contributor

joydeep049 commented Feb 9, 2024

Thank you @Dev-Voldemort for taking this up. It would help a lot in identifying the faults in the daily and weekly CI.
I would like to add that it would be great if someone could take this issue up as this is only parser request left to handle.
I am working on the Debian Parser issue and am almost done with it , just need @terriko to review it once.
Once these two parser requests are done, we could then setup fuzz testing for both of these as well.
That way the fuzz reports would be more complete.

@tahifahimi
Copy link
Contributor

Thank you @Dev-Voldemort for taking this up. It would help a lot in identifying the faults in the daily and weekly CI. I would like to add that it would be great if someone could take this issue up as this is only parser request left to handle. I am working on the Debian Parser issue and am almost done with it , just need @terriko to review it once. Once these two parser requests are done, we could then setup fuzz testing for both of these as well. That way the fuzz reports would be more complete.

I can help with this one, since no one is assigned.

@inosmeet
Copy link
Contributor

Thank you @Dev-Voldemort for taking this up. It would help a lot in identifying the faults in the daily and weekly CI. I would like to add that it would be great if someone could take this issue up as this is only parser request left to handle. I am working on the Debian Parser issue and am almost done with it , just need @terriko to review it once. Once these two parser requests are done, we could then setup fuzz testing for both of these as well. That way the fuzz reports would be more complete.

Glad to be of help :)
As for the remaining parser, @tahifahimi seems interested

@inosmeet
Copy link
Contributor

I'm new to fuzzers, so I don't know how an OK fuzzer should be working.

After reviewing some errors, and applying some nits, I'm getting this:

Screenshot from 2024-02-10 20-42-30

Does this mean it's working?
@terriko @crazytrain328

@joydeep049
Copy link
Contributor

I'm new to fuzzers, so I don't know how an OK fuzzer should be working.

After reviewing some errors, and applying some nits, I'm getting this:

Screenshot from 2024-02-10 20-42-30

Does this mean it's working? @terriko @crazytrain328

Yes, This is how fuzzing reports come in.
You are going in the right direction :)

@inosmeet
Copy link
Contributor

Making the responsible functions asynchronous seems to resolve the errors, but I'm not sure if that's the correct way to go.

@joydeep049
Copy link
Contributor

Making the responsible functions asynchronous seems to resolve the errors, but I'm not sure if that's the correct way to go.

Are you sure? Making all those functions asynchronous would require all of their caller functions to be asynchronous as well, or they have to be called in an asyncio loop (basically we'd have to change a lot of code :).. )

@mastersans
Copy link
Member

mastersans commented Feb 12, 2024

we generally do fuzz testing to check that weather our functionality is robust to different inputs(thats why we feed random data/unexpected inputs here in hopes to find something on which our functionality crashes) and probably using that information to improve it, so I am not sure how making the functions async would help can you elaborate further, also i thinking async have totally different use case.

@terriko
Copy link
Contributor Author

terriko commented Feb 13, 2024

Does this mean it's working? @terriko @crazytrain328

As others have said: yes, that looks like it's working!

Fuzzing means basically "throw garbage data at this and see if it crashes." Sometimes your data has an expected format and sufficient error checking that it'll reject anything that doesn't fit that format (e.g. reject all things that aren't json, reject all things that don't include {vendor, product}) so you don't want to waste cycles generating lots of data hoping eventually that you'll get valid json so you can find more "interesting" bugs. So in /fuzz we've got a bunch of specialized code that generates trash data with the right formats and shoves it in to the right functions.

When you (or our CI system) runs one of those fuzzers, it's going to keep making superficially correct data filled with trash until "something interesting" happens. If it gets a crash or a slowdown, it should dump some information about what input it used to make that happen so that you can try it again and figure out what went wrong. Most of the time it's finding places where we didn't do perfect input validation, so it'll be stuff like "this field had a % in it and we should only allow letters and numbers" that should be pretty easy to fix. Every once in a while it'll find something that might be a viable security exploit ("if you put html here it'll show up in the final report!") but a lot of it should be fairly simple and mundane.

When run in CI we're currently running for an hour and then timing out, so the jobs that run for around an hour will likely only show "we're making garbage!" lines like you screenshotted and then it'll spit out a line about a keyboard interrupt or something (the CI timeout seems to do the equivalent of hitting ^C ). The interesting jobs will be the ones where it says something about a crash.

Note that an hour is really short for a fuzzing campaign -- it'd be normal to run a fuzzer for days or weeks. But I didn't want to abuse the Github Actions resources so our CI job right now is meant mostly to make sure that the fuzzers work (even in the face of other code changes going on) and help us get started on any easy findings before we start looking at longer runs.

BTW, I think the tooling we're using is smart enough to "start where it left off" in garbage generation but we might need to make sure we're caching whatever it needs correctly so it can do that. I don't know offhand if that's set up yet so it might be a thing to look into. We can also run the fuzzing job more often if at least one person actively wants to work on it, or you can run it yourself on a fork to play around.

@inosmeet
Copy link
Contributor

also i thinking async have totally different use case.

well, that's the interesting part, async has a totally unrelated use case, yet just marking a function with async(I'm not even changing it's call) suppresses the errors and the fuzzer behaves as in working condition. At first glance, it seems some kind of blocking issue. I'll dig more when I find time.
Let me know what you think.

@inosmeet
Copy link
Contributor

Making the responsible functions asynchronous seems to resolve the errors, but I'm not sure if that's the correct way to go.

Are you sure? Making all those functions asynchronous would require all of their caller functions to be asynchronous as well, or they have to be called in an asyncio loop (basically we'd have to change a lot of code :).. )

Exactly, IMO making them asynchronous shouldn't resolve/suppress the errors, yet they are.

Issue at hand is related to some validation nits, I'll make PRs soon, but I'm confused about this async behavior.

@terriko
Copy link
Contributor Author

terriko commented Feb 14, 2024

also i thinking async have totally different use case.

well, that's the interesting part, async has a totally unrelated use case, yet just marking a function with async(I'm not even changing it's call) suppresses the errors and the fuzzer behaves as in working condition. At first glance, it seems some kind of blocking issue. I'll dig more when I find time. Let me know what you think.

As a completely uninformed guess: the fuzzer does look for slowdowns and treats them as errors, so the async may be a valid fix here in that it unblocks a potential choke point, or it may be disguising places where we should be improving regexes or data validation to avoid slowdowns in the first place. Can't know without looking at exactly what's going on!

@terriko
Copy link
Contributor Author

terriko commented Apr 17, 2024

The initial fuzzing reports here have been examined and issues fixed. I'm going to close this, but we likely will have new issues to examine especially since we've added so many new fuzzing options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
security public security-related issues.
Projects
None yet
Development

No branches or pull requests

5 participants