-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: cmd/go: make fuzzing a first class citizen, like tests or benchmarks #19109
Comments
I think it would be easier to evaluate the idea if it were slightly less abstract. For example:
|
@ianlancetaylor, yes, I think the first step before it's designed completely is to determine whether there's interest. |
As a general concept, I'm in favor. |
I would expect that there would be an additional required flag (when fuzzing) where you specify the corpus directory. |
Can we just cache the corpus somewhere under |
I think it's wrong to think of the corpus as strictly a cache. The corpus is the save state of the fuzzer and the documentation for go-fuzz even recommends committing them into a version control system. The A specified corpus is not so much for the user modify the corpus themselves, but for them to specify how to persist the corpus data. |
Could there be some default convention say a _fuzz/xxx directory (where xxx corresponds with FuzzXxx) and a method on the *testing.F object to load a different corpus from the _fuzz/ directory if necessary? It seems like it should just know where the corpus is. |
I'm in favor. Efficient fuzzing usually requires some help from compiler so
it's better to built this into std.
(compiler instrumentation will be more efficient than go-fuzz's source code
instrumentation. I also want to have cmd/cover built on compiler
instrumentation to support branch coverage, but that's off-topic to this
issue.)
How about add some methods to testing.T (or perhaps invent a new
testing.Fuzz to replace testing.T, but see below) in fuzz tests?
One of the method could be setting the corpus location (we can recommend it
to be saved under testdata).
but we probably should also a command line flag to override the test's
setting (one compromise is to make both optional:
Introduce -test.fuzzdir to hold the corpus path for all fuzz tests. If not
provided, default to testdata/fuzz
(*testing.T).FuzzDir("parser") // optional call to set corpus directory
prefix location for this test, if relative, then relative to the
-test.fuzzdir value.
To make the feature more useful, I suggest we still use testing.T so that
it's quite easy to migrate fuzz found test cases into a (table driven)
regular test.
Making fuzz tests taking a testing.T will facilitate this.
|
I use it regularly on a lexer/parser/formatter for Bash (https://github.com/mvdan/sh). Having it be a first-class citizen would simplify things for me and for contributors. |
Found a bug in the C decoder for google/brotli by fuzzing a Go implementation of a Brotli decoder. Also found some divergences in Go bzip2 decoders from the canonical C decoder (this and #18516). All by fuzzing. |
My coworker at DigitalOcean was working on a side project to make fuzzing easier. Check his repo out here: https://github.com/tam7t/cautious-pancake Adding it here as I think it would be a valuable piece of information for this discussion. |
The README for go-fuzz lists a number of "Trophies", ( https://github.com/dvyukov/go-fuzz#trophies ) the majority of which are from the standard library, but about 20% of which are external to the Go standard libraries. A GitHub search for Go source files with the My tutorial on fuzzing ( https://medium.com/@dgryski/go-fuzz-github-com-arolek-ase-3c74d5a3150c ) gets 50-60 "reads" per month (according to medium's stats). |
Feature that would be also important (at least for me) would be ease of turning some selected Fuzz test cases into permanent tests. Simplest way to do it would be exporting the case data in go byte array and calling |
Yes, we've found fuzzing useful in our projects multiple times. Especially sensitive code, the fuzzer will frequently find edge cases that we missed. Encoding, networking, and generally things that depend on user input. I will say that most of the benefit is usually seen in the first tiny bit of fuzzing. There's a pretty strong diminishing returns as you continue to fuzz, at least that's what we've found. |
As you can understand, I am very supportive for this. Traditional testing is just not enough for modern development speeds. I am ready to dedicate some time to work on parts of this. Throwing some ideas onto the table:
func FuzzFoo(f *testing.F) {
var data []byte
f.GetRandomData(&data)
// use data
}
func TestFoo(t *testing.T) {
var data []byte
testing.GetRandomData(&data)
// use data
} This recalls |
That's true to some degree, but not completely. It depends on (1) complexity of your code, (2) rate of change of your code, (3) smartness of the fuzzer engine. If your code is simple and doesn't change, then fuzzer will find everything it can in minutes. However, if your code change often, you want to run fuzzing continuously as regression testing. If your code is large and complex and fuzzer is smart enough, then it can manage to uncover complex bugs only after significant time. |
To confirm @dvyukov in #19109 (comment) , it would be really nice to have supported types other than []byte. We found bugs in both the gonum/blas implementation and the OpenBLAS library using fuzzing. It's possible to use go-fuzz, but it's kind of a pain to parse the []byte directly, (https://github.com/btracey/blasfuzz/blob/master/onevec/idamax.go). |
Suggest it goes under the subfolder testdata. Then any tools that ignore tests will also ignore this dir. |
I have concerns about how much time this is going to add to testing. My experience with fuzzing is that compiling with the fuzz instrumentation takes a significant amount of time. I'm not sure this is something we want to inflict upon every use of |
@dsnet to execute corpus and check if it doesn't fail instrumentation isn't needed. Instrumentation is needed when you want to expend/improve the corpus. |
Should there be a story to make it easy to use external fuzzing engines? |
@Kubuxu, I'm comfortable with running the Fuzz functions as a form of test without special instrumentation, but Dmitry comment suggested running with N random inputs, which implies having the instrumentation involved. |
My 2c (I am utterly ignorant about Go, but have some ideas about fuzzing) There are several major parts in coverage-guided fuzzing as I can see it:
Instrumentation is better to be done in the compiler, this way it's the most efficient and easy to use. The interface must be as simple as possible. For C/C++ our interface (which we use with libFuzzer, AFL, hoggfuzz, and a few others) is:
and the only thing I regret is that the return type is not void.
(again, not confident about Fuzzing engines and the interface should be independent. And, it would be nice to have the new fuzzing engine(s) to behave similar to AFL, libFuzzer, and go-fuzz
Absolutely, see above.
Maybe.
Corpus is not a constant. It evolves as long as the code under test changes, fuzzing techniques evolve, and simply more CPU hours are spent fuzzing. Note: a corpus stored in RCS allows to perform regression testing (w/o fuzzing)
Not too much off-topic.
This is a price worth paying since the corpus often turns out to be a great regression test.
If you don't enable fuzzing instrumentation (which won't be on by default, I think) you won't pay for it. |
A separate topic worth thinking about is fuzzing for equivalence between two implementations of the same protocol. Imagine your code has Then you can fuzz the following target to verify that the two implementations match:
This works pretty well when both things are implemented in Go. |
I love this. And I think a good solution to the corpus location, like
would
Projects that don't commit the corpus could use Actual fuzzing could be controlled by |
Eh. I do get the problem, but still prefer GitHub 🤷 |
BTW, gofuzz now contains an adapter, so you can turn the []byte from go-fuzz into any structured data: https://godoc.org/github.com/google/gofuzz#NewFromGoFuzz |
@lavalamp Using go-fuzz's |
@tv42 Take a look at the implementation-- 8 bytes are reserved for a seed, but the rest of the []byte is interpreted directly as if it were the output of the PRNG. It only actually uses the PRNG if it runs out of bytes :) It's not a perfect system (e.g., not all bits are equally important to the output) but it's good enough that a good fuzzing algo should be able to make it work. |
The PRNG is definitely going to waste cycles generating fuzzing outputs that are not well-targeted. From the go-fuzz perspective, it'd be better to have a way to signal that there aren't enough bytes and return. Right now the best you can do is return -1, but we could define another designated return value to mean "too short", and go-fuzz could use that as a cue to generate longer data. cc @katiehockman for the successor to go-fuzz |
@lavalamp Okay ByteSource is trickier than it seemed. That still sounds like it'll exhaust the input most of the time, and use the PRNG. That'd be better translated into "uninteresting input", or a more explicit signal. Of course, flipping the API around and producing data on-demand, as per the proposal, is a lot better. |
Hi @lavalamp, using a set of bytes from the input as a seed makes it harder for the fuzzer to incrementally build out additional pieces of input after discovering new coverage, and does not play well with go-fuzz's sonar feature for automatically pulling out values from comparisons. Within the current constraints of the dvyukov/go-fuzz and google/gofuzz, one thing you could do is return 0 for numbers once you run out of input bytes and at that point also only generate zero-length slices and strings, which makes the discovered coverage correlate better with the input []byte and generally gives go-fuzz more control. That is the approach that fzgo (a prototype of this proposal) takes for structured fuzzing & fuzzing rich signatures, and I found that made a material difference in fuzzing efficiency. Also, fzgo tries to play well with go-fuzz's sonar. Two sonar-related comments from fzgo's rich signature generation are here and here. |
Thanks for the details! |
Note that #44551 has the latest proposal for adding fuzz testing to Go, so please direct feedback there. |
With #44551 closed (and everything in 1.18), this issue can be closed too, correct? |
This proposal was placed on hold in 2017 with a comment "Putting this on hold until go-fuzz is more like the proposed 'go fuzz'." I've read https://go.googlesource.com/proposal#hold, which says:
I believe what this has been waiting on has been resolved via proposal #44551, which is implemented in the upcoming Go 1.18 release. Moving this back to Active, and I expect it can likely proceed to Declined as Duplicate proposal state next. |
This proposal is a duplicate of a previously discussed proposal, as noted above, |
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
This change adds a sample Fuzz test function to package tiff, under the gofuzz build tag. The function is based on the tiff/tiff.go code, from github.com/dvyukov/go-fuzz-corpus. Fixes golang/go#30719 Updates golang/go#19109 Change-Id: I78771e9a1bd01651ba6ca421ba41f0c0e95d0c53 Reviewed-on: https://go-review.googlesource.com/c/image/+/167097 Run-TryBot: Dmitry Vyukov <[email protected]> TryBot-Result: Gobot Gobot <[email protected]> Reviewed-by: thepudds <[email protected]> Reviewed-by: Josh Bleecher Snyder <[email protected]> Reviewed-by: Dmitry Vyukov <[email protected]>
Filing a proposal on behalf of @kcc and @dvyukov:
They request that cmd/go support fuzzing natively, just like it does tests and benchmarks and race detection today.
https://github.com/dvyukov/go-fuzz exists but it's not as easy as writing tests and benchmarks and running "go test -race" today.
Should we make this easier?
Motivation
Proposal
The text was updated successfully, but these errors were encountered: