[dev.fuzz] testing: use generics with testing.F #46890

katiehockman · 2021-06-23T14:52:43Z

This API idea and explanation comes from @rogpeppe. He suggests a change to the testing.F type which uses generics, rather than interface{} types, for the Fuzz and Add functions. I haven't yet formed an opinion on whether or not we should do this, but I think this issue can help start that conversation.

/cc @golang/fuzzing

In Roger's own words:

package testing

// F is a type passed to fuzz targets for fuzz testing.
// V is the type of the argument to be fuzzed.
type F[V any] struct {
	...
}

// Fuzz runs the fuzz function, ff, for fuzz testing.
func (f *F[V]) Fuzz(ff func(t *T, x V))

// Add will add the argument to the seed corpus for the fuzz target. This will
// be a no-op if called after or within the Fuzz function.
func (f *F[V]) Add(x V)

// example fuzz function.
func FuzzFoo(f *testing.F[string]) {
	f.Add("some-piece-of-corpus")
	f.Fuzz(func(t *testing.T, x string)) { ... })
}

Essentially when you define a fuzz function, you also declare what type of arguments you expect to receive too. This is all compatible with the current generics proposal.

In my view, this does give some nice advantages:

the API is very clear (no need for interface{} function arguments)
there's a good performance improvement to be had from the generic version because you can avoid the need to use the rather slow reflect.Value.Call

Here's a little micro-benchmark that demonstrates some performance difference (the overhead of the call goes from 150ns to 6ns): https://go2goplay.golang.org/p/Nz0CRLC2DA0

Here's a little piece of code to demonstrate that the technique could actually work for real: https://go2goplay.golang.org/p/ygvwbHw_S11
[Update: using testing/quick for a tiny bit more realism: https://go2goplay.golang.org/p/FOlQQIH5YXI ]

The main thing I'm arguing for here is that I think it's a good idea to restrict fuzz functions to have a single argument. That means they're amenable to the type-safe approach outlined above, with the arguably minor restriction that you'd have to define a struct in order to pass multiple arguments.

The text was updated successfully, but these errors were encountered:

jayconrod · 2021-06-23T22:29:02Z

+1, I like this idea. We talked about this while drafting the proposal, but at that time, we weren't comfortable blocking development on generics (it seemed like we might land fuzzing in 1.17).

The performance difference is a bit surprising, but I don't think it's decisive yet. It depends a lot of how generics are implemented. It sounds like there will be separate, specialized code generated for each type (with some deduplication), so we might be able to avoid dynamic dispatch altogether. Hard to say though.

Mainly I like how the API is more obvious. Methods that accept interface{} are always a bit confusing.

Agree that the key tradeoff is that we lose multiple arguments though.

jayconrod · 2021-06-23T22:42:18Z

#46218 is somewhat related: we can use vet to type-check fuzz targets even without generics. We may want to do that anyway to detect misusage of testing.F, e.g., calling F.Fatal instead of T.Fatal in the fuzz function.

rogpeppe · 2021-06-24T08:25:49Z

The performance difference is a bit surprising, but I don't think it's decisive yet. It depends a lot of how generics are implemented. It sounds like there will be separate, specialized code generated for each type (with some deduplication), so we might be able to avoid dynamic dispatch altogether. Hard to say though.

FWIW I ran the benchmark on the Go tip (using -gcflags=-G=3), not using go2go, so the results might not be too inaccurate. However, regardless of that, ISTM that the generic code will always be much faster than using reflect.Value.Call just because it has so much less work to do (no intermediate slices, no need to type check other than the cheap .(V) type conversion). It's perhaps worth mentioning that this particular call will always be dynamically dispatched because we're calling a closure.

#46218 is somewhat related: we can use vet to type-check fuzz targets even without generics.

Not in all cases - I could imagine people using helper packages for fuzzing that end up invoking
Add with an interface{}-valued argument which might be hard to analyse.

However, although the vet check would work in almost all cases, it's still a poor cousin of the
compiler type checking, and the generic type signatures do make things much clearer.
For example, the current fuzz docs on the dev.fuzz branch don't actually describe the required
signature of the ff argument, but that's not necessary if it's there in the type signature.

We may want to do that anyway to detect misusage of testing.F, e.g., calling F.Fatal instead of T.Fatal in the fuzz function.

Yup, some vet checking would definitely still be useful.

ianlancetaylor · 2021-06-24T15:14:29Z

Just a note that performance measurements of the current implementation on the dev.typeparams branch are completely meaningless. Please don't try to draw any conclusions from how the branch behaves today.

katiehockman · 2021-06-29T16:06:41Z

I spent some time playing around with this, and doing some prototyping. I ran into a few roadblocks. I'm still pretty new in my understanding/use of generics, so these may very well be solvable problems, and if so, LMK.

You can't alias a generic type. This is going to be problematic for us, as we use aliasing to avoid importing the internal/fuzz package into the testing package. We might be able to work around this, but the readability and maintainability may suffer as a result. LMK if anyone has workarounds/ideas for this.
If we make testing.F a generic type, as is described in this proposal, then the types which use testing.F must also be generic. We use a cross-package InternalFuzzTarget type which has the fuzz target as a field. Making InternalFuzzTarget a generic type will be painful if not impossible with the way it is used and generated. Examples: [1] [2] [3]
- One potential workaround would be to make InternalFuzzTarget.Fn an interface{} then build the function using reflect inside testing/fuzz.go, but that will hurt readability, and I'm still not 100% sure that will work.

I haven't investigated this fully, so there may be other roadblocks as well.

Assuming we can get around these issues and can support generics, like Ian said, we can't assume anything about the performance yet. That said, it seems to me that the decision then comes down to which we value more: compile-time checking or easier support for fuzzing multiple types.

That's a hard call to make right now. Although compile-time checking would be nice, Jay is right in that vet checks would probably cover the majority of the use cases for this. We can always add more documentation and examples to make this clearer, too.
I'm not sure how often people will want to fuzz multiple types, but requiring them to use a struct in order to do this is kind of unfortunate. For example, I can see someone wanting to do something like this:

func FuzzCmp(f *testing.F) {
  f.Fuzz(func(t *testing.T, a, b string) {
    strings.Compare(a, b)
  }
}

with generics, it would have to be something like this:

~~func FuzzCmp(f *testing.F) {~~
~~type fuzzType struct {~~
~~a, b string~~
}
~~f.Fuzz(func(t *testing.T, x fuzzType) {~~
~~strings.Compare(x.a, x.b)~~
}
}

(Edit: updated code below, see following comments)

type fuzzCmpType struct {
  a, b string
}
func FuzzCmp(f *testing.F[fuzzCmpType]) {
  f.Fuzz(func(t *testing.T, x fuzzType) {
    strings.Compare(x.a, x.b)
  }
}

which really hurts readability and usability in my opinion.

ianlancetaylor · 2021-06-29T16:36:01Z

There is some discussion of aliasing of generic types over on #46477. A description of the use case here would be helpful there. Thanks.

It does seem clear that this proposal requires that f.Fuxx take only a single argument. That said, I think your example can be written as

func FuzzCmp(f *testing.F) {
    f.Fuzz(func(t *testing.T, x struct { a, b string }) {
        strings.Compare(x.a, x.b)
    }
}

which is still far from ideal but perhaps not quite as awful. One could also consider f.Fuzz2, f.Fuzz3, etc., for more arguments. (The ideal here would be generics with variadic type parameters, but that will certainly not be in the first release and I don't yet see a way to ever implement it.)

katiehockman · 2021-06-29T16:49:43Z

Oh, actually, I forgot that with generics the testing.F type also has to be generic. So I think that fuzz target would have to be more like this:

type fuzzCmpType struct {
  a, b string
}
func FuzzCmp(f *testing.F[fuzzCmpType]) {
  f.Add(fuzzCmpType{"abc", "acb"))
  f.Fuzz(func(t *testing.T, x fuzzType) {
    strings.Compare(x.a, x.b)
  }
}

vs.

func FuzzCmp(f *testing.F) {
  f.Add("abc", "acb")
  f.Fuzz(func(t *testing.T, a, b string) {
    strings.Compare(a, b)
  }
}

Which is much worse. LMK if I'm missing something here though.

ianlancetaylor · 2021-06-29T17:01:47Z

Good point, I think you are right. I think you'll have to pick either generic fuzzing or single fuzzing argument.

katiehockman · 2021-06-29T17:30:35Z

It's becoming apparent to me that the cost of supporting generics in this design would be pretty high, and I'm not yet convinced that fuzz testing is a good application of generics. There are a lot of situations where developers will benefit from fuzzing more than one argument at a time, and the overhead/complexity of needing to use a struct as shown in #46890 (comment) outweigh the benefits.

Here are some real-world examples of fuzz targets which use multiple types today (/cc @dsnet):

https://github.com/go-json-experiment/json/blob/master/fuzz_test.go#L41
https://github.com/go-json-experiment/json/blob/master/fuzz_test.go#L252
https://github.com/dvyukov/go-fuzz-corpus/blob/master/strings/main.go#L11 has a go-fuzz target which would benefit from having multiple types (as it is now, the []byte must be converted to a string then split manually)

dsnet · 2021-06-30T17:57:00Z

I don't have a strong opinion whether the generic approach is better or not.

The readability of the generic version is worse, but not so much worse that I find #46890 (comment) unbearable. If we had type inferencing for composite literals (see #12854), we could avoid the named type when calling f.Add.

On the other hand, the potential for performance gain is a non-trivial strength of the generic approach. The non-generic approach must use reflect.Call, which is fairly slow. One reason for the slowness is that the reflect.Call API forces many allocations. Even if that could be fixed (see #43732), it would only improve it by 1.5x. However, you would need something like a 20x improvement to match the performance of calling a function directly. The ability to call the fuzz function faster, means that we can fuzz more inputs per unit of time, which seems like a pretty strong strength of a generic approach.

Also, the type safety of the generic approach avoids issues like #45593.

katiehockman · 2021-07-14T21:18:14Z

I am going to close this issue as it seems like the discussions has fizzled to an end, and the general consensus is not to support generics.

rogpeppe · 2021-07-15T14:35:36Z

Sorry for the late comment here - I was on holiday when the above discussion happened.

I agree with @dsnet that the performance issue could be significant.

Also, thinking forward to generics in general, I suspect it is going to be quite common for a generic API to support calling an arbitrary function with a single argument, but it's not possible with the current proposal to support generalising such an API across functions with multiple arguments.

That is, a usability like the one found here seems likely to be a common issue across many generic APIs.

I hope that such a usability issue doesn't push all those APIs towards using interface{} and dynamic typing as is happening here, because I think the usability, safety and performance benefits of using a generic API are worth having. Why should we need to add special vet rules when the compiler will be able do a better job itself?

If we agree that this is a more general problem than just in this one API, then perhaps a more general solution might work.

Without changing the language, we could implement a tuple package:

package tuple

type T0 struct{}
type T2[A0, A1 any] struct {
    T0 A0
    T1 A1
}

// T returns all the tuple's values.
func (t T2[A0, A1]) T() (A0, A1) {
	return t.T0, t.T1
}

type Pair = T2

type T3[A0, A1, A2 any] struct {
    T0 A0
    T1 A1
    T2 A2
}

... etc

Then you could write your fuzzer like this, without the need to declare another type:

func FuzzCoder(f *testing.F[tuple.T2[int64, []byte]) {
	// Add a number of inputs to the corpus including valid and invalid data.
	for _, td := range coderTestdata {
		f.Add(tuple.T2{0, []byte(td.in)})
	}

	f.Fuzz(func(t *testing.T, arg tuple.T2[int64, []byte]) {
		seed, b := arg.T()
		rn := rand.NewSource(seed)
		etc
	})
}

Looking forward a bit further, I don't believe there's any reason we couldn't support tuples directly in the language.

It's still a little awkward compared to using the reflect.Call approach, but that's often true of DSL-type APIs - I believe it's worth paying a small usability price for the additional clarity and performance brought from the use of static types.

jayconrod · 2021-07-15T21:55:00Z

@rogpeppe Thanks for adding this. I don't think we'd really thought about adding TupleN types. It seems a little better than having to define a struct type for arguments, but the readability still doesn't seem good to me.

@katiehockman, @rolandshoemaker and I talked about this over a few calls. I think we reached a consensus internally, so I'll try to sum that up here.

Pros:

Parameterizing testing.F, F.Add, and F.Fuzz would improve type safety. Getting the compiler to tell us about errors is preferable to a vet analysis (not implemented yet) or run-time check (already implemented), though vet results would always be surfaced through go test at least.
From type signatures in documentation, testing.F may be more intuitive to use if it's clear that the user needs to specify a type and stick with it.
Performance of calling the fuzz function with generics might be better than reflect.Call.

Cons:

We're limited to a single fuzzable parameter. Currently, the ability to easily fuzz functions with mulitple parameters is very appealing. It's hard to estimate the value of this since most other stuff out there starts with []byte.
Users can work around this limitation by defining a struct type or using a generic TupleN type as above. The examples seem a lot less readable than what we have though.

Concerns:

If the generics proposal allowed variadic type arguments, I think we'd be much more tempted to use that. Maybe the langauge will support that some day, but not in 1.18. We don't want to hold our breaths.
We can't quantify the performance advantage of a generic call over reflect.Call. We could probably estimate an upper bound improvement by implementing a specialized version that only accepted []byte. Even so, the overhead of reflect.Call may be small compared with the mutator or the fuzz function itself.
As a pratical concern, we're running short on time to make big changes, and this would a lot of rewriting. If we did this, even if we spent much more time investigating and quantifying this, we'd need to cross some things off all: umbrella issue for release-blocking fuzzing work #47037 to ship in 1.18. @katiehockman already spent a few days prototyping this and timeboxed it.

rogpeppe · 2021-07-16T07:39:59Z

Another pro: you need less type conversions, as the compiler knows what the type is. For example, in the FuzzCoder function above, there was no need to use int64(0) because the usual constant rules do the conversion for us.

Performance of calling the fuzz function with generics might be better than reflect.Call.

It seems clear to me that the performance would definitely be better than reflect.Call, although you're of course right that the improvement might be dominated by the overhead of the mutator (we can't say anything about the fuzz function itself).
My reasoning for that is that we're seriously proposing a slices package to do operations on slices, including some operations on function values. If calling a generic function value was anywhere near the performance of reflect.Call, it wouldn't be particularly viable IMHO. Also, reflect.Call needs to do more work than any reasonable generic call: it always needs to check the type compatibility of the provided arguments, whereas a generic call mechanism does not because the compiler has done that already.

If the generics proposal allowed variadic type arguments, I think we'd be much more tempted to use that. Maybe the langauge will support that some day, but not in 1.18. We don't want to hold our breaths.

I don't see variadic type arguments on the horizon any time soon (I don't think anyone has any idea how they might work), but tuples themselves might fit in quite easily, leading to something like this, which is comparable in readability to the current proposal, IMHO.

func FuzzCoder(f *testing.F[(int64, []byte)]) {
	// Add a number of inputs to the corpus including valid and invalid data.
	for _, td := range coderTestdata {
		f.Add((0, []byte(td.in)))
	}

	f.Fuzz(func(t *testing.T, arg (int64, []byte)) {
		seed, b := arg.T()
		rn := rand.NewSource(seed)
		etc
	})
}

As a practical concern, we're running short on time to make big changes, and this would a lot of rewriting.

FWIW I took a brief look at this, and at first glance I don't think it would require a lot of rewriting. The main things that need to change AFAICS are the code in cmd/go that checks whether a function looks like a fuzz function, and the code inside F.Fuzz that actually does the invocation. That should be enough initially, although you'd probably also want to simplify all the code that currently uses an array of types/values to use a single type/value too.

A sketch of the changes to the testing package (diff against 988d024): https://gist.github.com/rogpeppe/ff213c3553143a1600005b0f71ad3628

The main problem with this for landing in 1.18 is that the Go compiler on tip doesn't currently support exported generic functions, and might not until later in the cycle, so that may well be a show stopper even if this proposal is deemed to be a good idea.

rsc · 2021-08-23T16:30:46Z

Using generics doesn't seem like a good user experience.
The non-generic version requires listing the argument types once,
while the generic version requires listing the argument types twice.

If there is a performance problem being solved by generics,
perhaps we can solve it with generated code instead?
The go command is already reading the test file and generating code.
If more needs to be generated, no big deal.
And then the user experience is not degraded just for performance.

katiehockman added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. FeatureRequest Issues asking for a new feature that does not need a proposal. fuzz Issues related to native fuzzing support labels Jun 23, 2021

katiehockman added this to the Backlog milestone Jun 23, 2021

katiehockman mentioned this issue Jul 2, 2021

all: umbrella issue for release-blocking fuzzing work #47037

Closed

22 tasks

katiehockman closed this as completed Jul 14, 2021

golang locked and limited conversation to collaborators Aug 23, 2022

gopherbot added the FrozenDueToAge label Aug 23, 2022

julieqiu added this to Go Security Sep 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[dev.fuzz] testing: use generics with testing.F #46890

[dev.fuzz] testing: use generics with testing.F #46890

katiehockman commented Jun 23, 2021 •

edited by rogpeppe

Loading

jayconrod commented Jun 23, 2021

jayconrod commented Jun 23, 2021

rogpeppe commented Jun 24, 2021

ianlancetaylor commented Jun 24, 2021

katiehockman commented Jun 29, 2021 •

edited

Loading

ianlancetaylor commented Jun 29, 2021

katiehockman commented Jun 29, 2021

ianlancetaylor commented Jun 29, 2021

katiehockman commented Jun 29, 2021 •

edited

Loading

dsnet commented Jun 30, 2021 •

edited

Loading

katiehockman commented Jul 14, 2021

rogpeppe commented Jul 15, 2021

jayconrod commented Jul 15, 2021

rogpeppe commented Jul 16, 2021

rsc commented Aug 23, 2021

[dev.fuzz] testing: use generics with testing.F #46890

[dev.fuzz] testing: use generics with testing.F #46890

Comments

katiehockman commented Jun 23, 2021 • edited by rogpeppe Loading

jayconrod commented Jun 23, 2021

jayconrod commented Jun 23, 2021

rogpeppe commented Jun 24, 2021

ianlancetaylor commented Jun 24, 2021

katiehockman commented Jun 29, 2021 • edited Loading

ianlancetaylor commented Jun 29, 2021

katiehockman commented Jun 29, 2021

ianlancetaylor commented Jun 29, 2021

katiehockman commented Jun 29, 2021 • edited Loading

dsnet commented Jun 30, 2021 • edited Loading

katiehockman commented Jul 14, 2021

rogpeppe commented Jul 15, 2021

jayconrod commented Jul 15, 2021

rogpeppe commented Jul 16, 2021

rsc commented Aug 23, 2021

katiehockman commented Jun 23, 2021 •

edited by rogpeppe

Loading

katiehockman commented Jun 29, 2021 •

edited

Loading

katiehockman commented Jun 29, 2021 •

edited

Loading

dsnet commented Jun 30, 2021 •

edited

Loading