Use results from previous tests to inform shrinks #401

PeterRugg · 2024-05-13T16:45:55Z

This is a somewhat esoteric feature: it might just be us that want it, but thought I'd ask because it seems like it might be simple to add.

We're using quickcheck to compare hardware implementations (https://github.com/CTSRD-CHERI/QuickCheckVEngine/), and would like to use the outputs from runs to decide how to shrink.

To be more specific: we generate a test, which is a sequence of instructions, we call out to the hardware implementations that annotate the sequences of instructions with the results from running the instructions, then we check that the annotations match.

It would be cool if we could input the annotated traces into the shrink function, rather than the unannotated ones. I believe this could be achieved by providing a slightly different API that splits up the annotation of a test from evaluating whether it was a pass or fail, i.e. a version of forAllShrink with the type changed from:

forAllShrink :: (Show a, Testable prop) => Gen a -> (a -> [a]) -> (a -> prop) -> Property

to

forAllShrink :: (Show a, Testable prop) => Gen a -> (a -> b) -> (b -> [a]) -> (b -> prop) -> Property

I'd be interested to hear if you have any thoughts on how tricky this would be (possibly complicated by b being an IO) and whether we've missed a way to do it using the existing API.

Thanks for maintaining a great library!

The text was updated successfully, but these errors were encountered:

Rewbert · 2024-05-13T17:16:17Z

Hi Peter!

I believe you can hack this together using existing features, even if it might seem a bit contrived :) It seems that you would want to decouple generation and shrinking. This is something that I've done myself a few times, and I know that my advisor has also run into this several times. I am not sure what a good combinator for this would look like, even if I know that there is room for one.

The trick is to install a monitor! Here is my pseudocode trying to illustrate what I envision.

instance Arbitrary AnnotatedTrace where
  gen = error "we don't need this, I think?"

  shrink trace = -- this is a good shrinker, working on annotated traces!

prop_good_shrink :: AnnotatedTrace -> Property
prop_good_shrink trace =  -- check the property on this trace. Presumably this logic is somewhere in your code
                          -- now your good shrinker can take over

-- when you run this property, remove the shrinker for the unannotated trace
-- e.g. call forAll instead of forAllShrink
prop = forAll trace $ \unannotated_trace -> do
  ...
  ...
  annotated <- some_call unannotated_trace -- your hardware is annotating the trace here
  ...
  monitor $ whenFail $ do
    -- in here you can write code that is run once after shrinking,
    -- but since we called forAll, we will call this immediately when the bug is found.
    --
    -- Here, call QC recursively! Invoke another property that operates on annotated traces,
    -- and just fails immediately and shrinks it
    quickCheck $ withMaxSuccess 1 $ prop_good_shrink annotated
  --
  assert $ do_your_test_here -- monitor has to be installed before the assertion

Rewbert · 2024-05-13T17:16:34Z

I can expand on this answer tomorrow if needed :)

MaximilianAlgehed · 2024-05-14T07:35:10Z

Another way of doing this user-side that doesn't add any maintainance burden to QuickCheck is to introduce a type data WithAnnotation a b = WithAnnotation { val :: a, ann :: b }. Here you could do something like this (I haven't type checked, but I think the idea is roughly right):

data WithAnnotation a b = WithAnnotation { val :: a, ann : b }

mkAnn :: (a -> b) -> a -> WithAnnotation a b
mkAnn ann a = WithAnnotation a (ann a)

forAllShrink' :: ... => Gen a -> (a -> b) -> (b -> [a]) -> (b -> prop) -> Property
forAllShrink' gen annotate shr prop =
  forAllShrink (mkAnn annotate <$> gen)
                      (map (mkAnn annotate) . shr . ann)
                      (prop . ann)

But the thing I'm more curious about now is what the point of ever materializing the [a] is? You could just do this all in the annotated b world.

PeterRugg · 2024-05-14T10:05:58Z

Thank you both for the quick responses!

I think what @Rewbert suggested could be the most pragmatic way forwards! The sad thing is that shrinking will likely change the behaviour of the test, and so make the annotations outdated once one or two shrinks have happened, but it's possible this isn't too important in practice.

I think @MaximilianAlgehed's suggestion would work, except that we need to do IO to get the annotations (calling out over TCP sockets to simulated hardware).

But the thing I'm more curious about now is what the point of ever materializing the [a] is? You could just do this all in the annotated b world.

This is because the annotations for a given instruction change based on what occurred before them in the test, e.g. [set register 1 to 7, add 3 to register 1, add 4 to register 1] would get annotated as [set register 1 to 7 (7), add 3 to register 1 (10), add 4 to register 1 (14)] but then would shrink (e.g. deleting the middle instruction) to [set register 1 to 7 (7), add 4 to register 1 (11)]: we need to rerun the implementation (which QuickCheck has done for us anyway!) to refresh the annotation.

MaximilianAlgehed · 2024-05-14T11:40:54Z

@PeterRugg What's the type for your modified forAllShrink that you actually want - given the need for IO?

nick8325 · 2024-05-14T14:21:01Z

It seems like what's missing right now is the ability for shrinking functions to do I/O. That is, something like:

forAllShrinkIO :: (Show a, Testable prop) => Gen a -> (a -> IO [a]) -> (a -> prop) -> Property
forAllShrinkIO gen shr prop = forAllShrink gen (unsafePerformIO . shr) prop

(Maybe even with Gen (IO a)?)

Of course that function shouldn't be using unsafePerformIO, but I implemented it that way just to experiment. And it seems like it supports this use case!

Here is a forAllShrink' adapted to do I/O:

instance Show a => Show (WithAnnotation a b) where show = show . val

forAllShrink' :: forall a b prop. (Show a, Testable prop) => Gen a -> (a -> IO b) -> (a -> b -> [a]) -> (a -> b -> prop) -> Property
forAllShrink' gen annotate shr prop =
  forAllShrinkIO gen' shr' prop'
  where
    gen' :: Gen (WithAnnotation a (IO b))
    gen' = mkAnn annotate <$> gen

    shr' :: WithAnnotation a (IO b) -> IO [WithAnnotation a (IO b)]
    shr' (WithAnnotation val annIO) = do
      ann <- annIO
      return $ map (mkAnn annotate) $ shr val ann

    prop' :: WithAnnotation a (IO b) -> Property
    prop' (WithAnnotation val annIO) =
      idempotentIOProperty $ do
        ann <- annIO
        return $ prop val ann

and here is how one can use it:

type TestCase = ...
type Annotation = ...
annotate :: TestCase -> IO Annotation
shrinkWithAnnotation :: TestCase -> Annotation -> [TestCase]

prop_ioshrink :: Property
prop_ioshrink =
  forAllShrink' arbitrary annotate shrinkWithAnnotation $ \testCase annotation ->
    ...

nick8325 · 2024-05-14T14:23:34Z

Oh, this is not quite right... The I/O will be executed twice, once in prop' and once in shr'...

PeterRugg · 2024-05-14T14:26:26Z

@MaximilianAlgehed Good question. Thinking it through more, I think we'd need a variant of Testable with two type arguments, ExtTestable prop disproof (existing Testable prop would be ExtTestable prop ()), have ExtProperty disproof with property :: prop -> ExtProperty disproof. Whenever the prop fails, it would include the disproof that it failed with. Then forAllShrinkExt would have type (Show a, ExtTestable prop disproof) => Gen a -> (a -> disproof -> [a]) -> (a -> prop) -> ExtProperty disproof, with the existing forAllShrink calling shrink with second argument (). QuickCheck itself would then just take the disproof produced by calling prop and feed it into the shrink function as extra info.

Note that your WithAnnotation solution is slightly sad even without IO in that it requires you to compute the annotation twice for every test case (once on input to prop and once on input to shrink), when it could be expensive (subject to Haskell noticing the duplicate pure call and optimising it away).

Happy to close this issue, because that sounds fairly disruptive, and approximate workarounds have been suggested.

PeterRugg · 2024-05-14T14:29:42Z

@nick8325 Indeed! That would get us somewhere. It's sad that it's happening twice, but maybe that's okay. In particular, in the worst cases for shrinking, you're trying lots of shrinks that remove the failing behaviour, then only a few that preserve it. It's only on these latter cases you'd have duplicate work, so it probably wouldn't affect overall shrinking time too much.

nick8325 · 2024-05-14T14:39:04Z

@PeterRugg True, maybe it's not so bad in this case! Still, it would be nice to find a way to only make it happen once.

Here is another approach by the way, considerably clunkier but more flexible as it lets you separate the generation from the annotation. The idea is to use an IORef to store the annotation and read it in the shrinker, then use whenFail' to update the IORef whenever a property fails:

prop_ioshrink :: Property
prop_ioshrink =
  idempotentIOProperty $ do
    ref <- newIORef undefined
    let shrinker x = do
          val <- readIORef ref
          return (shrinkWithAnnotation x val)

    return $ forAllShrinkIO arbitrary shrinker $ \x ->
      ioProperty $ do
        val <- annotate x
        return $
          whenFail' (writeIORef ref val) $
             ... property goes here ...

PeterRugg · 2024-05-14T14:48:33Z

@nick8325 Thanks! That might be the way forward, and using unsafeIO probably gives us the way to do that in practice right now...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use results from previous tests to inform shrinks #401

Use results from previous tests to inform shrinks #401

PeterRugg commented May 13, 2024

Rewbert commented May 13, 2024

Rewbert commented May 13, 2024

MaximilianAlgehed commented May 14, 2024 •

edited

Loading

PeterRugg commented May 14, 2024

MaximilianAlgehed commented May 14, 2024

nick8325 commented May 14, 2024

nick8325 commented May 14, 2024

PeterRugg commented May 14, 2024

PeterRugg commented May 14, 2024

nick8325 commented May 14, 2024

PeterRugg commented May 14, 2024

Use results from previous tests to inform shrinks #401

Use results from previous tests to inform shrinks #401

Comments

PeterRugg commented May 13, 2024

Rewbert commented May 13, 2024

Rewbert commented May 13, 2024

MaximilianAlgehed commented May 14, 2024 • edited Loading

PeterRugg commented May 14, 2024

MaximilianAlgehed commented May 14, 2024

nick8325 commented May 14, 2024

nick8325 commented May 14, 2024

PeterRugg commented May 14, 2024

PeterRugg commented May 14, 2024

nick8325 commented May 14, 2024

PeterRugg commented May 14, 2024

MaximilianAlgehed commented May 14, 2024 •

edited

Loading