Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use results from previous tests to inform shrinks #401

Open
PeterRugg opened this issue May 13, 2024 · 11 comments
Open

Use results from previous tests to inform shrinks #401

PeterRugg opened this issue May 13, 2024 · 11 comments

Comments

@PeterRugg
Copy link

This is a somewhat esoteric feature: it might just be us that want it, but thought I'd ask because it seems like it might be simple to add.

We're using quickcheck to compare hardware implementations (https://github.com/CTSRD-CHERI/QuickCheckVEngine/), and would like to use the outputs from runs to decide how to shrink.

To be more specific: we generate a test, which is a sequence of instructions, we call out to the hardware implementations that annotate the sequences of instructions with the results from running the instructions, then we check that the annotations match.

It would be cool if we could input the annotated traces into the shrink function, rather than the unannotated ones. I believe this could be achieved by providing a slightly different API that splits up the annotation of a test from evaluating whether it was a pass or fail, i.e. a version of forAllShrink with the type changed from:

forAllShrink :: (Show a, Testable prop) => Gen a -> (a -> [a]) -> (a -> prop) -> Property

to

forAllShrink :: (Show a, Testable prop) => Gen a -> (a -> b) -> (b -> [a]) -> (b -> prop) -> Property

I'd be interested to hear if you have any thoughts on how tricky this would be (possibly complicated by b being an IO) and whether we've missed a way to do it using the existing API.

Thanks for maintaining a great library!

@Rewbert
Copy link
Collaborator

Rewbert commented May 13, 2024

Hi Peter!

I believe you can hack this together using existing features, even if it might seem a bit contrived :) It seems that you would want to decouple generation and shrinking. This is something that I've done myself a few times, and I know that my advisor has also run into this several times. I am not sure what a good combinator for this would look like, even if I know that there is room for one.

The trick is to install a monitor! Here is my pseudocode trying to illustrate what I envision.

instance Arbitrary AnnotatedTrace where
  gen = error "we don't need this, I think?"

  shrink trace = -- this is a good shrinker, working on annotated traces!

prop_good_shrink :: AnnotatedTrace -> Property
prop_good_shrink trace =  -- check the property on this trace. Presumably this logic is somewhere in your code
                          -- now your good shrinker can take over

-- when you run this property, remove the shrinker for the unannotated trace
-- e.g. call forAll instead of forAllShrink
prop = forAll trace $ \unannotated_trace -> do
  ...
  ...
  annotated <- some_call unannotated_trace -- your hardware is annotating the trace here
  ...
  monitor $ whenFail $ do
    -- in here you can write code that is run once after shrinking,
    -- but since we called forAll, we will call this immediately when the bug is found.
    --
    -- Here, call QC recursively! Invoke another property that operates on annotated traces,
    -- and just fails immediately and shrinks it
    quickCheck $ withMaxSuccess 1 $ prop_good_shrink annotated
  --
  assert $ do_your_test_here -- monitor has to be installed before the assertion

@Rewbert
Copy link
Collaborator

Rewbert commented May 13, 2024

I can expand on this answer tomorrow if needed :)

@MaximilianAlgehed
Copy link
Collaborator

MaximilianAlgehed commented May 14, 2024

Another way of doing this user-side that doesn't add any maintainance burden to QuickCheck is to introduce a type data WithAnnotation a b = WithAnnotation { val :: a, ann :: b }. Here you could do something like this (I haven't type checked, but I think the idea is roughly right):

data WithAnnotation a b = WithAnnotation { val :: a, ann : b }

mkAnn :: (a -> b) -> a -> WithAnnotation a b
mkAnn ann a = WithAnnotation a (ann a)

forAllShrink' :: ... => Gen a -> (a -> b) -> (b -> [a]) -> (b -> prop) -> Property
forAllShrink' gen annotate shr prop =
  forAllShrink (mkAnn annotate <$> gen)
                      (map (mkAnn annotate) . shr . ann)
                      (prop . ann)

But the thing I'm more curious about now is what the point of ever materializing the [a] is? You could just do this all in the annotated b world.

@PeterRugg
Copy link
Author

Thank you both for the quick responses!

I think what @Rewbert suggested could be the most pragmatic way forwards! The sad thing is that shrinking will likely change the behaviour of the test, and so make the annotations outdated once one or two shrinks have happened, but it's possible this isn't too important in practice.

I think @MaximilianAlgehed's suggestion would work, except that we need to do IO to get the annotations (calling out over TCP sockets to simulated hardware).

But the thing I'm more curious about now is what the point of ever materializing the [a] is? You could just do this all in the annotated b world.

This is because the annotations for a given instruction change based on what occurred before them in the test, e.g. [set register 1 to 7, add 3 to register 1, add 4 to register 1] would get annotated as [set register 1 to 7 (7), add 3 to register 1 (10), add 4 to register 1 (14)] but then would shrink (e.g. deleting the middle instruction) to [set register 1 to 7 (7), add 4 to register 1 (11)]: we need to rerun the implementation (which QuickCheck has done for us anyway!) to refresh the annotation.

@MaximilianAlgehed
Copy link
Collaborator

@PeterRugg What's the type for your modified forAllShrink that you actually want - given the need for IO?

@nick8325
Copy link
Owner

It seems like what's missing right now is the ability for shrinking functions to do I/O. That is, something like:

forAllShrinkIO :: (Show a, Testable prop) => Gen a -> (a -> IO [a]) -> (a -> prop) -> Property
forAllShrinkIO gen shr prop = forAllShrink gen (unsafePerformIO . shr) prop

(Maybe even with Gen (IO a)?)

Of course that function shouldn't be using unsafePerformIO, but I implemented it that way just to experiment. And it seems like it supports this use case!

Here is a forAllShrink' adapted to do I/O:

instance Show a => Show (WithAnnotation a b) where show = show . val

forAllShrink' :: forall a b prop. (Show a, Testable prop) => Gen a -> (a -> IO b) -> (a -> b -> [a]) -> (a -> b -> prop) -> Property
forAllShrink' gen annotate shr prop =
  forAllShrinkIO gen' shr' prop'
  where
    gen' :: Gen (WithAnnotation a (IO b))
    gen' = mkAnn annotate <$> gen

    shr' :: WithAnnotation a (IO b) -> IO [WithAnnotation a (IO b)]
    shr' (WithAnnotation val annIO) = do
      ann <- annIO
      return $ map (mkAnn annotate) $ shr val ann

    prop' :: WithAnnotation a (IO b) -> Property
    prop' (WithAnnotation val annIO) =
      idempotentIOProperty $ do
        ann <- annIO
        return $ prop val ann

and here is how one can use it:

type TestCase = ...
type Annotation = ...
annotate :: TestCase -> IO Annotation
shrinkWithAnnotation :: TestCase -> Annotation -> [TestCase]

prop_ioshrink :: Property
prop_ioshrink =
  forAllShrink' arbitrary annotate shrinkWithAnnotation $ \testCase annotation ->
    ...

@nick8325
Copy link
Owner

Oh, this is not quite right... The I/O will be executed twice, once in prop' and once in shr'...

@PeterRugg
Copy link
Author

@MaximilianAlgehed Good question. Thinking it through more, I think we'd need a variant of Testable with two type arguments, ExtTestable prop disproof (existing Testable prop would be ExtTestable prop ()), have ExtProperty disproof with property :: prop -> ExtProperty disproof. Whenever the prop fails, it would include the disproof that it failed with. Then forAllShrinkExt would have type (Show a, ExtTestable prop disproof) => Gen a -> (a -> disproof -> [a]) -> (a -> prop) -> ExtProperty disproof, with the existing forAllShrink calling shrink with second argument (). QuickCheck itself would then just take the disproof produced by calling prop and feed it into the shrink function as extra info.

Note that your WithAnnotation solution is slightly sad even without IO in that it requires you to compute the annotation twice for every test case (once on input to prop and once on input to shrink), when it could be expensive (subject to Haskell noticing the duplicate pure call and optimising it away).

Happy to close this issue, because that sounds fairly disruptive, and approximate workarounds have been suggested.

@PeterRugg
Copy link
Author

@nick8325 Indeed! That would get us somewhere. It's sad that it's happening twice, but maybe that's okay. In particular, in the worst cases for shrinking, you're trying lots of shrinks that remove the failing behaviour, then only a few that preserve it. It's only on these latter cases you'd have duplicate work, so it probably wouldn't affect overall shrinking time too much.

@nick8325
Copy link
Owner

@PeterRugg True, maybe it's not so bad in this case! Still, it would be nice to find a way to only make it happen once.

Here is another approach by the way, considerably clunkier but more flexible as it lets you separate the generation from the annotation. The idea is to use an IORef to store the annotation and read it in the shrinker, then use whenFail' to update the IORef whenever a property fails:

prop_ioshrink :: Property
prop_ioshrink =
  idempotentIOProperty $ do
    ref <- newIORef undefined
    let shrinker x = do
          val <- readIORef ref
          return (shrinkWithAnnotation x val)

    return $ forAllShrinkIO arbitrary shrinker $ \x ->
      ioProperty $ do
        val <- annotate x
        return $
          whenFail' (writeIORef ref val) $
             ... property goes here ...

@PeterRugg
Copy link
Author

@nick8325 Thanks! That might be the way forward, and using unsafeIO probably gives us the way to do that in practice right now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants