Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Numeric literals sampling #1389

Draft
wants to merge 87 commits into
base: master
Choose a base branch
from
Draft

Numeric literals sampling #1389

wants to merge 87 commits into from

Conversation

rybla
Copy link
Collaborator

@rybla rybla commented Aug 3, 2022

The current problem with :checking functions that are polymorphic in numeric types

When checking a polymorphic function, Cryptol chooses default instantiations of the parametrized type variables that satisfy the function's type constraints. There is a system of rules for choosing default non-numeric type instantiations according to the set of type classes that type variable is constrained by. For numeric type variables, a default instance is found by asking the SMT to try and solve the numeric constraints and then give back a single solving substitution.

For testing functions that are polymorphic in numeric types, this method of default instantiation is
unsatisfactory, because the SMT solver only provides a single solution, and that solution is usually trivial (i.e. the smallest literals that satisfy the constraints). For example:

module Test where

map_add1 : {n} (fin n) => [n][8] -> [n][8]
map_add1 xs = [ x + 1 | x <- xs ]

property clearly_unsatisfied xs = xs == map_add1 xs 
Cryptol> :l Test
Test> :check clearly_unsatisfied
Showing a specific instance of polymorphic result:
  * Using '0' for signature variable 'n'
Using exhaustive testing.
Passed 1 tests.
Q.E.D.

Of course, clearly_unsatisfied is clearly unsatisfied, since adding 1 to all the elements of a sequence is not the identity. But, since the SMT solver found n = 0 to be a solution that satisfies fin n, it just tests that and finds that [] = [] indeed.

New feature: sampling over solution substitutions of the numeric types

This PR implements a new feature that, when turned on, replaces defaulting when getting a concrete type for a function during :checking. It can be turned on via

Cryptol> :set literalSampling = on
Cryptol> :set literalSamplingBin = X

where X is the number of tests to run for each sampled substitution of literals (the default is 10).

The literal sampler algorithm takes into account the constraints over numeric literals in the type signature of the function being :checked, and tries to sample with a certain distribution over the solutions. Only a certain domain of constraints are currently supported. Let a, b, c, ... be numeric type variables and x, y, z, ... be constant numeric literals. Then the following subset of the cryptol type constraint syntax is allowed to mention numeric type variables in a function in order to be accepted by the literal sampling algorithm:

  • fin _
  • _ == _, _ <= _, _ >= _
  • _ + _, _ - _
  • a * x, a % x, a / x
  • x * y, x / y, x % y, x ^^ y

If a variable is only bounded above by inf, then it is sampled along an exponential distribution up to a maximum value, where higher values are less likely to be sampled. If a variable is bounded above by a constant, then it is sampled on a uniform distribution up to that upper bounded. The order of variable sampling should not affect the distribution (I'm still working out how true this is).

Once a solution substitution is sampled, the instantiation of the function type with that substitution is :checked as if it was a normal top-level function, except for only literal-sampling-bin-size number of tests. Then another solution is sampled and :checked, etc, until the the total number of tests performed is at least tests.

TODO

Things @Riib11 didn't finish before his internship ended.

  • setup tests to be run by test-runner
  • make literal-sampling also run when you use :check without any arguments

@rybla rybla added the language Changes or extensions to the language label Aug 3, 2022
@rybla
Copy link
Collaborator Author

rybla commented Aug 31, 2022

Mostly code style stuff here.

I haven't fully understood what the sampling algorithm is doing, but the steps looks well-documented. I'd definitely like to better understand what's going on with the Q' and Z' types.

Here's a high-level description of the entire sampling algorithm:

High-level description of numeric literal sampling algorithm:

@rybla rybla self-assigned this Sep 2, 2022
@rybla rybla requested a review from yav September 2, 2022 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language Changes or extensions to the language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants