Fishtest Mathematics

Fishtest uses the pentanomial model for engine matches. This model is more accurate than the traditional trinomial model (win, draw, loss) and leads to a substantial saving of resources.
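The gain from the pentanomial model can be illustrated with a small sketch. It assumes that games are played in pairs from the same opening with colors reversed, and it compares the variance of the mean score when games are treated as independent (trinomial) with the variance when each pair counts as one observation (pentanomial). The function names and the sample data are hypothetical and are not taken from the Fishtest code.

```python
def trinomial_variance_of_mean(pairs):
    """Treat every game as an independent observation (win=1, draw=0.5, loss=0)."""
    games = [g for pair in pairs for g in pair]
    n = len(games)
    mu = sum(games) / n
    var = sum((g - mu) ** 2 for g in games) / n
    return mu, var / n  # mean score, variance of that mean


def pentanomial_variance_of_mean(pairs):
    """Treat every game pair as one observation with score in {0, 0.25, ..., 1}."""
    scores = [(a + b) / 2 for a, b in pairs]
    n = len(scores)
    mu = sum(scores) / n
    var = sum((s - mu) ** 2 for s in scores) / n
    return mu, var / n  # mean score, variance of that mean


# Hypothetical sample: biased openings produce many win+loss pairs, so the two
# results inside a pair are negatively correlated.
example_pairs = (
    [(1, 0)] * 30 + [(0.5, 0.5)] * 50 + [(1, 0.5)] * 12 + [(0.5, 0)] * 8
)

mu_tri, var_tri = trinomial_variance_of_mean(example_pairs)
mu_penta, var_penta = pentanomial_variance_of_mean(example_pairs)
print(f"mean score: {mu_penta:.4f}")
print(f"variance of mean, trinomial:   {var_tri:.6f}")
print(f"variance of mean, pentanomial: {var_penta:.6f}")
```

Both models give the same mean score. On data like this the trinomial model overstates the uncertainty, because it ignores the correlation inside each pair.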
For sequential testing Fishtest uses the GSPRT, which is a generalization of the SPRT in which unknown parameters are replaced by their maximum likelihood estimates under H0 and H1. It is shown in loc. cit. that the GSPRT asymptotically behaves like the SPRT when the error probabilities go to zero. However, the asymptotic guarantees of the GSPRT do not depend on Elo differences being small.
Fishtest expresses the bounds for the GSPRT tests in terms of "normalized Elo". This has the advantage that the expected duration of a test depends only on the bounds and not on the draw ratio or the opening book. See this document ^[archive].
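As a rough illustration, the sketch below computes a normalized Elo value from pentanomial counts. It assumes that normalized Elo is the score advantage over 0.5 divided by the per-game standard deviation, scaled by 800/ln(10), and that the per-game variance is twice the per-pair variance. Both the definition and the function name are assumptions made for this example; the linked document is the authoritative source.

```python
import math


def normalized_elo(pentanomial_counts):
    """Normalized Elo from counts of pair scores 0, 0.5, 1, 1.5, 2 (in that order).

    Assumed definition: (mu - 0.5) / sigma_game * 800 / ln(10).
    """
    if len(pentanomial_counts) != 5:
        raise ValueError("expected five counts")
    n = sum(pentanomial_counts)
    if n == 0:
        raise ValueError("no game pairs")
    # Pair scores rescaled to [0, 1] so that mu is the ordinary per-game score.
    scores = [i / 4 for i in range(5)]
    mu = sum(c * s for c, s in zip(pentanomial_counts, scores)) / n
    var_pair = sum(c * (s - mu) ** 2 for c, s in zip(pentanomial_counts, scores)) / n
    if var_pair == 0:
        raise ValueError("variance is zero; normalized Elo is undefined")
    # Assumption: per-game variance is twice the variance of the pair average.
    sigma_game = math.sqrt(2 * var_pair)
    return (mu - 0.5) / sigma_game * 800 / math.log(10)


# Hypothetical counts, slightly favourable to the engine under test.
print(normalized_elo([0, 8, 80, 12, 0]))
```

Because the score advantage is divided by the standard deviation, a book or time control that produces more draws reduces both by a comparable factor, which is why the value is less sensitive to the draw ratio than ordinary Elo.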
For parameter tuning Fishtest uses the SPSA algorithm.
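For readers unfamiliar with SPSA, here is a minimal generic sketch of the algorithm, not Fishtest's implementation. Each iteration perturbs all parameters at once with a random ±1 vector, evaluates the objective at the two perturbed points, and uses the difference to estimate the gradient. The toy objective, the gain constants, and all names are assumptions chosen for illustration. In Fishtest the objective is the noisy outcome of engine games rather than a smooth function.

```python
import random


def spsa_minimize(objective, theta, iterations=2000, a=0.2, c=0.1,
                  A=10.0, alpha=0.602, gamma=0.101, seed=0):
    """Minimize `objective` with simultaneous perturbation stochastic approximation."""
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Rademacher perturbation: every component is -1 or +1.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = objective(plus) - objective(minus)
        # Two evaluations give a gradient estimate for all components at once.
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta


def toy_objective(x):
    """Hypothetical smooth objective with its minimum at (3, -2)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 2.0) ** 2


estimate = spsa_minimize(toy_objective, [0.0, 0.0])
print(estimate)
```

The appeal of SPSA for tuning is that the cost per iteration is two evaluations regardless of how many parameters are tuned.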
To calculate Elo estimates from the results of GSPRT tests, Fishtest uses formula (6.1) in this document ^[archive]. To model a GSPRT as a random walk, Fishtest uses formula (2.1) of this document ^[archive], which essentially shows that a GSPRT may be approximated by an SPRT for the drift of a Brownian motion whose infinitesimal variance is estimated from the sample.
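The Brownian-motion approximation can be sketched as follows. The code uses the standard log-likelihood ratio for the drift of a normal random walk, with the variance replaced by its sample estimate, together with Wald's stopping bounds. Whether this matches formula (2.1) term for term is an assumption, and the names `brownian_llr` and `sprt_bounds` are hypothetical. Here `s0` and `s1` are the expected scores under H0 and H1.

```python
import math


def brownian_llr(pair_scores, s0, s1):
    """Approximate LLR of H1 (score s1) against H0 (score s0).

    `pair_scores` holds one score in [0, 1] per game pair.
    """
    n = len(pair_scores)
    if n == 0:
        raise ValueError("no observations")
    mu = sum(pair_scores) / n
    var = sum((s - mu) ** 2 for s in pair_scores) / n  # estimated from the sample
    if var == 0:
        raise ValueError("sample variance is zero")
    # Normal log-likelihood ratio with known variance, evaluated at the estimate.
    return n * (s1 - s0) * (2 * mu - s0 - s1) / (2 * var)


def sprt_bounds(alpha, beta):
    """Wald's lower and upper stopping bounds for error rates alpha and beta."""
    return math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)


# Hypothetical sample of 100 game pairs with mean score 0.51.
sample = [0.5] * 80 + [0.75] * 12 + [0.25] * 8
lower, upper = sprt_bounds(0.05, 0.05)
llr = brownian_llr(sample, s0=0.50, s1=0.505)
print(lower, llr, upper)
```

The test continues while the LLR stays strictly between the two bounds, accepts H1 when it reaches the upper bound, and accepts H0 when it reaches the lower bound.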
The SPRT calculator and the resource consumption tables are based on the formulas in this document ^[archive].
Fishtest uses a χ²-test to detect anomalous workers.
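The text does not say how the χ²-test is applied, so the following is only a generic sketch of a Pearson χ² test of homogeneity. It checks whether all workers produce results from the same outcome distribution and reports each worker's contribution to the statistic. The table layout, the worker names, and the function name are hypothetical.

```python
def chi2_homogeneity(table):
    """Pearson chi-square statistic for a worker-by-outcome table of counts.

    `table` maps a worker name to a list of counts, e.g. [losses, draws, wins].
    Returns (statistic, degrees_of_freedom, per_worker_contribution).
    """
    workers = list(table)
    ncols = len(table[workers[0]])
    col_totals = [sum(table[w][j] for w in workers) for j in range(ncols)]
    grand_total = sum(col_totals)
    contributions = {}
    for w in workers:
        row_total = sum(table[w])
        contribution = 0.0
        for j in range(ncols):
            # Expected count if this worker followed the pooled distribution.
            expected = row_total * col_totals[j] / grand_total
            if expected > 0:
                contribution += (table[w][j] - expected) ** 2 / expected
        contributions[w] = contribution
    statistic = sum(contributions.values())
    # Assumes every row and column has a non-zero total.
    dof = (len(workers) - 1) * (ncols - 1)
    return statistic, dof, contributions


# Hypothetical data: worker "c" loses far more often than the others.
example_table = {"a": [30, 40, 30], "b": [30, 40, 30], "c": [60, 30, 10]}
stat, dof, contrib = chi2_homogeneity(example_table)
print(stat, dof, max(contrib, key=contrib.get))
```

A statistic that is large relative to the χ² distribution with `dof` degrees of freedom suggests that at least one worker behaves differently, and the per-worker contributions point to the likely culprit.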

Fishtest calculates some interesting meta information about tests.

Through the accounting identity ^[archive], the difference between the pentanomial and trinomial models yields an estimate of the root mean square (RMS) bias of the openings in the book.

Many formulas used in Fishtest were validated by simulation; see simul and pentanomial.
