PI and EI under Gaussian noise assumption

This repository contains Python code for Bayesian optimization with PI, EI, and modified versions of PI (MPI) and EI (MEI) under a Gaussian noise assumption on the loss function. The math is detailed in Modifications of PI and EI under Gaussian Noise Assumption in Current Optima. This repo has three files:

bo_acquis.py: code for Bayesian optimisation, PI and EI (modified from bayesian-optimization), plus new code for MPI and MEI.
plotters.py: plotting functions, adapted from bayesian-optimization, that plot the estimated loss surface and the acquisition values at each iteration.
PI_EI_MPI_MEI_Benchmark.ipynb: a tutorial that uses Bayesian optimisation with the 4 acquisition functions to find the global optima of noise-corrupted benchmark functions.

The signature of the optimization function is still:

bayesian_optimisation(n_iters, sample_loss, bounds, x0=None, n_pre_samples=5,
                      gp_params=None, random_search=False, alpha=1e-5, epsilon=1e-7)

Background

Probability of improvement (PI) and expected improvement (EI) are calculated with respect to the current optimum $\tilde{y}$. In some cases, evaluations of the loss function carry Gaussian noise, $y_i \sim \mathcal{N} (f(\mathbf{x}_i),\sigma^2_y)$. Here we modify PI and EI under the assumption that all observations, including the current optimum, are noisy. The modified acquisitions instead calculate the probability of improvement and the expected improvement with respect to the posterior mean $\mu(\tilde{\mathbf{x}})$ and variance $\kappa(\tilde{\mathbf{x}},\tilde{\mathbf{x}})$ at the loss optimum, where $\tilde{\mathbf{x}}$ is the parameter setting at the current optimum. To learn the Gaussian noise in the observations, we add a white kernel to the Matern kernel originally used by the GP. This enables uncertainty quantification at evaluated locations.
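As a rough illustration of the kernel change described above, the sketch below fits a GP with a Matern + white kernel using scikit-learn and reads off the posterior mean and variance at the current optimum. The toy data, the variable names (`X_train`, `x_opt`, `x_new`) and the GP settings are assumptions for illustration only and are not taken from bo_acquis.py.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

rng = np.random.default_rng(0)

# Toy 1-D data (hypothetical): a smooth function observed with Gaussian noise.
X_train = rng.uniform(-3.0, 3.0, size=(30, 1))
y_train = np.sin(X_train[:, 0]) + rng.normal(0.0, 0.3, size=30)

# Matern models the latent loss; WhiteKernel lets the GP learn the noise variance.
kernel = Matern(nu=2.5) + WhiteKernel(noise_level=1.0)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
gp.fit(X_train, y_train)

# Current optimum: the training input with the lowest observed (noisy) loss.
x_opt = X_train[np.argmin(y_train)]
x_new = np.array([0.5])  # an arbitrary candidate point

# Joint posterior over the candidate (row 0) and the current optimum (row 1).
mean, cov = gp.predict(np.vstack([x_new, x_opt]), return_cov=True)
mu_opt = mean[1]        # posterior mean at the current optimum
var_opt = cov[1, 1]     # posterior variance at the current optimum
cov_x_opt = cov[0, 1]   # posterior covariance between candidate and optimum

print("learned kernel:", gp.kernel_)
print("mu_opt, var_opt, cov_x_opt:", mu_opt, var_opt, cov_x_opt)
```

With the white kernel in place, the predictive variance at an already evaluated location stays above zero, which is what allows the optimum itself to be treated as uncertain.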

Let $\rho$ denote $\sqrt{\kappa (\mathbf{x}, \mathbf{x})+ \kappa (\tilde{\mathbf{x}}, \tilde{\mathbf{x}})-2 \kappa (\mathbf{x}, \tilde{\mathbf{x}})}$, the posterior standard deviation of the difference between the values at $\mathbf{x}$ and $\tilde{\mathbf{x}}$. The modified PI and EI under the Gaussian noise assumption are:

$$ \text{Modified PI: } a_{MPI}(\mathbf{x}) = \Phi \left(\frac{\mu(\tilde{\mathbf{x}}) - \mu ( \mathbf{x} ) }{\rho}\right) $$

$$ \text{Modified EI: } a_{MEI}(\mathbf{x}) = \Phi\left(\frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho}\right)\left(\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})\right)+ \phi\left(\frac{\mu(\tilde{\mathbf{x}}) - \mu(\mathbf{x})}{\rho}\right)\rho $$
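The two expressions above can be evaluated for a single candidate with a few lines of standard-library Python. This is a minimal sketch rather than the code in bo_acquis.py: the function name `modified_pi_ei`, its argument names, and the handling of the degenerate case $\rho = 0$ are assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def modified_pi_ei(mu_x, mu_opt, var_x, var_opt, cov_x_opt):
    """Return (MPI, MEI) for one candidate x when minimising the loss.

    mu_x, var_x     : posterior mean and variance at the candidate x
    mu_opt, var_opt : posterior mean and variance at the current optimum
    cov_x_opt       : posterior covariance between the two locations
    """
    delta = mu_opt - mu_x  # predicted improvement over the optimum
    rho_sq = var_x + var_opt - 2.0 * cov_x_opt
    if rho_sq <= 0.0:
        # Assumed convention: with no uncertainty in the difference,
        # the improvement is deterministic.
        return (1.0 if delta > 0.0 else 0.0), max(delta, 0.0)
    rho = math.sqrt(rho_sq)
    z = delta / rho
    mpi = normal_cdf(z)
    mei = delta * mpi + rho * normal_pdf(z)
    return mpi, mei


# A candidate predicted one unit below the current optimum.
print(modified_pi_ei(mu_x=-1.0, mu_opt=0.0, var_x=1.0, var_opt=1.0, cov_x_opt=0.0))
```

When the two posterior means are equal, MPI is exactly 0.5 and MEI reduces to $\rho\,\phi(0)$, so the ranking between such candidates is driven by $\rho$ alone.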

Current Experiment Result

We test Bayesian optimisation with the 4 acquisition functions on finding the global minima of benchmark functions. As a control group, PI and EI are tested under GP models with both the original Matern kernel and the Matern + white kernel.

With the white kernel, MPI gives better and more stable results than PI on most of the benchmark functions under preset Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$, and we believe its advantage grows as the noise increases.

Below is the lowest loss achieved on each benchmark function with added Gaussian noise $\mathcal{N}(\mu=0,\sigma = 10)$. The Bayesian optimisation settings are iter = 45 and random_search = 10000. Results are averaged over 30 repeated trials and reported as mean±std. All results are at Cloud Drive.

acquisition functions	six-hump	rastrigin	goldstein	rotated-hyper-ellipsoid	sphere
MPI, kernel=matern+white	-21.58±5.30	-10.20±6.10	10.77±5.85	-21.54±5.40	-15.28±6.77
MEI, kernel=matern+white	-20.34±4.04	-10.98±4.64	24.20±8.47	-18.11±4.97	-15.65±5.29
PI, kernel=matern+white	-14.96±5.34	-3.34±9.15	28.83±18.46	14.70±76.71	-12.20±5.68
EI, kernel=matern+white	-16.56±5.82	-4.60±8.27	23.60±7.19	-18.84±5.61	-14.68±5.36
PI, kernel=matern	-21.75±5.28	-6.39±6.85	13.16±6.09	-16.33±4.51	-15.14±5.83
EI, kernel=matern	-20.57±4.74	-8.29±7.45	22.9±47.27	-18.43±5.15	-13.52±6.05
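The noise-corrupted benchmark setup described above can be sketched as a wrapper that adds zero-mean Gaussian noise with $\sigma = 10$ to a noise-free function. The helper name `make_noisy_loss`, the Rastrigin implementation and the 2-D input are assumptions for illustration; the notebook may construct its noisy losses differently.

```python
import math
import random


def rastrigin(x):
    """Noise-free Rastrigin function; global minimum 0 at the origin."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def make_noisy_loss(f, sigma=10.0, seed=None):
    """Wrap f so that every evaluation adds independent N(0, sigma^2) noise."""
    rng = random.Random(seed)

    def sample_loss(x):
        return f(x) + rng.gauss(0.0, sigma)

    return sample_loss


noisy_rastrigin = make_noisy_loss(rastrigin, sigma=10.0, seed=0)

# Repeated evaluations at the true optimum scatter around the noise-free value 0.
print([round(noisy_rastrigin([0.0, 0.0]), 2) for _ in range(5)])
```

Because each observation is noisy, a single evaluation can fall well below the noise-free minimum of the function.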

We run Bayesian optimisation on the Rastrigin function with PI (kernel = matern) and MPI (kernel = matern+white), plotting the probability of improvement and the loss surface at each iteration. Here MPI behaves more like PI on a noiseless loss surface, focusing its exploitation on one point, whereas PI is disturbed by the noise and loses its focus.

Figure panels: Rastrigin surface, PI searching trajectory, MPI searching trajectory.
