Integrating over something else than priors #1276

ArnoVel · 2019-12-09T20:30:03Z

ArnoVel
Dec 9, 2019

Hi,

I'm currently trying to modify some code related to Heteroskedastic GPs, and want to add some MCMC component to it, to essentially circumvent some instability in the convergence process.
The dataset is quite small (1d, few thousand points at most), so performance is not a huge issue, although I want to have a somewhat efficient code.

The problem is as follows:

First estimate the noise through some homoskedastic GP
Fit a gp to the log of the noise levels. This is ideally the distribution I want to sample from
Sample N points from the gp posterior for the noise at each location x, this results in a N * n_batch * 1 sample.
forward this to my heteroskedastic model who accepts a diagonal variance filled by the noise levels from my gp posterior
Obtain a joint distribution P( Y | Z, X) * P(Z|X) that I would like to integrate w.r.t Z by sampling.

However, I am relatively new with this, and I can't quite yet grasp what the gpytorch.distributions.MultivariateNormal allows me to do.

I know I can get log_prob which doesn't quite help me, unless some trick, or that pyro could naturally help me if I was currently sampling from a prior over some hyperparameter.

One solution would be to simply set this noise as a prior and let pyro do its work, but not sure how to do this yet, as it would depend conditionally on the inputs.

Anyway, to give an idea of what I'm currenly doing, you can find some sample code below!

Sample code

Here is where essentially I'd like to introduce sampling and my integration:

class myHeteroGP(SingleTaskGP):
    def __init__(self, train_x, train_y, train_y_var, internal_samples=200):
        # model given the noise.
        mean_module = gpytorch.means.ConstantMean()
        covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
        noise_gp_covar = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())


        validate_input_scaling(train_X=train_x, train_Y=train_x, train_Yvar=train_y_var)
        #self._validate_tensor_args(X=train_x, Y=train_y, Yvar=train_y_var)
        #self._set_dimensions(train_X=train_x, train_Y=train_y)

        noise_likelihood = GaussianLikelihood(
            noise_prior=SmoothedBoxPrior(-3, 5, 0.5, transform=torch.log),
            #batch_shape=self._aug_batch_shape,
            noise_constraint=GreaterThan(
                MIN_INFERRED_NOISE_LEVEL, transform=None, initial_value=1.0
            ))

        noise_model = SingleTaskGP(
            train_X=train_x,
            train_Y=(train_y_var+1e-06).log(),
            likelihood=noise_likelihood,
            covar_module=noise_gp_covar
            )

        likelihood =  _GaussianLikelihoodBase(HeteroskedasticNoise(noise_model))
        super().__init__(train_X=train_x,
                         train_Y=train_y,
                         likelihood=likelihood,
                         covar_module=covar_module)


        self.to(train_x)

        self.itg_sampler = SobolQMCNormalSampler(internal_samples)
        #print(self.likelihood, self.model)

        self.noise_likelihood = noise_likelihood
        self.noise_model = noise_model


    def forward(self,x):
        # fit noise gp
        noise_mll = gpytorch.mlls.ExactMarginalLogLikelihood(self.noise_likelihood,
                                                    self.noise_model)
        print('fitting noise gp on log(y_var) as function of x')
        botorch.fit.fit_gpytorch_model(noise_mll)

        z_post = self.noise_model.posterior(x)
        z_samples = self.itg_sampler(z_post)
        #samples are `n_samples` x ` n_batch` x `n_dims`
        print(z_samples.shape)

        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        f_samples = MultivariateNormal(mean_x, covar_x)

        res = self.likelihood(f_samples, noise=z_samples.exp())
        # have some way to integrate over the z variable using the samples,
        # while still keeping the variables attached to the graph
        # to allow a fit of the noise hypermaraters

I essentially just want to know if this idea of incorporating some sampling into the main forward loop is a good idea, or whether I should just resort to different solutions altogether !

Answered by Balandat

Dec 17, 2019

Hey, sorry for the delayed response here.

I essentially just want to know if this idea of incorporating some sampling into the main forward loop is a good idea, or whether I should just resort to different solutions altogether !

I don't see any issues with performing sampling in the forward call per se, what I'm worried about is that your use fit_gpytorch_model - that may cause some issues. Since the noise model is in fact part of the model it doesn't seem right to have this in the forward pass (as that is itself called during model fitting). You could of course warm-start things and fit the noise model separately before the full model, and then start from the fitted noise model.

Also, …

View full answer

Balandat · 2019-12-17T04:00:09Z

Balandat
Dec 17, 2019
Collaborator

Hey, sorry for the delayed response here.

I essentially just want to know if this idea of incorporating some sampling into the main forward loop is a good idea, or whether I should just resort to different solutions altogether !

I don't see any issues with performing sampling in the forward call per se, what I'm worried about is that your use fit_gpytorch_model - that may cause some issues. Since the noise model is in fact part of the model it doesn't seem right to have this in the forward pass (as that is itself called during model fitting). You could of course warm-start things and fit the noise model separately before the full model, and then start from the fitted noise model.

Also, as these are exact GPs that you're dealing with here you'll want to make sure that the result returned by the forward pass is a MVN object (this is what the model api requires). The line res = self.likelihood(f_samples, noise=z_samples.exp()) will most likely not work properly - you'd want to evaluate the likelihood for each sample from Z individually and then marginalize the results in some fashion into a single MVN.

0 replies

jacobrgardner · 2019-12-17T04:50:17Z

jacobrgardner
Dec 17, 2019

@ArnoVel Just to add to what Max said, you can accomplish added noise pretty cleanly by just adding it to the diagonal of covar_x.

As Max pointed out, you probably don't need to train the noise GP multiple times: it takes train_x and train_y_var as input, and so doesn't depend on the actual function model in any way. Then, after passing the trained model to your function model, you can do something like the following in forward:

def forward(self, x):
    mean_x = self.mean_module(x)
    covar_x = self.covar_module(x)
    raw_diag_noise = self.noise_model(x).rsample()  # sample from already trained noise model
    diag_noise = raw_diag_noise.exp() # noise values need to be positive -- assume model was fit to e.g. log noises.
    covar_x = covar_x.add_diag(diag_noise)  # Add predicted noise as diagonal to kernel

    return MultivariateNormal(mean_x, covar_x)

If you'd like to avoid learning additional noise via the GaussianLikelihood for the functional model, you can register an Interval(1e-5, 1e-4) constraint to the noise to clamp it to something trivial / small.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrating over something else than priors #1276

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 2 comments

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Select a reply

Integrating over something else than priors #1276

ArnoVel Dec 9, 2019

Sample code

Replies: 2 comments

Balandat Dec 17, 2019 Collaborator

jacobrgardner Dec 17, 2019

ArnoVel
Dec 9, 2019

Balandat
Dec 17, 2019
Collaborator

jacobrgardner
Dec 17, 2019