Fixed posterior for NormalWishart. #28

kaselby · 2018-04-17T22:01:28Z

The posterior calculations for most of this package are based on the paper Conjugate Bayesian analysis of the Gaussian distribution by Kevin Murphy. This paper has several typos related to the Normal-Wishart distribution. The ConjugatePriors package fixes the typos related to the pdf of the Normal-Wishart (though I would note that right now the pdf function will not run without errors for this distribution, as it relies on a function called IsApproxSymmetric that does not appear to exist - but that is a separate issue), but it does not fix the typos in the posterior. The paper cites the posterior update for the scale matrix as being:
T_n = T_0 + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)'
The correct update is
inv(T_n) = inv(T_0) + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)'
See https://stats.stackexchange.com/questions/153241/derivation-of-normal-wishart-posterior/324925#324925 for details.

kaselby · 2018-04-18T14:36:58Z

Just a comment that I have tested this with a MCMC sampler I am using. When I use the existing posterior update, the sampler does not converge, and gives nonsensical results. When I use the update I added here, it converges quickly to the expected answer.

mschauer · 2018-05-02T15:59:52Z

I checked with https://en.wikipedia.org/wiki/Conjugate_prior#Continuous_distributions, so I agree that this is correct. Is there a way that we could interchange last Cholesky und inversion and return a Wishart which is parametrized differently?

kaselby · 2018-05-02T20:54:19Z

After looking at this further, it seems like the T that is intended to be stored in the Normal-Wishart object here is the inverse scale matrix for the Wishart, rather than the scale matrix itself (at least judging by the implementation of the pdf and posterior functions). In this case, the existing posterior update and pdf calculation for the scale matrix should be retained, and they are correct as is. However, T should be inverted before passing it to the Wishart in the rand function. (Also, this should probably be better documented, as it is not the standard parametrization of the Normal-Wishart)

mschauer · 2018-05-02T21:02:25Z

Great! This is the original implementation: https://github.com/JuliaStats/Distributions.jl/pull/140/files Maybe @nfoti remembers

kaselby · 2018-05-02T21:08:10Z

I believe I updated the PR to reflect the inverse scale matrix parametrization. It would be good to have someone look it over though to make sure I didn't miss anything. I believe what I have now is the same as the original formulation before my pull request, except for a change to the rand function.

kaselby · 2018-05-05T15:16:12Z

After checking again, the pdf function actually did need to be changed for this parametrization. I also fixed a syntax error in logpdf, and commented out a line which called the "isapproxsymmetric" function, which is not a function that appears to exist as far as I can tell.

kaselby · 2018-05-05T16:15:10Z

Apologies for making so many commits. I have now gone through and thoroughly tested this and am confident it is now right.

mschauer · 2018-05-05T17:31:28Z

Don't worry. Can we cook up a test which shows that this is right and the version on master is wrong?

kaselby · 2018-05-05T18:40:43Z

I am working on writing something up, I will hopefully post it later today.

mschauer · 2018-05-06T14:10:15Z

One suggestion would be to use in test that the one-dimensional case falls back to the InverseGamma/Gamma distribution, so they should give corresponding results.

kleinschmidt

It would be great to add some tests for this (especially since it seems like there are bugs that could have been caught with tests before...)

kleinschmidt · 2018-05-21T21:16:06Z

src/normalwishart.jl

@@ -62,27 +60,28 @@ function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
        hp = 0.5 * p

        # Normalization
-        logp = hp*(log(kappa) - Float64(log2π))
+        logp = hp*(log(kappa) - Float64(log(2*π)))


why get rid of log2pi? It's a StatsFuns constant.

My bad, I think it was giving me errors when I tried to run it. Maybe I just wasn't importing the required package or something.

Yeah I think it's necessary to add StatsFuns (which also provides logmvgamma which is needed elsewhere).

logmvgamma is from Distributions. maybe the old lpgamma is a version from StatsFuns?

I think Distributions gets it from StatsFuns: https://github.com/JuliaStats/StatsFuns.jl/blob/e66dd973650c375bc1739c820e5b96bb5bd000a8/src/misc.jl#L3-L15

kleinschmidt · 2018-05-21T21:21:17Z

src/normalwishart.jl

        logp -= hnu * logdet(Tchol)
        logp -= hnu * p * log(2.)
-        logp -= lpgamma(p, hnu)
+        logp -= logmvgamma(p, hnu)


I'm a little disturbed that this hadn't been fixed until now......

There were other areas too where the current logpdf function crashes. I think somewhere in insupport there is a call to a function that doesn't exist. I didn't fix that in this pull request because it was a separate issue, and I forgot to make a new issue for it.

kleinschmidt · 2018-05-21T21:22:15Z

src/normalwishart.jl

@@ -52,8 +52,6 @@ function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
    if !insupport(NormalWishart, x, Lam)
        return -Inf
    else
-        p = length(x)


did you mean to remove this? seems like it's needed below...

alternatively, you could use nw.dim

I think this was an accident

kleinschmidt · 2018-05-21T21:23:02Z

src/normalwishart.jl

-    Lam = rand(Wishart(nw.nu, nw.Tchol))
-    Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa))
-    mu = rand(MvNormal(nw.mu, Lsym))
+    V = PDMat(Symmetric(inv(nw.Tchol)))


is it necessary to enforce that these are Symmetric?

Yes. Julia's matrix inversion can lead to some small inaccuracies (of order 1e-16 or 1e-18 or so) and thus inverse matrices that SHOULD be symmetric will be slightly asymmetric. This shouldn't be a problem, but the distributions we are passing these to require symmetric matrices, and enforce a check for symmetricity. Julia's base is_hermitian function does not have a relative tolerance, and thus without some way of enforcing that the inverted matrix is symmetric, this will crash.

We discussed this somewhere, and this was the conclusion, but I cannot find it anymore.

I suppose this might have something to do with the missing isApproxSymmetric function...

The problem is that the symmetry checks are not in the ConjugatePriors package, they are in various places in Distributions. So even if we make our symmetry checks sensitive to a relative tolerance, the checks there won't be. To be honest I have no idea why Julia won't just add a relative tolerance to its is_hermitian function given that the symmetry errors are coming from Julia's own matrix inversion routines, but I was told that that isn't going to happen...

kleinschmidt · 2018-05-21T21:25:06Z

Thanks for looking at this! I agree that this parametrization is very weird. Might also be worth adding a docstring so that' the parametrization is documented in the help file at least.

kaselby · 2018-05-21T23:37:15Z

To be honest I am still not entirely sure how best to approach the problem of how to parametrize this - but no matter how we decide it is best to do it, the current implementation needs to be fixed. I apologize for not getting back to you on the testing issue, I have been very busy recently so haven't had time to look at it.

kleinschmidt · 2018-05-30T16:43:52Z

I agree that the parametrization can be fixed later, but tests are critical before we merge. Even something as simple as calling the functions on a few values would have caught many of these bugs... Thanks for your work on this!

kleinschmidt · 2018-05-30T16:46:01Z

Also, i'm looking the Normal-Inverse-Wishart and seeing a lot of the same problems (with the old function names, checking for symmetry, etc.). Would you be willing to fix those as well? If not let's merge this after adding some minimal tests (at a minimum just to make sure this code can be executed without errors)

kleinschmidt · 2018-05-30T16:54:43Z

ah I found the source of that function (as well as another one that's used by NIW):

https://github.com/JuliaStats/Distributions.jl/blob/2fc223f72a2df4934c47e5701b9afc8119e0448f/src/utils.jl#L117-L137

kaselby · 2018-05-30T17:08:26Z

I'll try to get to this today and look over everything again. Sorry I've been dragging my heels, I've just been caught up in other work.

kleinschmidt · 2018-05-30T17:57:38Z

No worries! I'm in the same boat :)

kleinschmidt · 2018-05-30T19:10:29Z

src/normalwishart.jl

@@ -40,7 +40,7 @@ end
 function insupport(::Type{NormalWishart}, x::Vector{T}, Lam::Matrix{T}) where T<:Real
    return (all(isfinite(x)) &&
           size(Lam, 1) == size(Lam, 2) &&
-           isApproxSymmmetric(Lam) &&
+          #isApproxSymmmetric(Lam) &&


This (and hasCholesky) are in Distributions, so can be fixed by qualifying with Distributions.

maybe worth moving them over here since I don't think they're used in Distributions anymore...

kleinschmidt · 2018-05-30T19:19:28Z

@krylea I have a bit of time to work on this now actually; is it okay if I push more commits to this PR?

kaselby · 2018-05-31T01:42:16Z

Go for it!

kaselby · 2018-06-14T16:30:35Z

After taking another look, I believe the alternate parameterization we were considering here is actually the parameterization for the Normal INVERSE Wishart. I returned to the original parameterization, and added the inversions that were missing in the base code.

mschauer · 2018-09-06T07:30:40Z

What is the status here?

matbesancon · 2019-07-10T16:32:16Z

ping @krylea

mschauer · 2019-08-16T10:31:19Z

@johnczito Would this fall into your area of expertise, be interesting for you?

johnczito · 2019-08-16T12:49:10Z

Sure. I guess the first thing that comes to mind is that NormalWishart should have a field W that simply stores Wishart(nu, inv(Tchol)). Also, why shouldn't logpdf just be something like this?

function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
    Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa))
    logpdf(Wishart(nw.nu, inv(nw.Tchol)), Lam) + logpdf(MvNormal(nw.mu, Lsym), x)
end

johnczito · 2019-08-16T12:57:43Z

^ That at least suggests a unit test you'd want to implement, in addition to comparing against the NormalGamma.

Fixed posterior for NormalWishart.

13ba68b

mschauer requested a review from kleinschmidt April 18, 2018 07:11

Updated to change parametrization.

a323067

mschauer self-assigned this May 2, 2018

updated pdf function

c14f301

kaselby added 2 commits May 5, 2018 11:18

whoops, thought I included these changes in the earlier commit.

57f52b1

Fixed.

098841d

kleinschmidt reviewed May 21, 2018

View reviewed changes

kleinschmidt reviewed May 30, 2018

View reviewed changes

updated to return to original parameterization.

5e5e4d6

Fixed posterior for NormalWishart. #28

Are you sure you want to change the base?

Fixed posterior for NormalWishart. #28

Conversation

kaselby commented Apr 17, 2018

kaselby commented Apr 18, 2018

mschauer commented May 2, 2018

kaselby commented May 2, 2018

mschauer commented May 2, 2018

kaselby commented May 2, 2018

kaselby commented May 5, 2018

kaselby commented May 5, 2018

mschauer commented May 5, 2018

kaselby commented May 5, 2018

mschauer commented May 6, 2018

kleinschmidt left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kaselby May 21, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kleinschmidt commented May 21, 2018

kaselby commented May 21, 2018

kleinschmidt commented May 30, 2018

kleinschmidt commented May 30, 2018 • edited Loading

kleinschmidt commented May 30, 2018 • edited Loading

kaselby commented May 30, 2018

kleinschmidt commented May 30, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kleinschmidt commented May 30, 2018

kaselby commented May 31, 2018

kaselby commented Jun 14, 2018

mschauer commented Sep 6, 2018

matbesancon commented Jul 10, 2019

mschauer commented Aug 16, 2019

johnczito commented Aug 16, 2019 • edited Loading

johnczito commented Aug 16, 2019

kaselby May 21, 2018 •

edited

Loading

kleinschmidt commented May 30, 2018 •

edited

Loading

kleinschmidt commented May 30, 2018 •

edited

Loading

johnczito commented Aug 16, 2019 •

edited

Loading