-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed posterior for NormalWishart. #28
base: master
Are you sure you want to change the base?
Conversation
Just a comment that I have tested this with a MCMC sampler I am using. When I use the existing posterior update, the sampler does not converge, and gives nonsensical results. When I use the update I added here, it converges quickly to the expected answer. |
I checked with https://en.wikipedia.org/wiki/Conjugate_prior#Continuous_distributions, so I agree that this is correct. Is there a way that we could interchange last Cholesky und inversion and return a Wishart which is parametrized differently? |
After looking at this further, it seems like the T that is intended to be stored in the Normal-Wishart object here is the inverse scale matrix for the Wishart, rather than the scale matrix itself (at least judging by the implementation of the pdf and posterior functions). In this case, the existing posterior update and pdf calculation for the scale matrix should be retained, and they are correct as is. However, T should be inverted before passing it to the Wishart in the rand function. (Also, this should probably be better documented, as it is not the standard parametrization of the Normal-Wishart) |
Great! This is the original implementation: https://github.com/JuliaStats/Distributions.jl/pull/140/files Maybe @nfoti remembers |
I believe I updated the PR to reflect the inverse scale matrix parametrization. It would be good to have someone look it over though to make sure I didn't miss anything. I believe what I have now is the same as the original formulation before my pull request, except for a change to the rand function. |
After checking again, the pdf function actually did need to be changed for this parametrization. I also fixed a syntax error in logpdf, and commented out a line which called the "isapproxsymmetric" function, which is not a function that appears to exist as far as I can tell. |
Apologies for making so many commits. I have now gone through and thoroughly tested this and am confident it is now right. |
Don't worry. Can we cook up a test which shows that this is right and the version on master is wrong? |
I am working on writing something up, I will hopefully post it later today. |
One suggestion would be to use in test that the one-dimensional case falls back to the InverseGamma/Gamma distribution, so they should give corresponding results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to add some tests for this (especially since it seems like there are bugs that could have been caught with tests before...)
@@ -62,27 +60,28 @@ function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real | |||
hp = 0.5 * p | |||
|
|||
# Normalization | |||
logp = hp*(log(kappa) - Float64(log2π)) | |||
logp = hp*(log(kappa) - Float64(log(2*π))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why get rid of log2pi
? It's a StatsFuns constant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My bad, I think it was giving me errors when I tried to run it. Maybe I just wasn't importing the required package or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think it's necessary to add StatsFuns (which also provides logmvgamma
which is needed elsewhere).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logmvgamma is from Distributions. maybe the old lpgamma is a version from StatsFuns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Distributions gets it from StatsFuns: https://github.com/JuliaStats/StatsFuns.jl/blob/e66dd973650c375bc1739c820e5b96bb5bd000a8/src/misc.jl#L3-L15
logp -= hnu * logdet(Tchol) | ||
logp -= hnu * p * log(2.) | ||
logp -= lpgamma(p, hnu) | ||
logp -= logmvgamma(p, hnu) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little disturbed that this hadn't been fixed until now......
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were other areas too where the current logpdf function crashes. I think somewhere in insupport there is a call to a function that doesn't exist. I didn't fix that in this pull request because it was a separate issue, and I forgot to make a new issue for it.
@@ -52,8 +52,6 @@ function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real | |||
if !insupport(NormalWishart, x, Lam) | |||
return -Inf | |||
else | |||
p = length(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you mean to remove this? seems like it's needed below...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively, you could use nw.dim
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was an accident
src/normalwishart.jl
Outdated
Lam = rand(Wishart(nw.nu, nw.Tchol)) | ||
Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa)) | ||
mu = rand(MvNormal(nw.mu, Lsym)) | ||
V = PDMat(Symmetric(inv(nw.Tchol))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it necessary to enforce that these are Symmetric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Julia's matrix inversion can lead to some small inaccuracies (of order 1e-16 or 1e-18 or so) and thus inverse matrices that SHOULD be symmetric will be slightly asymmetric. This shouldn't be a problem, but the distributions we are passing these to require symmetric matrices, and enforce a check for symmetricity. Julia's base is_hermitian function does not have a relative tolerance, and thus without some way of enforcing that the inverted matrix is symmetric, this will crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this somewhere, and this was the conclusion, but I cannot find it anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this might have something to do with the missing isApproxSymmetric
function...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the symmetry checks are not in the ConjugatePriors package, they are in various places in Distributions. So even if we make our symmetry checks sensitive to a relative tolerance, the checks there won't be. To be honest I have no idea why Julia won't just add a relative tolerance to its is_hermitian function given that the symmetry errors are coming from Julia's own matrix inversion routines, but I was told that that isn't going to happen...
Thanks for looking at this! I agree that this parametrization is very weird. Might also be worth adding a docstring so that' the parametrization is documented in the help file at least. |
To be honest I am still not entirely sure how best to approach the problem of how to parametrize this - but no matter how we decide it is best to do it, the current implementation needs to be fixed. I apologize for not getting back to you on the testing issue, I have been very busy recently so haven't had time to look at it. |
I agree that the parametrization can be fixed later, but tests are critical before we merge. Even something as simple as calling the functions on a few values would have caught many of these bugs... Thanks for your work on this! |
Also, i'm looking the Normal-Inverse-Wishart and seeing a lot of the same problems (with the old function names, checking for symmetry, etc.). Would you be willing to fix those as well? If not let's merge this after adding some minimal tests (at a minimum just to make sure this code can be executed without errors) |
ah I found the source of that function (as well as another one that's used by NIW): |
I'll try to get to this today and look over everything again. Sorry I've been dragging my heels, I've just been caught up in other work. |
No worries! I'm in the same boat :) |
src/normalwishart.jl
Outdated
@@ -40,7 +40,7 @@ end | |||
function insupport(::Type{NormalWishart}, x::Vector{T}, Lam::Matrix{T}) where T<:Real | |||
return (all(isfinite(x)) && | |||
size(Lam, 1) == size(Lam, 2) && | |||
isApproxSymmmetric(Lam) && | |||
#isApproxSymmmetric(Lam) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and hasCholesky
) are in Distributions, so can be fixed by qualifying with Distributions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe worth moving them over here since I don't think they're used in Distributions anymore...
@krylea I have a bit of time to work on this now actually; is it okay if I push more commits to this PR? |
Go for it! |
After taking another look, I believe the alternate parameterization we were considering here is actually the parameterization for the Normal INVERSE Wishart. I returned to the original parameterization, and added the inversions that were missing in the base code. |
What is the status here? |
ping @krylea |
@johnczito Would this fall into your area of expertise, be interesting for you? |
Sure. I guess the first thing that comes to mind is that function logpdf(nw::NormalWishart, x::Vector{T}, Lam::Matrix{T}) where T<:Real
Lsym = PDMat(Symmetric(inv(Lam) ./ nw.kappa))
logpdf(Wishart(nw.nu, inv(nw.Tchol)), Lam) + logpdf(MvNormal(nw.mu, Lsym), x)
end |
^ That at least suggests a unit test you'd want to implement, in addition to comparing against the |
The posterior calculations for most of this package are based on the paper Conjugate Bayesian analysis of the Gaussian distribution by Kevin Murphy. This paper has several typos related to the Normal-Wishart distribution. The ConjugatePriors package fixes the typos related to the pdf of the Normal-Wishart (though I would note that right now the pdf function will not run without errors for this distribution, as it relies on a function called IsApproxSymmetric that does not appear to exist - but that is a separate issue), but it does not fix the typos in the posterior. The paper cites the posterior update for the scale matrix as being:
T_n = T_0 + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)'
The correct update is
inv(T_n) = inv(T_0) + S + k_0/k_n * n * (mu_0 - xhat)(mu0-xhat)'
See https://stats.stackexchange.com/questions/153241/derivation-of-normal-wishart-posterior/324925#324925 for details.