wrong calculation of covariance matrix in function StatsAPI.vcov(fit::LsqFitResult) #255
Comments
Side note: I noticed that this line (Line 261 in 8f8c18a) can most probably be removed, since it is not used. |
I just tested with the KDE program LabPlot:
also for weighted fits. |
Thank you, this is certainly the right place to report the issue, and thank you for taking the time. I am the maintainer of this package and I also developed some of it. Most of it is something I moved out of Optim because I didn't think it was good to have all the statistics as part of Optim. I now think that the LM method should maybe have stayed, but I digress. I think you may be right. I also think we have had this discussion many times before in what are now closed issues, so I will have to go back and consult them. I think there are some nuances here that depend on assumptions such as where the weights are put: are you weighting the residuals, or are you weighting the squared residuals? That matters in the end. So I am open to this change, but I want to make sure that we have references and the math in order because, as I said, I think this discussion has been had many times before with diverging opinions, and I honestly never really use NLS myself :) |
Thanks for the reply. However, I am a bit confused. I don't know about a package named Optim. I just googled how to perform an NLSF with Julia, found LsqFit, installed it and gave it a try. As a physicist I have to perform NLSFs all the time. I hope I made clear where I see the bug: when one has weights (meaning data errors), LsqFit does not take the MSE into account. If it did, it would deliver the same results I get with other programs. To reproduce, here is my test data
this is the fit model: Here is a Julia script to reproduce: |
Yes, I understand. My imperfect memory just told me that it was set up that way on purpose. I just can't remember the details :) So I have to read up on it. As I remember it, the weights are included in the residuals, so they carry over to the Jacobian; they are indeed not ignored, just incorporated into J. But I have to check. |
Okay, I looked at your code example. One thing that does strike me is that you input the weights as inverse variances, but I believe they should be inverse standard deviations. If you input a matrix instead it is interpreted as a variance-covariance matrix. So vector weights -> standard deviation interpretation and matrix weights -> variance-covariance interpretation. |
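For reference, the two weighting conventions discussed here can be written side by side. The symbols $r_i$ (residual), $\sigma_i$ (standard deviation of data point $i$), $w_i$ and $\tilde{w}_i$ are notation introduced for this sketch only and are not taken from the LsqFit source:

```latex
% Convention A: weights multiply the squared residuals (inverse variances)
S_A(p) = \sum_i w_i \, r_i(p)^2, \qquad w_i = \frac{1}{\sigma_i^2}

% Convention B: weights multiply the residuals before squaring (inverse standard deviations)
S_B(p) = \sum_i \bigl(\tilde{w}_i \, r_i(p)\bigr)^2, \qquad \tilde{w}_i = \frac{1}{\sigma_i}

% With these choices both objectives are identical: S_A(p) = S_B(p)
```

Passing inverse variances to a routine that expects convention B (or the reverse) changes the objective, so the fitted parameters and their errors will differ.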
but I set
But I want the variance-covariance matrix to be computed. I don't know it yet. I just have data, know the errors of the data, and perform a fit. As a result I need the errors of the fitted parameters. I think this is the most standard use case. |
from #85
This statement is misleading. Sure, the Jacobian must contain the weighting and LsqFit does this correctly. The point is why the MSE is not used.
You can give this a try:
|
In #103 (comment) I think I made it clear:
I will make a PR with the way I think the fix should be; then you can give this to e.g. a mathematician. |
@gerlachs is one of the main LabPlot (https://github.com/KDE/labplot) developers, maybe he can give feedback since my PR would change LsqFit so that it delivers the same output as LabPlot. |
OK, the proof that there is a bug and that my PR fixes it is simple: when you make a non-weighted fit, you must get the same result as when you make a weighted fit with all weights set to 1.0. (This was already mentioned here: #103 (comment)) Here is a small Julia script with the proof: |
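The script from this comment is not preserved in the text above. A minimal sketch of the same consistency check might look like the following. It assumes a straight-line model and made-up data, and uses only `LinearAlgebra` rather than LsqFit itself, so it illustrates the argument and not the package's actual code path:

```julia
using LinearAlgebra

# Hypothetical data for a straight line y = p1 + p2*x (not the data from this issue).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
J = hcat(ones(length(x)), x)   # Jacobian of the linear model with respect to (p1, p2)

# Unweighted fit: covariance is MSE * inv(J'J).
p_u = J \ y
r_u = y - J * p_u
mse_u = dot(r_u, r_u) / (length(y) - length(p_u))
cov_unweighted = mse_u * inv(J' * J)

# Weighted fit with all weights set to 1.0 (weights multiply the squared residuals).
W = Diagonal(ones(length(y)))
A = J' * W * J
p_w = A \ (J' * W * y)
r_w = y - J * p_w
mse_w = dot(r_w, W * r_w) / (length(y) - length(p_w))

cov_without_mse = inv(A)          # MSE omitted
cov_with_mse = mse_w * inv(A)     # MSE included

println("with MSE matches unweighted:    ", cov_with_mse ≈ cov_unweighted)
println("without MSE matches unweighted: ", cov_without_mse ≈ cov_unweighted)
```

In this sketch only the variant that includes the MSE reproduces the unweighted covariance, because the MSE of this data set is not 1.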
I do understand the issue (and thank you for the PR). What I was trying to state above is that, as far as I remember, the conclusion in the above links was that the behavior is intentional (you can argue about what you want the software to do, of course): if you tell me what the uncertainty is at each collected sample, then there is no variance component to estimate, so implicitly the variance is set to 1 and your weights change the variance (sigma_i) accordingly in the calculations. I suppose it has to do with the way you wish to use the weights. Happy to hear from @gerlachs on this. I am not opposed to your PR or your comments; I am just trying to explain what I seem to remember was the guiding principle way back when... and again, I am not a user of NLS, I just happened to become the owner of the code in 2017 :) |
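The two positions in this thread correspond to two ways of turning weights into a parameter covariance. Written out, with $n$ data points, $p$ parameters, residual vector $r$ and $W = \mathrm{diag}(1/\sigma_i^2)$ (notation assumed here for illustration):

```latex
% Weights treated as exact, absolute data errors: no variance component is estimated
\mathrm{Cov}_{\text{abs}} = \bigl(J^\top W J\bigr)^{-1}

% Weights treated as relative: a common scale factor is estimated from the residuals
\mathrm{Cov}_{\text{rel}} = \hat{\sigma}^2 \bigl(J^\top W J\bigr)^{-1},
\qquad
\hat{\sigma}^2 = \frac{r^\top W r}{n - p}
```

The two agree only when $\hat{\sigma}^2 = 1$, that is, when the scatter of the residuals matches the stated data errors.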
I am a beginner with Julia and tried it out by fitting data.
I got unexpected results for vcov(fit) and had a look in the source code.
There:
LsqFit.jl/src/curve_fit.jl
Line 256 in 8f8c18a
I noticed 2 issues:
This is strange because both methods deliver the same results but the QR decomposition might be more stable numerically
https://www.mathworks.com/help/stats/nlinfit.html#btk7ign-CovB
The second point is my main problem because the errors of the fit model parameters were larger than expected.
I think it is a bug to omit the MSE. When there are weights, the MSE includes them already, meaning the MSE does not become 1 because of the presence of weights. And therefore it cannot just be omitted.
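Both points can be illustrated with a short sketch using only `LinearAlgebra`. The Jacobian, residuals and data errors below are made-up illustrative values, not the output of an actual fit, and the variable names are hypothetical:

```julia
using LinearAlgebra

# Hypothetical Jacobian, residuals and data errors (illustrative values only).
J = [1.0 0.5; 1.0 1.0; 1.0 1.5; 1.0 2.0]
r = [0.3, -0.4, 0.2, -0.1]       # assumed residuals, not from a real fit
σ = [0.1, 0.2, 0.1, 0.2]         # assumed standard deviation of each data point
w = 1 ./ σ.^2                    # weights on the squared residuals

# Route 1: explicit inverse of J'WJ.
cov_inv = inv(J' * Diagonal(w) * J)

# Route 2: QR decomposition of sqrt(W)*J, using inv(J'WJ) = inv(R) * inv(R)'.
R = qr(Diagonal(sqrt.(w)) * J).R
Rinv = inv(R)
cov_qr = Rinv * Rinv'

# Weighted MSE; the presence of weights does not make it equal to 1.
n, p = size(J)
mse = sum(w .* r.^2) / (n - p)

println("QR and inverse agree: ", cov_qr ≈ cov_inv)
println("weighted MSE: ", mse)
```

Both routes give the same matrix up to rounding, so the choice between them is about numerical stability, while multiplying by `mse` is what changes the reported parameter errors.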
In the tutorial I see, correctly:
https://julianlsolvers.github.io/LsqFit.jl/latest/tutorial/#Weighted-Least-Squares-1
Cov(γ*) = σ² [J′WJ]⁻¹
and σ² is the MSE
there you also write
But this is incorrect because the dimensions won't fit and I get the error:
DimensionMismatch
and looking at the source code, the calculation is covar = inv(J' * J)
so the weights are ignored.
I use Julia 1.11.1 and LsqFit v0.15.0
p.s. as a newbie I might report the error at the wrong place, please point me then to the correct place.
p.s. the LsqFit docs:
https://julianlsolvers.github.io/LsqFit.jl/latest/tutorial
I see the command ''estimate_var'' but this is marked deprecated.