Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use cholesky decomposition whenever possible #72

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ayushpatnaikgit
Copy link

Loess involves doing weighted least squares many times (at each vertex). The original implementation in Netlib, and the one in this package, use QR decomposition for this purpose.

if VERSION < v"1.7.0-DEV.1188"
    F = qr(us, Val(true))
else
    F = qr(us, ColumnNorm())
end
coefs = F\vs

This is the most expensive part of loess.

I propose using Cholesky factorization instead of QR decomposition.

coefs = cholesky(us' * us) \ (us' * vs)

The idea is that us' * us and us' * vs are pretty small.

Unfortunately, cholesky factorization fails when the matrix being factorized is not positive definite. Hence, I suggest using QR decomposition as a backup using an if condition. Try and catch also work, but I don't like using them.

Using my proposed method will lead to modest gains in computation time, but noticeable improvement in memory allocations. The numerical differences will between the two methods will be small. I did some tests (based on the example in the README) and the differences are < e-10. I think that's reasonable, given loess is an approximation.

Here's the performance comparison on the example in the README.

using BenchmarkTools, Loess
xs = 10 .* rand(100)
ys = sin.(xs) .+ 0.5 * rand(100)

Current implementation:

@btime loess(xs, ys, span=0.5)
164.083 μs (3573 allocations: 1.26 MiB)

Proposed:

@btime loess(xs, ys, span=0.5)
 140.478 μs (3080 allocations: 102.70 KiB)

As far as I know, this trick hasn't been tried before in the context of loess. While all the test cases pass (on Julia 1.9.0), we could be missing out on something. I've had several discussions on this with @ViralBShah and @mousum-github. Instead of opening an issue to discuss it further, I just did the PR, since it's a tiny change in terms of the number of lines.

@codecov-commenter
Copy link

codecov-commenter commented Jun 23, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.14 🎉

Comparison is base (1b34e35) 89.90% compared to head (2722659) 90.04%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #72      +/-   ##
==========================================
+ Coverage   89.90%   90.04%   +0.14%     
==========================================
  Files           2        2              
  Lines         208      211       +3     
==========================================
+ Hits          187      190       +3     
  Misses         21       21              
Impacted Files Coverage Δ
src/Loess.jl 84.25% <100.00%> (+0.44%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

if VERSION < v"1.7.0-DEV.1188"
F = qr(us, Val(true))
if isposdef(us' * us)
coefs = cholesky(us' * us) \ (us' * vs)
Copy link
Contributor

@ViralBShah ViralBShah Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With isposdef this way, you are doing cholesky twice. You really should only do it once, and check the error and if not positive definite, fall back to QR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is this how people do it in literature? Forming the product squares the condition number, right?

Copy link
Author

@ayushpatnaikgit ayushpatnaikgit Jun 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With isposdef this way, you are doing cholesky twice. You really should only do it once, and check the error and if not positive definite, fall back to QR.

Do you mean like this:

coefs = Matrix{Real}(undef, size(us, 2), 1)
try
    coefs = cholesky(us' * us) \ (us' * vs)
catch PosDefException
    if VERSION < v"1.7.0-DEV.1188"
        F = qr(us, Val(true))
    else
        F = qr(us, ColumnNorm())
    end
    coefs = F \ vs
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is this how people do it in literature? Forming the product squares the condition number, right?

@mousum-github what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would catch the exception and trigger the QR. I believe doing QR should be numerically better than squaring and doing Cholesky, in general. @andreasnoack ?

Copy link
Author

@ayushpatnaikgit ayushpatnaikgit Jun 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed better.
Here are benchmarks on a different xs & ys.
Current:

@btime loess(xs, ys, span=0.5)
  150.481 μs (3037 allocations: 1.25 MiB)

If condition

@btime loess(xs, ys, span=0.5)
  125.853 μs (2544 allocations: 94.33 KiB)

Exception handling

@btime loess(xs, ys, span=0.5)
  116.999 μs (2527 allocations: 91.41 KiB)

Copy link

@mousum-github mousum-github Jun 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also is this how people do it in literature? Forming the product squares the condition number, right?

@mousum-github what do you think?

@ayushpatnaikgit - Please have a look at these examples.
JuliaStats/GLM.jl#507 (comment)
In summary, up to a certain condition number of the design matrix say 1e4, both Cholesky and QR produce the same estimates, and then up to another big condition number of the design matrix say 1e7, Cholesky produces less accurate estimates whereas QR produces accurate estimates. Then up to another big condition number say 1e15, Cholesky raises PosDefException, whereas QR still produces accurate estimates. And when the condition number is above say 1.e16 both Cholesky and QR detect rank deficiency.
As per the above example, the TRY CATCH statement works well when the condition number of the design matrix is <= 1e4 or > 1e8 but it fails to produce accurate estimates when the condition is in between (1.e4, 1.e8).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ViralBShah should we add a cholesky/qr key word argument, just as we have done in GLM.jl?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default will be Cholesky, and we need to have some examples showing when we should use QR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Mousum. I'll figure out a case where cholesky doesn't work, and then we can take the discussion from there.

@ayushpatnaikgit ayushpatnaikgit marked this pull request as draft July 2, 2023 16:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants