Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialize isbanded for StridedMatrix #56487

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Nov 7, 2024

This improves performance, as the loops in istriu and istril may be fused to improve cache-locality.
This also changes the quick-return behavior, and only returns after the check over all the upper or lower bands for a column is complete.

julia> using LinearAlgebra

julia> A = zeros(2, 100);

julia> @btime isdiag($A);
  340.626 ns (0 allocations: 0 bytes) # nightly v"1.12.0-DEV.1571"
  523.105 ns (0 allocations: 0 bytes) # this PR

julia> A = zeros(100, 2);
  340.626 ns (0 allocations: 0 bytes)  # nightly 
  39.155 ns (0 allocations: 0 bytes) # this PR

julia> @btime isdiag($A);

julia> A = zeros(100, 100);

julia> @btime isdiag($A);
  6.616 μs (0 allocations: 0 bytes) # nightly
  2.932 μs (0 allocations: 0 bytes) # this PR

julia> A = diagm(0=>1:100);

julia> A[3,4] = 1;

julia> @btime isdiag($A);
  2.759 μs (0 allocations: 0 bytes) # nightly
  196.156 ns (0 allocations: 0 bytes) # this PR

The fact that the fat-matrix case is slower is unfortunate, and I'm unsure why it's much slower. Perhaps we need a cutoff here to call the fallback implementation?
Edit: the performance issue would be resolved by #56492.

A similar change is added to istriu/istril as well, so that

julia> A = zeros(2, 100);

julia> @btime istriu($A);
  8.083 ns (0 allocations: 0 bytes) # nightly
  9.307 ns (0 allocations: 0 bytes) # this PR

julia> @btime istril($A);
  377.766 ns (0 allocations: 0 bytes) # nightly
  474.221 ns (0 allocations: 0 bytes) # this PR

julia> A = zeros(100, 2);

julia> @btime istriu($A);
  146.085 ns (0 allocations: 0 bytes) # nightly
  38.764 ns (0 allocations: 0 bytes) # this PR

julia> @btime istril($A);
  8.577 ns (0 allocations: 0 bytes) # nightly
  10.888 ns (0 allocations: 0 bytes) # this PR

julia> A = zeros(100, 100);

julia> @btime istriu($A);
  3.290 μs (0 allocations: 0 bytes) # nightly
  1.677 μs (0 allocations: 0 bytes) # this PR

julia> @btime istril($A);
  3.353 μs (0 allocations: 0 bytes) # nightly
  1.661 μs (0 allocations: 0 bytes) # this PR

@jishnub jishnub added performance Must go faster linear algebra Linear algebra arrays [a, r, r, a, y, s] labels Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] linear algebra Linear algebra performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant