Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't materialize when adding/subtracting an Array #456

Merged
merged 1 commit into from
Oct 16, 2023

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Oct 13, 2023

This improves performance and reduces allocation:

julia> S = sprand(1000, 1000, 0.04);

julia> @btime $S + $(Array(S));
  2.107 ms (4 allocations: 15.26 MiB) # main
  1.539 ms (2 allocations: 7.63 MiB) # PR

@codecov
Copy link

codecov bot commented Oct 13, 2023

Codecov Report

Merging #456 (455e6a9) into main (0f8bbda) will increase coverage by 0.06%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #456      +/-   ##
==========================================
+ Coverage   85.43%   85.50%   +0.06%     
==========================================
  Files          13       13              
  Lines        8733     8773      +40     
==========================================
+ Hits         7461     7501      +40     
  Misses       1272     1272              
Files Coverage Δ
src/sparsematrix.jl 95.75% <100.00%> (+0.06%) ⬆️

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ViralBShah
Copy link
Member

Would this be considered a breaking change? I suspect it could change the behaviour of many things in the package ecosystem.

@jishnub
Copy link
Contributor Author

jishnub commented Oct 14, 2023

I'm afraid I don't know enough to comment, but wouldn't this work as long as subtypes of AbstractSparseMatrixCSC implement rowvals, nonzeros and nzrange? The result is materialized in either case, and only an intermediate materialization is elided.

@dkarrasch
Copy link
Member

I agree with @jishnub: I don't see how this would be breaking. It uses what seems to be the "interface" of AbstractSparseMatrixCSC, and the methods are designed for Base's Arrays, so that shouldn't interfere with other array types from the ecosystem. Finally, the return type is Matrix, so again nothing generic.

@ViralBShah
Copy link
Member

I suppose breaking is not the right word - but perhaps the issue is that people may be relying on this behaviour and performance characteristics may change in their codes.

I am ok with merging this - just wanted to raise the issue for discussion.

@SobhanMP
Copy link
Member

Can you benchmark it when the array is almost full (sprand(1000, 1000, 0.99))?
What does the Ref in C = Ref(zero(eltype(A))) .+ B do?
p.s.
Nice observation!

@jishnub
Copy link
Contributor Author

jishnub commented Oct 15, 2023

To answer the Ref point: this ensures that the zero element is treated as a scalar and added elementwise to the Array, even when it is an array itself.

@jishnub
Copy link
Contributor Author

jishnub commented Oct 15, 2023

Using a highly filled array:

julia> S = sprand(1000, 1000, 0.99);

julia> @btime $S + $(Array(S));
  3.347 ms (4 allocations: 15.26 MiB) # main
  2.912 ms (2 allocations: 7.63 MiB) # PR

For larger matrices:

julia> S = sprand(5000, 5000, 0.99);

julia> @btime $S + $(Array(S));
  203.371 ms (4 allocations: 381.47 MiB) # main
  123.868 ms (2 allocations: 190.73 MiB) # PR

@ViralBShah
Copy link
Member

ViralBShah commented Oct 16, 2023

The perf concern is not with the implementation here, but that returning a different matrix type will have potential performance impact on code that use the results of these operations. Of course, there is no way to be able to tell what the impact will be - but if we do this, we have to communicate it very clearly.

@dkarrasch
Copy link
Member

We are not returning a different matrix type, we return the same matrix type as before. Users won't see any difference except for better performance.

@ViralBShah
Copy link
Member

We are not returning a different matrix type, we return the same matrix type as before. Users won't see any difference except for better performance.

Oops never mind.

@ViralBShah ViralBShah merged commit 3582898 into main Oct 16, 2023
8 checks passed
@ViralBShah ViralBShah deleted the jishnub/arrayaddsub branch October 16, 2023 12:15
@SobhanMP
Copy link
Member

@jishnub Thanks for the benchmarks and the explanation!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants