Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loop over Iterators.rest in _foldl_impl #56492

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

jishnub
Copy link
Contributor

@jishnub jishnub commented Nov 7, 2024

For reasons that I don't understand, this improves performance in mapreduce in the following example:

julia> function g(A)
           for col in axes(A,2)
               mapreduce(iszero, &, view(A, UnitRange(axes(A,1)), col), init=true) || return false
           end
           return true
       end
g (generic function with 2 methods)

julia> A = zeros(2, 10000);

julia> @btime g($A);
  28.021 μs (0 allocations: 0 bytes) # nightly v"1.12.0-DEV.1571"
  12.462 μs (0 allocations: 0 bytes) # this PR

julia> A = zeros(1000,1000);

julia> @btime g($A);
  372.080 μs (0 allocations: 0 bytes) # nightly
  321.753 μs (0 allocations: 0 bytes) # this PR

It would be good to understand what the underlying issue is, as the two seem equivalent to me. Perhaps this form makes it clear that it's not, in fact, an infinite loop?

@jishnub jishnub added the performance Must go faster label Nov 7, 2024
@JeffBezanson
Copy link
Member

Huh, that's interesting (and satisfying)! The lowering of for is careful to use a branch structure that LLVM prefers, so this kind of thing is possible but I'm surprised it's still that significant. In any case the code is nicer!

@@ -48,10 +48,8 @@ function _foldl_impl(op::OP, init, itr) where {OP}
y = iterate(itr)
Copy link
Member

@vtjnash vtjnash Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory this change is essentially identical except that this now unrolls two iterations instead of one, so the comment above now needs to be adjusted, since it states the loop is only unrolled once

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants