Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add @inbounds to iterate method for AbstractArray: 3x speedup #57112

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nsajko
Copy link
Contributor

@nsajko nsajko commented Jan 21, 2025

No description provided.

@nsajko
Copy link
Contributor Author

nsajko commented Jan 21, 2025

Timings:

struct Vec{T} <: AbstractVector{T}
    mem::Memory{T}
end
Base.IndexStyle(::Type{<:Vec}) = IndexLinear()
Base.@propagate_inbounds Base.getindex(A::Vec, i::Int) = A.mem[i]
Base.size(a::Vec) = size(a.mem)

function f(v)
    z = eltype(v)(false)
    for x  v
        z += x
    end
    z
end

using BenchmarkTools
using Random: rand!

const vec = let m = Memory{Int8}(undef, 4096)
    rand!(m)
    Vec(m)
end

@btime f(vec) setup=(rand!(vec.mem);)
# 91.502 ns (0 allocations: 0 bytes)

# redefine the methods: placebo
function Base.iterate(A::AbstractArray, state=(eachindex(A),))
    y = iterate(state...)
    y === nothing && return nothing
    A[y[1]], (state[1], Base.tail(y)...)
end

@btime f(vec) setup=(rand!(vec.mem);)
# 91.764 ns (0 allocations: 0 bytes)

# redefine the methods: add `@inbounds`
function Base.iterate(A::AbstractArray, state=(eachindex(A),))
    y = iterate(state...)
    y === nothing && return nothing
    @inbounds A[y[1]], (state[1], Base.tail(y)...)
end

@btime f(vec) setup=(rand!(vec.mem);)
# 31.134 ns (0 allocations: 0 bytes)

Info:

julia> versioninfo()
Julia Version 1.12.0-DEV.1906
Commit 17402876ec2 (2025-01-18 07:33 UTC)
Build Info:
  Official https://julialang.org release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × AMD Ryzen 3 5300U with Radeon Graphics
  WORD_SIZE: 64
  LLVM: libLLVM-18.1.7 (ORCJIT, znver2)
  GC: Built with stock GC
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

@nsajko nsajko added performance Must go faster arrays [a, r, r, a, y, s] iteration Involves iteration or the iteration protocol labels Jan 21, 2025
@jakobnissen
Copy link
Contributor

I believe this goes against an existing principle to not add inbounds to generic code - see e.g. the docs for inbounds.
The issue is that this change will turn an ordinary bug in eachindex into undefined behaviour. This breaks the principle that undefined behaviour should only occur due to a locally wrongly placed unsafe operation.

@nsajko
Copy link
Contributor Author

nsajko commented Jan 21, 2025

IMO a correct eachindex is a pretty low bar 🤷

And there's always --check-bounds=yes, turned on by default during testing.

@nsajko
Copy link
Contributor Author

nsajko commented Jan 21, 2025

There's another drawback to this PR, though: @inbounds thrashes the effects. This might matter for some array type that relies on constant folding a lot, I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] iteration Involves iteration or the iteration protocol performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants