Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing allocations for repeated use of object_and_preserve #4

Open
mmiller-max opened this issue Mar 16, 2021 · 6 comments
Open

Removing allocations for repeated use of object_and_preserve #4

mmiller-max opened this issue Mar 16, 2021 · 6 comments

Comments

@mmiller-max
Copy link

When using CheapThreads.jl, I get allocations when new References are created in object_and_preserve. This makes sense, but I'd really like to do away with this is possible as my aim is to use CheapThreads.jl within DifferentialEquations.jl and I don't want the allocs to build up. In my case the arg that is being referenced is a struct, but you also get allocations for other !isbits types, e.g.:

julia> f1(arg) = @time StrideArraysCore.object_and_preserve(arg)
julia> f1(1)
  0.000000 seconds
(1, nothing)

julia> f1("string")
  0.000000 seconds (1 allocation: 16 bytes)
(StrideArraysCore.Reference{String}("string"), StrideArraysCore.Reference{String}("string"))

julia> struct TestStruct
       a::Vector{Int64}
       end

julia> ts = TestStruct([1])
TestStruct([1])

julia> f1(ts)
  0.000000 seconds (1 allocation: 16 bytes)
(StrideArraysCore.Reference{TestStruct}(TestStruct([1])), StrideArraysCore.Reference{TestStruct}(TestStruct([1])))

I can see two possible solutions:

1. Allow users to create their own References

This just needs object_and_preserve to be extended like

@inline object_and_preserve(r::Reference) = (r, r)

When added, it works like so:

julia> ref = StrideArraysCore.Reference("string")
StrideArraysCore.Reference{String}("string")

julia> f1(ref)
  0.000000 seconds
(StrideArraysCore.Reference{String}("string"), StrideArraysCore.Reference{String}("string"))

julia> ref = StrideArraysCore.Reference(ts)
StrideArraysCore.Reference{TestStruct}(TestStruct([1]))

julia> f1(ref)
  0.000000 seconds
(StrideArraysCore.Reference{TestStruct}(TestStruct([1])), StrideArraysCore.Reference{TestStruct}(TestStruct([1])))

This seems to work fine with batch from CheapThreads.jl.

2. Encourage users to extend object_and_preserve, store! and load for their structs

For example:

@inline function ThreadingUtilities.load(p::Ptr{UInt}, ::Type{TestMutableStruct}, i)
    i, ref = ThreadingUtilities.load(p, Ptr{TestMutableStruct}, i)
    i, Base.unsafe_pointer_to_objref(ref)::TestMutableStruct
end
@inline function ThreadingUtilities.store!(p::Ptr{UInt}, r::TestMutableStruct, i)
    ThreadingUtilities.store!(p + i, reinterpret(UInt, pointer_from_objref(r)))
    i + sizeof(UInt)
end
@inline StrideArraysCore.object_and_preserve(r::TestMutableStruct) = (r, r)

This seems to be fine with CheapThreads, but I don't know if there are implications of doing this.

julia> arg_test(args, start, stop) = nothing

julia> mutable struct TestMutableStruct
       a::String
       end

julia> tms = TestMutableStruct("a")
TestMutableStruct("a")

julia>  u = zeros(100)
...
julia> @time batch(arg_test, (10, 8), u, tms)
  0.000003 seconds

The above extensions only work for mutable structs because of pointer_from_objref.

I'm interested to know what people's thoughts are on the best way forward for this!

@chriselrod
Copy link
Member

Why not 1 & 2? I.e., let users decide what works best for their use case.

@chriselrod
Copy link
Member

chriselrod commented Mar 17, 2021

Base.unsafe_pointer_to_objref(ref)::TestMutableStruct

This or something like it seems like a good solution.

julia> x = Ref(("hi", "world", 9))
Base.RefValue{Tuple{String, String, Int64}}(("hi", "world", 9))

julia> p = Base.unsafe_convert(Ptr{typeof(x[])}, x)
Ptr{Tuple{String, String, Int64}} @0x00007f2a74688db0

julia> tsptr(x::Ptr{T}) where {T} = (Base.unsafe_pointer_to_objref(x)::Base.RefValue{T})[]
tsptr (generic function with 1 method)

julia> tsptr(p)
("hi", "world", 9)

julia> @benchmark tsptr($p)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     1.765 ns (0.00% GC)
  median time:      1.772 ns (0.00% GC)
  mean time:        1.798 ns (0.00% GC)
  maximum time:     12.976 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

I don't know if there is a way to avoid the type instability all together by using something other than Any:

unsafe_pointer_to_objref(x::Ptr) = ccall(:jl_value_ptr, Any, (Ptr{Cvoid},), x)

e.g.

tsptr2(x::Ptr{T}) where {T} = ccall(:jl_value_ptr, Ref{Base.RefValue{T}}, (Ptr{Cvoid},), x)[]
tsptr2(p)

yields

julia> tsptr2(p)
("hi", "world", 9)

julia> @code_warntype tsptr2(p)
MethodInstance for tsptr2(::Ptr{Tuple{String, String, Int64}})
  from tsptr2(x::Ptr{T}) where T in Main at REPL[13]:1
Static Parameters
  T = Tuple{String, String, Int64}
Arguments
  #self#::Core.Const(tsptr2)
  x::Ptr{Tuple{String, String, Int64}}
Body::Tuple{String, String, Int64}
1%1 = Core.apply_type(Main.Ptr, Main.Cvoid)::Core.Const(Ptr{Nothing})
│   %2 = Base.cconvert(%1, x)::Ptr{Tuple{String, String, Int64}}%3 = Core.apply_type(Main.Ptr, Main.Cvoid)::Core.Const(Ptr{Nothing})
│   %4 = Base.unsafe_convert(%3, %2)::Ptr{Nothing}%5 = $(Expr(:foreigncall, :(:jl_value_ptr), Ref{Base.RefValue{T}}, svec(Ptr{Nothing}), 0, :(:ccall), :(%4), :(%2)))::Base.RefValue{Tuple{String, String, Int64}}%6 = Base.getindex(%5)::Tuple{String, String, Int64}
└──      return %6

It loads a Base.RefValue{T}, which is why we pass Ref{Base.RefValue{T}} as the return type.
In your example, it'd be Ref{TestMutableStruct}.

It's basically the same fast.

@mmiller-max
Copy link
Author

Sorry for the delay in replying. Yeah enabling both sounds good.

I'm intrigued to see how much of a difference removing the type instability makes to performance - will give it a go.

@chriselrod
Copy link
Member

I'm intrigued to see how much of a difference removing the type instability makes to performance - will give it a go.

I saw basically no difference (you can't get much faster than 1.8 ns), but would do it on principle anyway. Would be easier on the branch predictor.

@mmiller-max
Copy link
Author

Yep also saw no difference for structs as you expected.

For mutable structs, tsptr doesn't work:

julia> mutable struct TestMutableStruct
       a::String
       end

julia> tms = TestMutableStruct("a")
TestMutableStruct("a")

julia> ref = Ref(tms)
Base.RefValue{TestMutableStruct}(TestMutableStruct("a"))

julia> p = Base.unsafe_convert(Ptr{typeof(ref[])}, ref)
Ptr{TestMutableStruct} @0x000000012a49d140

julia> Base.unsafe_pointer_to_objref(p)
TestMutableStruct("a")

julia> tsptr(p)
ERROR: TypeError: in typeassert, expected Base.RefValue{TestMutableStruct}, got a value of type TestMutableStruct
Stacktrace:
 [1] tsptr(::Ptr{TestMutableStruct}) at ./REPL[21]:1
 [2] top-level scope at REPL[22]:1

Is this expected? Seems that :jl_value_ptr returns the mutable struct in this case.

@chriselrod
Copy link
Member

chriselrod commented Mar 25, 2021

julia> tsptr2(x::Ptr{T}) where {T} = ccall(:jl_value_ptr, Ref{TestMutableStruct}, (Ptr{Cvoid},), x)
tsptr2 (generic function with 1 method)

julia> tsptr2(p)
TestMutableStruct("a")

You need to specify Ref{M}, where M is the mutable struct wrapper type. In my example, that was Base.RefValue{T}, because I wrapped the type in a Base.RefValue:

julia> x = Ref(("hi", "world", 9))
Base.RefValue{Tuple{String, String, Int64}}(("hi", "world", 9))

But now in your example, M == TestMutableStruct, because you're wrapping with TestMutableStruct instead of by calling Ref (which creases a Base.RefValue`).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants