revising sorting APIs (2.0) #44876
Replies: 9 comments 17 replies
-
Yes. Only one or two methods of I have mixed feelings about orderings. It's nice/necessary to have a way to make keyword arguments disappear in the implementation of e.g QuickSort, but supporting orderings is redundant with supporting AbstractVectors of arbitrary eltype + multiple dispatch, which we already have. If we could somehow attach the ordering to the array without sacrificing performance, then we could eliminate the need for Line 750 in badad9d And let our sorting algorithms simply call isless .
A toy (broken) implementation using reinterpret: struct WithIsless{T, L}
data::T
isless::L
end
Base.isless(x::T, y::T) where {T <: WithIsless} = x.isless(x.data, y.data)
function sort!(x, lt=identity)
_sort!(reinterpret(WithIsless{eltype(x), typeof(lt)}, x))
x
end |
Beta Was this translation helpful? Give feedback.
-
I think we could have a (mostly) linear pipeline that begins with any exported functions (
The linear pipeline is
Which can be implemented like so function _sort!(x::AbstractArray, alg::Algorithm, ::Copy, o::Order; maybecopy, kw...)
_sort!(maybecopy(x), alg, ArrayDims(), kw...)
end
function _sort!(x::AbstractArray, alg::Algorithm, phase::ArrayDims, o::Order; dims, kw...)
for part in slices_of(x, dims)
_sort!(part, alg, phase, o; kw...)
end
end
function _sort!(x::AbstractVector, alg::Algorithm, ::ArrayDims, o::Order; kw...)
_sort!(x, alg, ::SmallCheck1(), o; kw...)
end
function _sort!(x::AbstractVector, alg::Algorithm, ::SmallCheck1, o::Order; kw...)
length(x) < 10 && return _sort!(x, alg, Small(), o; kw...)
_sort!(x, alg, PartitionMissings(), o; kw...)
end
function _sort!(x::AbstractVector, alg::Algorithm, ::PartitionMissings, o::Order; kw...)
_sort!(try_to_do_missing_optimization!(x, o), alg, PartitionFloats(), o; kw...)
end
function _sort!(x::AbstractVector, alg::Algorithm, ::PartitionFloats, o::Order; kw...)
_sort!(try_to_do_fpsort!(x, o), alg, SmallCheck2(), o; kw...)
end
... This is extensible with struct SmallCheckBetweenMissingAndFloat <: Algorithm end
function Base._sort!(x::AbstractVector, alg::SmallCheckBetweenMissingAndFloat, ::PartitionFloats, o::Order; kw...)
length(x) < 15 && return _sort!(x, alg, Small(), o, kw...)
_sort!(x, Default(), PartitionFloats(), o; kw...)
end Linearization maximizes code reuse. Extensibility might make it easier to turn on algorithms and optimizations with high-level dependencies (e.g. view, rand, OffsetVector) after stdlibs are loaded. |
Beta Was this translation helpful? Give feedback.
-
Some links to keep track of relevant to API decisions:
I like the orthogonality of |
Beta Was this translation helpful? Give feedback.
-
I just underwent an effort to write a custom sortperm! to implement a counting sort which backs off to MergeSort when there are too many unique elements. The reason for choosing this function is because I'm doing multi-key sorts so using sortperm! with initialized=true and a stable underlying sort can faciliate this. Unfortunately, since sortperm! doesn't dispatch on alg, I found this impossible and reverted to calling my function something else. My two cents is that all the useful functions that sort.jl exports should dispatch on alg so as to facilitate user extension. Feel free to clarify if I'm missing something here. |
Beta Was this translation helpful? Give feedback.
-
Should we make the |
Beta Was this translation helpful? Give feedback.
-
2.0 proposal to replace all keyword arguments with first-class |
Beta Was this translation helpful? Give feedback.
-
I have a few more suggestions:
|
Beta Was this translation helpful? Give feedback.
-
The |
Beta Was this translation helpful? Give feedback.
-
The |
Beta Was this translation helpful? Give feedback.
-
@LilithHafner, I'd love to get your take on revising the sorting APIs for Julia 2.0 while you're still very fresh in the weeds with it. It was originally designed before keyword arguments were fast and even before higher order functions were fast, which is why it leans so heavily on constructing ordering types and sort algorithm types and passing those to the core sorting logic and dispatching on them since that has basically always been fast. At some point pre-1.0 we considered changing the Ordering stuff, but we decided (I believe it was @JeffBezanson and myself, probably on a rushed triage call—the triage leading up to 1.0 was an intense time) that it was kind of useful to have a first class representation of a sorting order. So I'd love your take on whether you feel that's true or not. I've also often considered it a bit of a mistake that we expose all the gnarly internal methods of
sort!
externally and it seems like we should probably have_sort!
orsortcore!
with all the internal methods and then only have the official API methods forsort!
andsort
.Beta Was this translation helpful? Give feedback.
All reactions