Support non-interpolating quantile definitions #187

Open
wants to merge 1 commit into base: master

Conversation

nalimilan
Member

Add a `type` argument to `quantile` to support the three remaining (non-interpolating) quantile definitions that we did not support yet. Some of these are particularly useful because they return actual values from the data and therefore work for types that do not support arithmetic.
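To illustrate the point about types without arithmetic, here is a minimal sketch that applies the `type=1` index rule from this PR by hand to a vector of strings. The data and variable names are made up for illustration, and the sketch does not call the proposed `quantile(...; type=...)` method, only the index formula.

```julia
# Hypothetical data: strings can be sorted, but they cannot be averaged.
fruits = sort(["pear", "apple", "fig", "kiwi", "plum"])
n = length(fruits)
p = 0.5

# type=1 rule: take the ceil(n*p)-th order statistic, clamped to 1:n.
idx = clamp(ceil(Int, n * p), 1, n)
q = fruits[idx]   # "kiwi", the 3rd of the 5 sorted values
```

An interpolating definition would need something like `0.5 * "fig" + 0.5 * "kiwi"`, which is not defined for strings.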

Fixes #185.

@codecov-commenter

Codecov Report

Attention: Patch coverage is 93.47826% with 3 lines in your changes missing coverage. Please review.

Project coverage is 96.18%. Comparing base (bfa5c6b) to head (02e40c2).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/Statistics.jl | 93.47% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #187      +/-   ##
==========================================
- Coverage   96.65%   96.18%   -0.47%     
==========================================
  Files           2        2              
  Lines         448      472      +24     
==========================================
+ Hits          433      454      +21     
- Misses         15       18       +3     


Comment on lines +1079 to +1086
if type == 1
    return v[clamp(ceil(Int, n*p), 1, n)]
elseif type == 2
    i = clamp(ceil(Int, n*p), 1, n)
    j = clamp(floor(Int, n*p + 1), 1, n)
    return middle(v[i], v[j])
elseif type == 3
    return v[clamp(round(Int, n*p), 1, n)]
Member Author

I used a simplified formula specific to each case, because code built on the single general formula from the Hyndman & Fan paper is very hard to follow and offers no advantage. I hope I did not introduce mistakes, especially in corner cases. Please suggest additional tests if you can think of cases that are not covered.
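As a way to probe corner cases, the three branches can be wrapped in a small standalone function and exercised at `p = 0`, `p = 1`, and at values where `n*p` is an integer or a half-integer. This is only a sketch: `nointerp_quantile` is a hypothetical helper, not the function added by this PR, and it sorts its input itself.

```julia
using Statistics: middle

# Hypothetical standalone helper mirroring the three branches under review.
function nointerp_quantile(x::AbstractVector, p::Real, type::Integer)
    0 <= p <= 1 || throw(ArgumentError("p must be in [0, 1]"))
    isempty(x) && throw(ArgumentError("x must not be empty"))
    v = sort(x)
    n = length(v)
    if type == 1
        return v[clamp(ceil(Int, n * p), 1, n)]
    elseif type == 2
        i = clamp(ceil(Int, n * p), 1, n)
        j = clamp(floor(Int, n * p + 1), 1, n)
        return middle(v[i], v[j])
    elseif type == 3
        # Julia's default `round` sends ties to the nearest even integer.
        return v[clamp(round(Int, n * p), 1, n)]
    else
        throw(ArgumentError("type must be 1, 2 or 3"))
    end
end

x = [40, 10, 30, 20]            # n = 4, sorted: 10, 20, 30, 40
nointerp_quantile(x, 0.5, 1)    # n*p = 2.0 -> 20
nointerp_quantile(x, 0.5, 2)    # average of 2nd and 3rd values -> 25.0
nointerp_quantile(x, 0.625, 3)  # n*p = 2.5, tie goes to even index 2 -> 20
nointerp_quantile(x, 0.875, 3)  # n*p = 3.5, tie goes to even index 4 -> 40
nointerp_quantile(x, 0.0, 1)    # raw index 0 is clamped to 1 -> 10
nointerp_quantile(x, 1.0, 2)    # raw upper index 5 is clamped to 4 -> 40.0
```

The probabilities 0.5, 0.625 and 0.875 are chosen because their products with `n = 4` are exact in floating point, so the tie cases are hit deliberately rather than by rounding accident.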

Comment on lines +1140 to +1142
- `type=1`: `Q(p) = x[ceil(n*p)]` (SAS-3)
- `type=2`: `Q(p) = middle(x[ceil(n*p)], x[floor(n*p + 1)])` (SAS-5, Stata)
- `type=3`: `Q(p) = x[round(n*p)]` (SAS-2)
Member Author

Here I simplified the formulas further and do not mention clamping at the extremes. Maybe that is too much, and the clamping should at least be mentioned somewhere (e.g. in the sentence common to the three types)?
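To make concrete where the unclamped formulas fall outside `1:n`, the raw indices can be computed directly. The snippet below is an illustrative sketch with a made-up `n = 4`. It only evaluates the index expressions from the docstring and does not call `quantile`.

```julia
n = 4

raw_type1_p0 = ceil(Int, n * 0.0)        # 0: below the first valid index
raw_type3_p0 = round(Int, n * 0.0)       # 0: below the first valid index
raw_type3_small = round(Int, n * 0.1)    # 0.4 rounds to 0, although p > 0
raw_type2_p1 = floor(Int, n * 1.0 + 1)   # 5: above the last valid index

# Clamping maps each raw index back into 1:n.
clamped = clamp.((raw_type1_p0, raw_type3_p0, raw_type3_small, raw_type2_p1), 1, n)
# clamped == (1, 1, 1, 4)
```

For `type=3` the raw index is 0 for any `p` with `n*p <= 0.5`, not only at `p = 0`, which may be an argument for mentioning the clamping in the docstring.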

@@ -797,6 +806,46 @@ end
@test quantile(v, 1.0, alpha=0.0, beta=0.0) ≈ 21.0
@test quantile(v, 1.0, alpha=1.0, beta=1.0) ≈ 21.0

# tests against R's quantile with type=1
@test quantile(v, 0.0, type=1) === 2
Member Author

@nalimilan Jan 22, 2025

As you can see, the type of the returned value can differ for `type=1` and `type=3`, because we keep the original element type instead of whatever type arithmetic operations would produce. This makes sense, and is even necessary if we want to support types that do not implement arithmetic.

But this means that the inferred return type when passing `type` is a `Union` of two types. Maybe that is OK, since a union this small can be optimized out? If not, there are probably ways to make inference work by combining `Val(type)` with `@inline` in a wrapper function.

EDIT: The situation is slightly worse for `quantile([1, 2], [0.1, 0.2])`, as the inferred type is `Union{Vector{Float64}, Vector{Int64}, Vector{Real}}`.

(Note that when `type` is omitted, the inferred type is concrete as before, so at least there is no regression.)
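One possible shape for the `Val(type)` idea is sketched below. The names `_quantile_kernel` and `sketch_quantile` are hypothetical and not part of this PR; the sketch assumes an already sorted, non-empty vector. Each kernel method has a single concrete return type, so whether the caller sees a concrete type depends on the compiler constant-propagating `type` through the inlined wrapper, which is an assumption that would need checking.

```julia
using Statistics: middle

# Hypothetical kernels: one method per quantile definition, selected at compile time.
_quantile_kernel(v::AbstractVector, p::Real, ::Val{1}) =
    v[clamp(ceil(Int, length(v) * p), 1, length(v))]

function _quantile_kernel(v::AbstractVector, p::Real, ::Val{2})
    n = length(v)
    i = clamp(ceil(Int, n * p), 1, n)
    j = clamp(floor(Int, n * p + 1), 1, n)
    return middle(v[i], v[j])
end

_quantile_kernel(v::AbstractVector, p::Real, ::Val{3}) =
    v[clamp(round(Int, length(v) * p), 1, length(v))]

# Hypothetical wrapper: `type` is a runtime value here, so a concrete inferred
# type at the call site relies on inlining plus constant propagation.
@inline sketch_quantile(v::AbstractVector, p::Real; type::Integer) =
    _quantile_kernel(v, p, Val(Int(type)))

# Inference on each kernel taken separately:
t1 = only(Base.return_types(_quantile_kernel, (Vector{Int}, Float64, Val{1})))
t2 = only(Base.return_types(_quantile_kernel, (Vector{Int}, Float64, Val{2})))
```

Here `t1` is the element type of the input and `t2` is whatever `middle` returns for two elements, so each definition is individually type-stable even though the three of them disagree.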

Development

Successfully merging this pull request may close these issues.

Support for non-interpolating quantile computation
2 participants