Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dot-notation: quantile.(d, X) #586

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ron-wolf
Copy link

StatsBase.iqr() calls Distributions.quantile().

iqr(x) = (q = quantile(x, [.25, .75]); q[2] - q[1])

However, calling iqr() emits a warning, reproduced by the following code.

using StatsBase, Statistics, Distributions

prices = 0:20:600
probs = pweights([100-99, 99-99, 99-99, 99-99, 99-99, 99-98, 98-94, 94-90, 90-81, 81-74, 74-72, 72-69, 69-67, 67-64, 64-61, 61-58, 58-55, 55-52, 52-48, 48-44, 44-40, 40-36, 36-31, 31-26, 26-21, 21-15, 15-9, 9-3, 3-0, 0-0, 0-0] / 100)

avg_price = mean(prices, probs)
dev_price = std(prices, probs, mean=avg_price)

price_dist = Normal(avg_price, dev_price)
range = iqr(price_dist)
┌ Warning: `quantile(d::UnivariateDistribution, X::AbstractArray)` is deprecated, use `quantile.(d, X)` instead.
│   caller = iqr(::Normal{Float64}) at scalarstats.jl:340
└ @ StatsBase ~/.julia/packages/StatsBase/548SN/src/scalarstats.jl:340

The warning stems from a line in Distributions.

See the pull request for code that calls iqr(), producing a warning from within StatsBase
@ron-wolf
Copy link
Author

As a fun side note, I had a lot of difficulty finding this line. I tried getting at it using methods(), but I could not find find the returned file on GitHub.

using Distributions

methods(quantile, (UnivariateDistribution, AbstractArray))
# 1 method for generic function "quantile":
[1] quantile(d::Distribution{Univariate,S} where S<:ValueSupport, X::AbstractArray) in Distributions at deprecated.jl:65

The documentation for quantile() is available online, but from the Statistics library, begging the question of why documentation only exists for the re-exported name. Even so, the source code link has been broken as of Julia 1.0. Compare the “source code” link in the v1.0 documentation and the v0.7 documentation.

I’m not familiar enough with Julia’s development to ascertain where this additional bug should be reported, so I‘m attaching it here so I’m not the only soul who knows about it. I also wasn’t sure what package versions I’m using, so if those are needed, any guidance would be appreciated.

src/scalarstats.jl Outdated Show resolved Hide resolved
@nalimilan
Copy link
Member

Good catch. Can you also change nquantile?

The documentation for quantile() is available online, but from the Statistics library, begging the question of why documentation only exists for the re-exported name. Even so, the source code link has been broken as of Julia 1.0. Compare the “source code” link in the v1.0 documentation and the v0.7 documentation.

I’m not familiar enough with Julia’s development to ascertain where this additional bug should be reported, so I‘m attaching it here so I’m not the only soul who knows about it. I also wasn’t sure what package versions I’m using, so if those are needed, any guidance would be appreciated.

Yes, that's an unfortunate consequence of the fact that Statistics.jl has been moved to a separate repository. At some point the manual will probably be changed to point to the separate repository too.

@nalimilan
Copy link
Member

Sorry, I just realized that the current form is faster when the input is an array, as it sorts the data only once, while the proposed form would do it for each requested quantile. I think it would make sense to undeprecate that method in Distributions for consistency with Statistics. Can you file an issue there to discuss it?

Broadcast only over probabilities (second argument), not collection (first argument)

Co-authored-by: Milan Bouchet-Valat <[email protected]>
@ron-wolf
Copy link
Author

@nalimilan

Yes, that's an unfortunate consequence of the fact that Statistics.jl has been moved to a separate repository. At some point the manual will probably be changed to point to the separate repository too.

I’m not sure I completely understand. I get that the code has been separated out, while the docs have yet to be ported. But what used to be the case—did Statistics used to be under Distributions?

Can you file an issue there to discuss it?

Yes, will do! Does that mean this PR should be closed?

@nalimilan
Copy link
Member

I’m not sure I completely understand. I get that the code has been separated out, while the docs have yet to be ported. But what used to be the case—did Statistics used to be under Distributions?

No. What used to be the case is that Statistics lived in the Julia repository.

Yes, will do! Does that mean this PR should be closed?

We can keep it open for now.

ron-wolf added a commit to ron-wolf/Distributions.jl that referenced this pull request Aug 17, 2020
Needed by StatsBase.jl for arrays

More info: JuliaStats/StatsBase.jl#586 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants