entropy with isprobvec check #865
base: master
I think silently incorrect results should be avoided as much as possible. I want to mention, though, that based on my experience with Distributions, this PR trades performance and convenience for safety: due to numerical inaccuracies, the check can fail even if the user computes the input vector in a way that would yield a normalized vector in exact arithmetic.
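To make the tolerance concern concrete, here is a minimal sketch of one way the check could reject an input that is normalized on paper. It assumes a workflow in which uniform weights are stored as `Float32` and later widened to `Float64`; the name `isprobvec_pr` is a hypothetical local copy of the PR's `isprobvec`, repeated only so the snippet runs on its own.

```julia
# Hypothetical local copy of the check from the PR diff, so the snippet is self-contained.
isprobvec_pr(p::AbstractVector{<:Real}) =
    all(x -> x ≥ zero(x), p) && isapprox(sum(p), one(eltype(p)))

# Assumed workflow: uniform weights stored in Float32, then widened to Float64.
p32 = fill(1f-5, 10^5)
p64 = Float64.(p32)

# Default relative tolerance that `isapprox` uses for Float64 arguments.
default_rtol = sqrt(eps(Float64))

# Distance of the widened vector's sum from 1; the Float32 rounding of 1e-5
# is carried over unchanged into every Float64 entry.
deviation = abs(sum(p64) - 1)

println((deviation = deviation, default_rtol = default_rtol, passes = isprobvec_pr(p64)))
```

In this sketch the rounding error of the `Float32` literal is larger than the default `Float64` tolerance, so the widened vector does not pass, while the same vector built directly with `fill(1e-5, 10^5)` does.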
A simple benchmark:
master:
julia> using StatsBase, Zygote, BenchmarkTools
julia> @btime entropy($(fill(1e-5, 10^5)));
472.918 μs (0 allocations: 0 bytes)
julia> _, pb = Zygote.pullback(entropy, fill(1e-5, 10^5));
julia> @btime $pb(1.0);
74.084 μs (16 allocations: 781.58 KiB)
This PR:
julia> using StatsBase, Zygote, BenchmarkTools
julia> @btime entropy($(fill(1e-5, 10^5)));
512.763 μs (0 allocations: 0 bytes)
julia> _, pb = Zygote.pullback(entropy, fill(1e-5, 10^5));
julia> @btime $pb(1.0);
79.138 μs (21 allocations: 781.73 KiB)
A more general comment: Can you add tests?
return -sum(xlogx, p)
end

entropy(p, b::Real; check::Bool = true) = entropy(p; check) / log(b)
Should we check `b` as well? In any case, we need
- entropy(p, b::Real; check::Bool = true) = entropy(p; check) / log(b)
+ entropy(p, b::Real; check::Bool = true) = entropy(p; check = check) / log(b)
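A minimal sketch of how the base-`b` method might look with the explicit `check = check` keyword and an additional check on `b`. The `_sketch` names are hypothetical stand-ins so the snippet runs without StatsBase, and the validation of `b` (positive and different from 1) is an assumption about what such a check could be, not something the PR implements. The explicit keyword form also parses on Julia versions older than 1.5, where the `; check` shorthand is not available.

```julia
# Hypothetical stand-in for `xlogx`: x * log(x), with the value 0 at x == 0.
xlogx_sketch(x::Real) = iszero(x) ? zero(float(x)) : x * log(x)

# Hypothetical local copy of the PR's probability-vector check.
isprobvec_sketch(p::AbstractVector{<:Real}) =
    all(x -> x ≥ zero(x), p) && isapprox(sum(p), one(eltype(p)))

# Entropy in nats, with an optional input check.
function entropy_sketch(p::AbstractVector{<:Real}; check::Bool = true)
    check && !isprobvec_sketch(p) &&
        throw(ArgumentError("p is not a probability vector"))
    return -sum(xlogx_sketch, p)
end

# Entropy in base b; the keyword is forwarded explicitly as `check = check`.
function entropy_sketch(p::AbstractVector{<:Real}, b::Real; check::Bool = true)
    # Assumed validation of the base (not part of the PR).
    check && !(b > 0 && b != 1) &&
        throw(ArgumentError("the base b must be positive and different from 1"))
    return entropy_sketch(p; check = check) / log(b)
end

println(entropy_sketch([0.25, 0.25, 0.25, 0.25], 2))
```

Whether an invalid base should throw or simply propagate whatever `log(b)` returns is a design choice the review leaves open.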
Checks whether `p` is a probability vector, i.e. p[i] >= 0 for each index i, and sum(p) ≈ 1.
Taken from `Distributions.isprobvec`."""
isprobvec(p::AbstractVector{<:Real}) = all(x -> x ≥ zero(x), p) && isapprox(sum(p), one(eltype(p)))
It would be nice to support tolerances here due to floating point inaccuracies but I don't see a nice way to forward them to this function.
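One possible way to forward tolerances, sketched under the assumption that extra keyword arguments are simply passed through to `isapprox`. The names `isprobvec_tol` and `entropy_tol` are hypothetical and do not exist in StatsBase or in this PR; whether a catch-all `kwargs...` is an acceptable API is exactly the open question raised above.

```julia
# Hypothetical variant of the check: any keyword arguments (e.g. `rtol`, `atol`)
# are forwarded to `isapprox`.
isprobvec_tol(p::AbstractVector{<:Real}; kwargs...) =
    all(x -> x ≥ zero(x), p) && isapprox(sum(p), one(eltype(p)); kwargs...)

# Hypothetical entropy wrapper that forwards the same keywords to the check.
function entropy_tol(p::AbstractVector{<:Real}; check::Bool = true, kwargs...)
    if check && !isprobvec_tol(p; kwargs...)
        throw(ArgumentError("p is not a probability vector"))
    end
    # Local replacement for `xlogx`, defined as 0 at x == 0.
    return -sum(x -> iszero(x) ? zero(float(x)) : x * log(x), p)
end

# A vector whose sum is slightly off 1 after widening from Float32.
p = Float64.(fill(1f-5, 10^5))

# With a looser relative tolerance the check accepts the input.
println(entropy_tol(p; rtol = 1e-6))
```

A drawback of this design is that the tolerance keywords become part of the public signature of `entropy`, and a misspelled keyword only fails once it reaches `isapprox`.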
Fixes #769. Docstrings added; tests missing.