Error when negative weights or zero sum are used when sampling #834
base: master
Changes from 4 commits
@@ -5,20 +5,24 @@ abstract type AbstractWeights{S<:Real, T<:Real, V<:AbstractVector{T}} <: Abstrac
     @weights name
 
 Generates a new generic weight type with specified `name`, which subtypes `AbstractWeights`
-and stores the `values` (`V<:RealVector`) and `sum` (`S<:Real`).
+and stores the `values` (`V<:RealVector`), the pre-computed `sum` (`S<:Real`) and
+whether all values are `positive`.
 """
 macro weights(name)
     return quote
         mutable struct $name{S<:Real, T<:Real, V<:AbstractVector{T}} <: AbstractWeights{S, T, V}
             values::V
             sum::S
-            function $(esc(name)){S, T, V}(values, sum) where {S<:Real, T<:Real, V<:AbstractVector{T}}
+            positive::Union{Bool, Missing}
nalimilan marked this conversation as resolved.
+            function $(esc(name)){S, T, V}(values, sum, positive=missing) where {S<:Real, T<:Real, V<:AbstractVector{T}}
                 isfinite(sum) || throw(ArgumentError("weights cannot contain Inf or NaN values"))
-                return new{S, T, V}(values, sum)
+                return new{S, T, V}(values, sum, positive)
             end
         end
-        $(esc(name))(values::AbstractVector{T}, sum::S) where {S<:Real, T<:Real} = $(esc(name)){S, T, typeof(values)}(values, sum)
-        $(esc(name))(values::AbstractVector{<:Real}) = $(esc(name))(values, sum(values))
+        $(esc(name))(values::AbstractVector{T},
+                     sum::S=Base.sum(values),
+                     positive::Union{Bool, Missing}=missing) where {S<:Real, T<:Real} =
+            $(esc(name)){S, T, typeof(values)}(values, sum, positive)
     end
 end
 
@@ -53,9 +57,34 @@ Base.getindex(wv::W, ::Colon) where {W <: AbstractWeights} = W(copy(wv.values),
     isfinite(sum) || throw(ArgumentError("weights cannot contain Inf or NaN values"))
     wv.values[i] = v
     wv.sum = sum
+    wv.positive = missing
devmotion marked this conversation as resolved.
     v
 end
 
+function Base.all(f::Base.Fix2{typeof(>=)}, wv::AbstractWeights)

Review comment: This may be overkill, but unfortunately that's the standard way of checking whether all entries in a vector in Julia, so that's the only solution if we want external code to be able to use this feature, without exporting a new [...]

Review comment: Hmm. It's easy to not hit this function, e.g., when using [...] So I think a dedicated separate function would be cleaner and less ambiguous. If one wants to support isnonneg(x::AbstractArray{<:Real}) = all(>=(0), x)

Review comment: What annoys me is that there's no reason why this very basic function would live in and be exported by StatsBase. And anyway if users are not aware of the fast path (be it [...] We could keep this internal for now -- though defining an internal function wouldn't be better than [...]

Review comment: Yes, I would suggest only defining an internal function - that seems sufficient as it's only used in the argument checks internally. Something like [...]

+    if iszero(f.x)
+        if ismissing(wv.positive)
nalimilan marked this conversation as resolved.
+            # sum is significantly faster than all when no entries are negative
+            wv.positive = sum(<(0), wv.values) == 0
+        end
+        return wv.positive
+    else
+        return all(f, wv.values)
+    end
+end
+
+function Base.any(f::Base.Fix2{typeof(<)}, wv::AbstractWeights)
+    if iszero(f.x)
+        if ismissing(wv.positive)
nalimilan marked this conversation as resolved.
+            # sum is significantly faster than all when no entries are negative
+            wv.positive = sum(<(0), wv.values) == 0
+        end
+        return !wv.positive
+    else
+        return any(f, wv.values)
nalimilan marked this conversation as resolved.
+    end
+end
+
 """
     varcorrection(n::Integer, corrected=false)
 
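For reference, the internal-helper alternative raised in the review thread on `Base.all` could look roughly like the sketch below. It is only an illustration under stated assumptions: `CachedWeights`, its `nonneg` field and `setvalue!` are hypothetical names that are not part of StatsBase, and `isnonneg` is the name floated in the review rather than an existing function.

```julia
# Hypothetical stand-in for a weights type; NOT StatsBase's actual definition.
mutable struct CachedWeights{T<:Real}
    values::Vector{T}
    nonneg::Union{Bool, Missing}   # cached check result, `missing` until computed
end

# Convenience constructor: the cache starts out empty.
CachedWeights(values::Vector{<:Real}) = CachedWeights(values, missing)

# Generic fallback, as written in the review thread.
isnonneg(x::AbstractArray{<:Real}) = all(>=(0), x)

# Cached method: scan the values once, then reuse the stored answer.
function isnonneg(w::CachedWeights)
    if ismissing(w.nonneg)
        w.nonneg = all(>=(0), w.values)
    end
    return w.nonneg
end

# Hypothetical mutator: any write must invalidate the cache,
# mirroring `wv.positive = missing` in the diff above.
function setvalue!(w::CachedWeights, v::Real, i::Integer)
    w.values[i] = v
    w.nonneg = missing
    return v
end

w = CachedWeights([0.5, 0.0, 1.5])
isnonneg(w)              # scans the values and caches the result
setvalue!(w, -1.0, 2)    # resets the cache
isnonneg(w)              # rescans after the mutation
```

A dedicated function like this is reached no matter how the caller would otherwise have spelled the predicate, which is the ambiguity the thread is concerned about.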
@@ -333,6 +362,9 @@ end
 
 Base.getindex(wv::UnitWeights{T}, ::Colon) where {T} = UnitWeights{T}(wv.len)
 
+Base.all(f::Base.Fix2{typeof(>=)}, wv::UnitWeights{T}) where {T} = one(T) >= f.x
+Base.any(f::Base.Fix2{typeof(<)}, wv::UnitWeights{T}) where {T} = one(T) < f.x
+
 """
     uweights(s::Integer)
     uweights(::Type{T}, s::Integer) where T<:Real
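The `all`/`any` methods in this diff dispatch on `Base.Fix2{typeof(>=)}` and `Base.Fix2{typeof(<)}`, so only predicates written as `>=(x)` or `<(x)` reach them. A small sketch of what that means in practice; `takes_fast_path` is a hypothetical helper introduced here for illustration only.

```julia
# `>=(0)` and `<(0)` build `Base.Fix2` objects; the fixed second argument is stored in `.x`.
ge0 = >=(0)
lt0 = <(0)

ge0 isa Base.Fix2{typeof(>=)}   # true
lt0 isa Base.Fix2{typeof(<)}    # true
ge0.x                           # 0

# Hypothetical helper: would this predicate select the specialized `all` method
# and also take its zero-threshold branch?
takes_fast_path(f) = f isa Base.Fix2{typeof(>=)} && iszero(f.x)

takes_fast_path(>=(0))          # true
takes_fast_path(>=(1))          # false: dispatches, but uses the generic branch
takes_fast_path(x -> x >= 0)    # false: an anonymous function is not a `Fix2`
```

The last case shows one way a caller can silently bypass the specialized methods even though the predicate is logically identical.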
I think these efraimidis functions were actually one of my first contributions to Julia packages. Fun times, but I'm not surprised that some things can be improved and made more consistent 😄