`bool(x == x)` when `x` is masked? #28
Comments
As far as I can tell, the Array API only defines the
I think this is where we can come up with our own strategy. Raising does make sense to me, but we could just as well return I will need to take some more time to understand what equality would mean in the context of
@lucascolley can you weigh in from the perspective of whether masked values should be treated as distinct in the set functions? It's easier to implement the set functions if they are the same; see bottom of #28 (comment).
I think they should be treated as absent from the set. So if a value is masked, it shouldn't contribute to a count in
Maybe. If we want to be able to recreate the original set (including masked elements) from the (Otherwise, it sounds like what you're proposing is that the set functions are as simple as removing the masked values completely, performing the function, and returning the results as
that, or the result of
That is what I was proposing, but I hadn't considered that you may want to reconstruct the original masked array.
Right. The standard says
With that in mind, do you prefer one of the other options?
My preference would be to give up on being able to reconstruct the original array (just throw masked elements away), until somebody makes a feature request. I think it would be a cool thing to think through if there is a use-case. But probably not worth the hassle without one!
There is an option on the table for allowing reconstruction that is simple: treat all masked elements as the same. No headache. I think I will do that for now.
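For concreteness, here is a minimal sketch of what "treat all masked elements as the same" could look like using a sentinel. The function name `unique_masked_as_same` is hypothetical, and the sentinel choice assumes integer data with room in the dtype for the smallest non-negative integer that is absent from the unmasked values. It is not meant as the actual implementation in this project.

```python
import itertools

import numpy as np


def unique_masked_as_same(data, mask):
    """Hypothetical helper: unique values, inverse indices and counts,
    with every masked element collapsed into a single masked entry."""
    data = np.asarray(data).ravel()
    mask = np.asarray(mask, dtype=bool).ravel()

    # Assumption: integer data. Pick the smallest non-negative integer
    # that does not occur among the unmasked values as the sentinel.
    present = set(data[~mask].tolist())
    sentinel = next(i for i in itertools.count() if i not in present)

    # Overwrite the masked slots with the sentinel, then run a plain unique.
    filled = data.copy()
    filled[mask] = sentinel
    values, inverse, counts = np.unique(
        filled, return_inverse=True, return_counts=True
    )

    # The sentinel's slot in the result is the single masked "value".
    values = np.ma.masked_array(values, mask=(values == sentinel))
    return values, inverse, counts
```

Because all masked inputs map to the same slot of `values`, indexing `values[inverse]` gives back an array with the original mask, which is the reconstruction property discussed above.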
According to NumPy:
Yet:
Let me try to interpret this:
- `bool(y == y)` is False because `y == y` is a masked 0D thing, and any masked 0D thing is "falsy".
- `bool(np.ma.all(x==x))` is False because `np.ma.all(x==x)` reduces to a masked 0D thing.
- `bool(np.all([]))` is True for the same reason that `all([])` is True.
- `bool(x == x)` being True is erroneous all the ways I try to look at it. `x == x` is a 1D thing with one masked element. I think `bool(x == x)` is most comparable to `bool([])`, which should be `False`.

All that said, I think it was a choice to make `bool` of a masked 0D thing "falsy". Similarly, what should be returned when we try to get an `int`/`float`/`complex` out of a masked number? It would probably be most consistent to raise an error about the ambiguity whenever `bool`/`int`/`float`/`complex` is used on a masked array with only a masked element.

All this came up when I was thinking about whether masked elements should be treated as the same or distinct when computing `unique_` functions.

The note in the documentation (e.g. here) of these functions states:
But I think this is still ambiguous for the reasons above, so I think we have a choice.
I'll just say that treating all masked elements as the same is easier because then the implementation is as simple as replacing the masked values with a sentinel that is not already in the data. If the masked elements are distinct, I see a few options:
- … `bool` has only two values)
- … `np.nan` as the sentinel value because `nan`s are treated as distinct (but this only works for inexact types when there are no NaNs in the data),
- `values` - one masked element for each masked element of the input
- `indices` - the indices of all the masked elements of the input (e.g. `nonzero(mask)`)
- `count` - a separate `1` for each masked element

`inverse_indices` is a bit more complicated because we can't just append to the results of the compressed array. I think we have to:

- … `inverse_indices` from the compressed array
- … `arange(n_nonmasked, n_total)` (because in `values`, the masked elements appear after `n_nonmasked` values).

And all that headache for something that doesn't make much sense to begin with, IMO!
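To make the "distinct" bookkeeping concrete, here is a rough sketch of one reading of the steps above: run `np.unique` on the compressed (unmasked) data, then append one entry per masked element. The function name `unique_all_masked_distinct` is hypothetical. I am assuming that `n_nonmasked` means the number of unique unmasked values and `n_total` the length of the final `values`, and that the compressed `inverse_indices` are scattered back to the unmasked positions.

```python
import numpy as np


def unique_all_masked_distinct(data, mask):
    """Hypothetical helper: unique_all-style results where every masked
    element counts as its own distinct value."""
    data = np.asarray(data).ravel()
    mask = np.asarray(mask, dtype=bool).ravel()
    n_masked = int(mask.sum())

    # Step 1: ordinary unique on the compressed (unmasked) data.
    u_values, u_indices, u_inverse, u_counts = np.unique(
        data[~mask], return_index=True, return_inverse=True, return_counts=True
    )
    n_nonmasked = u_values.size          # unique unmasked values (assumed meaning)
    n_total = n_nonmasked + n_masked     # length of the final `values`

    unmasked_pos = np.flatnonzero(~mask)
    masked_pos = np.flatnonzero(mask)

    # values: unique unmasked values, then one masked slot per masked input.
    values = np.ma.masked_array(
        np.concatenate([u_values, data[mask]]),
        mask=np.concatenate(
            [np.zeros(n_nonmasked, dtype=bool), np.ones(n_masked, dtype=bool)]
        ),
    )
    # indices: map compressed first-occurrence indices back to input
    # positions, then append the positions of the masked elements.
    indices = np.concatenate([unmasked_pos[u_indices], masked_pos])
    # counts: each masked element contributes a separate 1.
    counts = np.concatenate([u_counts, np.ones(n_masked, dtype=u_counts.dtype)])

    # inverse_indices: scatter the compressed inverse into the unmasked
    # positions; masked positions point at their own appended slots.
    inverse = np.empty(data.size, dtype=np.intp)
    inverse[~mask] = u_inverse
    inverse[mask] = np.arange(n_nonmasked, n_total)
    return values, indices, inverse, counts
```

With this layout `values[inverse]` should still reproduce the input, mask included, but it takes noticeably more bookkeeping than the single-sentinel version.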