[Numpy-discussion] Indexing issue with ndarrays

2016-08-25 Thread Joseph Fox-Rabinovitz
This issue recently came up on Stack Overflow:
http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array.
The poster attempted to index an ndarray with a pandas boolean Series
object (all False), but the result was as if he had indexed with an array
of integer zeros.
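
A minimal sketch of the symptom, assuming made-up array values and using a plain NumPy boolean array in place of the pandas Series: casting the mask to integers by hand reproduces the "array of integer zeros" result on any NumPy version.

```python
import numpy as np

a = np.array([10, 20, 30])                  # made-up data for illustration
mask = np.array([False, False, False])

# Boolean indexing: keeps elements where the mask is True, so nothing here.
as_bool = a[mask]

# Integer indexing with the same values cast to intp: each False becomes 0,
# so element 0 is selected three times.
as_int = a[mask.astype(np.intp)]

print(as_bool)  # []
print(as_int)   # [10 10 10]
```

The second result is what the poster saw; the first is what he expected.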

Can someone explain this behavior? I can see two obvious possibilities:

   1. ndarray checks whether the input to __getitem__ is of exactly the
   right type, rather than using isinstance.
   2. pandas actually uses a wider datatype than boolean internally, so
   indexing with the series is in fact indexing with an integer array.

In my attempt to reproduce the poster's results, I got the following
warning:

FutureWarning: in the future, boolean array-likes will be handled as a
boolean array index

This indicates that the issue is probably #1 and that a fix is already on
the way. Please correct me if I am wrong. Also, where does the code for
ndarray.__getitem__ live?

Thanks,

-Joe
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Indexing issue with ndarrays

2016-08-25 Thread Sebastian Berg
On Thu, 2016-08-25 at 10:36 -0400, Joseph Fox-Rabinovitz wrote:
> This issue recently came up on Stack Overflow:
> http://stackoverflow.com/questions/39145795/masking-a-series-with-a-boolean-array.
> The poster attempted to index an ndarray with a pandas boolean Series
> object (all False), but the result was as if he had indexed with an
> array of integer zeros.
> 
> Can someone explain this behavior? I can see two obvious
> possibilities:
>    1. ndarray checks whether the input to __getitem__ is of exactly the
>    right type, rather than using isinstance.
>    2. pandas actually uses a wider datatype than boolean internally, so
>    indexing with the series is in fact indexing with an integer array.

You are overthinking it ;). The reason is quite simply that the logic
used to be:

 * Boolean array? -> think about boolean indexing.
 * Everything "array-like" (not caught earlier) -> convert to `intp`
   array and do integer indexing.

Now you might wonder why, but it is probably just because boolean
indexing was tacked on later.
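
The two-step logic above can be sketched roughly as follows. This is only an emulation under stated assumptions, not NumPy's actual implementation: `old_style_getitem` is a hypothetical helper, and a plain Python list stands in for any array-like that is not itself a boolean ndarray.

```python
import numpy as np

def old_style_getitem(arr, index):
    """Hypothetical emulation of the old dispatch order (not NumPy's code)."""
    # Step 1: only a genuine boolean ndarray triggers boolean indexing.
    if isinstance(index, np.ndarray) and index.dtype == np.bool_:
        return arr[index]
    # Step 2: every other array-like is converted to intp and used as
    # integer indices, so False becomes 0 and True becomes 1.
    return arr[np.asarray(index).astype(np.intp)]

a = np.array([10, 20, 30])                      # made-up data
bool_array = np.array([False, False, False])    # caught by step 1
bool_like = [False, False, False]               # falls through to step 2

print(old_style_getitem(a, bool_array))  # []
print(old_style_getitem(a, bool_like))   # [10 10 10]
```

A boolean array-like that is not an ndarray therefore never reaches the boolean branch, which matches the behavior reported in the original question.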

- Sebastian


> In my attempt to reproduce the poster's results, I got the following
> warning:
>
> FutureWarning: in the future, boolean array-likes will be handled as
> a boolean array index
>
> This indicates that the issue is probably #1 and that a fix is
> already on the way. Please correct me if I am wrong. Also, where does
> the code for ndarray.__getitem__ live?
>
> Thanks,
>
> -Joe