[Numpy-discussion] Re: np.where and ZeroDivisionError: float division by zero

2024-04-26 Thread Lucas Colley
> What you are hoping for here is known as "short circuit" or "lazy" evaluation.

In SciPy, we have the private utility function `_lazywhere`[1] for this.

Cheers,
Lucas

[1] https://github.com/scipy/scipy/blob/f44326023dc51758495491fc9f06858fd38358a0/scipy/_lib/_util.py#L88-L156
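
For readers of the archive, the masking approach described in the quoted message below can be sketched roughly as follows. This is only an illustration, not SciPy's `_lazywhere` itself: the helper name `masked_score`, the column name `"rank"`, and the sample values are assumptions made for the example, and `NaN` is used as the fill value instead of `None` so that the result stays a float array. The expression is evaluated only on the rows where the mask is true, so no division by zero ever happens.

```python
import numpy as np
import pandas as pd

# Hypothetical data loosely modelled on the question; "rank" is an assumed column name.
df_score = pd.DataFrame({
    "number": [12.0, 12.0, 0.0, 0.0, 12.0],
    "rank": [1.0, 2.0, 3.0, 4.0, 5.0],
})


def masked_score(number, rank):
    """Evaluate the score only where `number` is non-zero (hypothetical helper)."""
    number = np.asarray(number, dtype=float)
    rank = np.asarray(rank, dtype=float)
    out = np.full(number.shape, np.nan)  # fill value for rows where number == 0
    mask = number != 0                   # explicit Boolean mask
    # The division only ever sees the selected, non-zero entries.
    out[mask] = 100 - (100 * rank[mask] - 50) / number[mask]
    return out


score = masked_score(df_score["number"], df_score["rank"])
print(score)
```

As a side note, float64 NumPy arrays normally emit a `RuntimeWarning` on division by zero rather than raising, so a `ZeroDivisionError` may suggest that the original column held Python objects (object dtype). That is a guess from the error message, not something confirmed in the thread.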

> On 26 Apr 2024, at 00:18, rosko37  wrote:
> 
> What you are hoping for here is known as "short circuit" or "lazy" 
> evaluation. Namely, this would work if np.where(cond, x, y) only evaluates x 
> if cond is true and only evaluates y if cond is false. In this case, not only 
> can it handle situations where one of the true/false cases "breaks" one of 
> the possible return values, it would also potentially save a lot of work if 
> one of x or y is expensive to evaluate. This would clearly be the case if 
> there were an explicit branch in the code, i.e. something of the form 
> 
> for row in score_df:
>     if number[row] == 0:
>         return None
>     else:
>         return 100 - ((100*rank_column[row] - 50)/number[row])
> 
> It COULD also be true for a vectorized operation IF it is implemented that 
> way. However, numpy.where is NOT implemented like this. Its parameters are 
> ALL numpy arrays themselves, and are each precomputed separately. Only then 
> is the Boolean array that represents the outcomes of the conditional 
> "combined" with the other two arrays x and y to produce the result.
> 
> What you want is to do something along the lines of fancy indexing, where you 
> evaluate the condition on the array to get an explicit Boolean mask, and then 
> use this to select a "slice" (not a real contiguous slice, but a subset) of 
> rows of the dataframe to pass to the expression that may break for zero 
> values. 
> 
> On Thu, Apr 25, 2024 at 5:19 AM 840362492--- via NumPy-Discussion 
> <numpy-discussion@python.org> wrote:
>> In my code, I use the following calculation for a column in the dataframe: 
>> np.where(df_score['number'] != 0, 100 - ((100 * df_score[rank_column] 
>> - 50) / df_score['number']), None). I have used df_score['number'] != 0, but 
>> the code still fails with ZeroDivisionError: float division by zero, even if 
>> I change df_score['number'] != 0 to df_score['number'] > 0. Why?
>> 
>> pandas version:1.1.5 numpy version:1.24.4
>> 
>> Here are my numbers: 12.0 12.0 12.0 12.0 12.0 0.0 
>> 0.0 0.0 0.0 0.0 12.0 12.0 12.0
>> 
>> I want to know why it went wrong and what should be done to fix it? Thank 
>> you for your help
>> ___
>> NumPy-Discussion mailing list -- numpy-discussion@python.org 
>> 
>> To unsubscribe send an email to numpy-discussion-le...@python.org 
>> 
>> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
>> Member address: rosk...@gmail.com 


[Numpy-discussion] Re: A better syntax for using ufunc.at?

2024-07-24 Thread Lucas Colley
While not about exactly your proposed syntax, the discussion in 
https://github.com/data-apis/array-api/issues/609 is relevant.

Cheers,
Lucas
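
For context, here is a small sketch of the difference between buffered fancy-index assignment and the unbuffered `np.add.at` call mentioned in the quoted message below. The array contents are made up for the illustration; the point is only that repeated indices accumulate with `np.add.at` but not with `+=`.

```python
import numpy as np

idxs = np.array([0, 0, 1])   # index 0 appears twice
vals = np.array([1, 1, 1])

# Buffered: arr[idxs] += vals reads arr[idxs], adds, then writes back,
# so the duplicate index 0 is only incremented once.
buffered = np.zeros(3, dtype=int)
buffered[idxs] += vals
print(buffered)    # [1 1 0]

# Unbuffered: np.add.at applies the addition once per index entry,
# so the duplicate index 0 is incremented twice.
unbuffered = np.zeros(3, dtype=int)
np.add.at(unbuffered, idxs, vals)
print(unbuffered)  # [2 1 0]
```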

> On 24 Jul 2024, at 11:49, Oras P.  wrote:
> 
> I am aware that to do unbuffered addition operation, I can use `np.add.at` 
> like this:
> ```
> np.add.at(arr, idxs, vals)
> ```
> I think this syntax looks a bit unnatural, and it is not obvious what it does 
> at first glance. An idea I have is to use a custom accessor, like
> ```
> arr.at[idxs] += vals
> # or 
> arr.unbuffered[idxs] += vals
> ```
> While I'm not fluent in Numpy's working mechanisms, this seems possible to 
> implement by just having the method `.at`/`.unbuffered` return a reference to 
> the original array with a special flag set, then make the `+=` operator, etc 
> check this flag and use the unbuffered operation accordingly. 
> 
> Has this kind of feature been proposed at all? I did try to search Github and 
> this mailing list for something similar, but I'm quite new to numpy 
> development, so apologies in advance if this is not the right place to ask.


[Numpy-discussion] Re: ENH: Uniform interface for accessing minimum or maximum value of a dtype

2024-08-25 Thread Lucas Colley
+1 for the general idea!

It may be nice to have such a function which sits at the top level of the API, 
to fit into 
https://data-apis.org/array-api/draft/API_specification/data_type_functions.html 
nicely. However, ‘min_value’ or ‘min’ won’t do then - we’d probably need to 
include ‘dtype’ in the name somewhere. But I don’t really like 
`np.min_dtype(dt)`. Maybe `np.min_dtype_value(dt)`?

Cheers,
Lucas
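
To make the branching described in the quoted message below concrete, here is a minimal sketch of what users currently have to write by hand. The helper name `dtype_min_max` is hypothetical and is not a NumPy function or any of the proposed spellings; it only wraps the existing `np.iinfo` and `np.finfo` calls.

```python
import numpy as np


def dtype_min_max(dt):
    """Return (min, max) for a boolean, integer, or floating dtype.

    Hypothetical helper for illustration; not part of NumPy.
    """
    dt = np.dtype(dt)
    if dt.kind == "b":            # boolean
        return False, True
    if dt.kind in ("i", "u"):     # signed / unsigned integer
        info = np.iinfo(dt)
        return info.min, info.max
    if dt.kind == "f":            # floating point
        info = np.finfo(dt)
        return info.min, info.max
    raise TypeError(f"no min/max defined for dtype {dt}")


print(dtype_min_max("int32"))
print(dtype_min_max(np.uint8))
print(dtype_min_max(np.float32))
```

A uniform interface along the lines discussed in this thread would let callers skip the `kind` check entirely.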

> On 25 Aug 2024, at 20:59, Carlos Martin  wrote:
> 
> As discussed 
> [here](https://github.com/numpy/numpy/issues/5032#issuecomment-1830838701), 
> [here](https://github.com/numpy/numpy/issues/5032#issuecomment-2307927804), 
> and 
> [here](https://github.com/google/jax/issues/18661#issuecomment-1829031914), 
> I'm interested in a uniform interface for accessing the minimum or maximum 
> value of a given dtype.
> 
> Currently, this requires branching on the type of dtype (boolean, integer, or 
> floating point) and then (for the latter two) calling either 
> [iinfo](https://numpy.org/doc/stable/reference/generated/numpy.iinfo.html) or 
> [finfo](https://numpy.org/doc/stable/reference/generated/numpy.finfo.html), 
> respectively. It would be more ergonomic to have a single, uniform interface 
> for accessing this information that is dtype-independent.
> 
> Possible interfaces include:
> 
> ```python3
> import numpy as np
> dt = np.dtype('int32')
> 
> dt.min
> np.dtypes.info(dt).min
> np.dtypes.min(dt)
> np.dtypes.min_value(dt)
> ```


[Numpy-discussion] Re: ENH: Uniform interface for accessing minimum or maximum value of a dtype

2024-08-26 Thread Lucas Colley
Or how about `np.dtype_info(dt)`, which could return an object with attributes 
like `min` and `max`. Would that be possible?


[Numpy-discussion] Re: Add diagonal offset argument to all functions that are missing it

2025-02-11 Thread Lucas Colley via NumPy-Discussion
Ralf Gommers wrote:
> This sounds quite reasonable to me. The `k=0` keyword is quite badly named,
> which is my one concern. Especially when tacking it on at the end of a
> signature with already 3-4 keywords, it's not a good name. How about
> something like `diag_offset`?

FWIW, we chose `offset` for `array_api_extra.create_diagonal`, instead of 
inheriting `k` from `np.diag`. `np.diagonal` and `np.linalg.trace` also use 
`offset`. Given that all of the proposed functions apart from `np.identity` 
already have "diag" as a substring of their name, I think just `offset` would 
be fine. What else could `offset` mean in the case of `np.identity`?

I suppose there is an argument for actually leaving `np.identity` as is—if 
someone wants a square off-diagonal matrix of ones, which isn't an identity 
matrix, their code might be more readable with `np.diag(np.ones(...), k=1)` or 
just `np.eye` instead, right? Maybe it is weird for "eye" but not "identity" to 
have this capability, though.

Cheers,
Lucas
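
As a quick illustration of the naming split mentioned above, the following sketch uses only existing NumPy functions: `np.eye` and `np.diag` take the offset as `k`, while `np.diagonal` and `np.trace` call it `offset`. The 3x3 example array is arbitrary.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Creation functions currently spell the diagonal offset `k`.
super_eye = np.eye(3, k=1)
super_diag = np.diag(np.ones(2), k=1)
print(np.array_equal(super_eye, super_diag))  # True

# Extraction / reduction functions spell the same idea `offset`.
print(np.diagonal(a, offset=1))  # [1 5]
print(np.trace(a, offset=1))     # 6
```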


[Numpy-discussion] Re: Making `T` property Array API compatible

2025-04-12 Thread Lucas Colley via NumPy-Discussion
> The new discrepancy between `arr.T` and `arr.transpose()` is justified, as
> `T` is defined by the Array API, where `transpose` isn't and should retain
> the existing behavior.


The other side of the coin here is that this change would fix the
discrepancy between `arr.T` and the functions `np.matrix_transpose` and
`np.linalg.matrix_transpose`, which implement batched transpose over
matrices in the 2 innermost dimensions, rather than reversing all axes.

In [10]: X = np.stack((np.eye(2), np.eye(2)))

In [12]: X.T
Out[12]:
array([[[1., 1.],
        [0., 0.]],

       [[0., 0.],
        [1., 1.]]])

In [13]: np.matrix_transpose(X)
Out[13]:
array([[[1., 0.],
        [0., 1.]],

       [[1., 0.],
        [0., 1.]]])

Cheers,
Lucas

On 12 Apr 2025, at 11:14, Mateusz Sokol  wrote:

Hi all!

The Array API standard states that `T` property should only be applied to
2-dimensional arrays, in all other cases it should raise an error:
https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.T

To ensure that NumPy also follows this rule, I opened a PR that raises a
warning for `arr.T` for non-2-dimensional arrays and scalars:
https://github.com/numpy/numpy/pull/28678

For non-2-dimensional arrays, the replacement for `arr.T` can be either:
Array API compatible, namely `np.permute_dims(arr, range(arr.ndim)[::-1])`,
or shorter, NumPy specific: `arr.transpose()`.

The new discrepancy between `arr.T` and `arr.transpose()` is justified, as
`T` is defined by the Array API, where `transpose` isn't and should retain
the existing behavior.

Please share your thoughts!

Best regards,
Mateusz


[Numpy-discussion] Re: Making `T` property Array API compatible

2025-04-14 Thread Lucas Colley via NumPy-Discussion
If NumPy were to make a move on the deprecation, then I think it would be 
reasonable to change the standard from

> If the array instance is not two-dimensional, an error should be raised.

to “if the array instance is not two-dimensional, behaviour should match `.mT`, 
or an error should be raised.”
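
To illustrate what "match `.mT`" would mean in practice, here is a rough sketch comparing the two behaviours on a 3-D array. It deliberately uses `arr.transpose()` and `np.swapaxes`, which exist in older NumPy releases too, as stand-ins for the current `.T` behaviour and for `.mT` respectively; that substitution is my assumption for portability, not something proposed in the thread.

```python
import numpy as np

arr = np.zeros((2, 3, 4))

# Current `.T` semantics for N-D arrays: reverse all axes.
reversed_all = arr.transpose()
reversed_explicit = np.transpose(arr, tuple(range(arr.ndim))[::-1])
print(reversed_all.shape)       # (4, 3, 2)
print(reversed_explicit.shape)  # (4, 3, 2)

# `.mT`-style semantics: swap only the last two axes (batched matrix transpose).
batched = np.swapaxes(arr, -1, -2)
print(batched.shape)            # (2, 4, 3)

# For 2-D input the two definitions agree, which is why only N-D use is affected.
m = np.arange(6).reshape(2, 3)
print(np.array_equal(m.transpose(), np.swapaxes(m, -1, -2)))  # True
```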

> On 14 Apr 2025, at 08:35, Sebastian Berg  wrote:
> 
> On Sat, 2025-04-12 at 10:10 +, Mateusz Sokol wrote:
>> Hi all!
>> 
>> The Array API standard states that `T` property should only be
>> applied to 2-dimensional arrays, in all other cases it should raise
>> an error:
>> https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.T
>> 
>> To ensure that NumPy also follows this rule, I opened a PR that
>> raises a warning for `arr.T` for non-2-dimensional arrays and
>> scalars: https://github.com/numpy/numpy/pull/28678
> 
> 
> There was once a surprising amount of resistance to doing this exact
> change a long time ago in NumPy and that is exactly why it never
> happened earlier and that is why we have `.mT` and not just `.T` to
> begin with.
> 
> I am still happy with slowly deprecating it with a message to use
> `arr.transpose()`, `np.moveaxis()`, or `.mT` when it applies.
> Maybe making sure that `.T` keeps working at least for 2-D, and think
> about what to do for 1-D (probably an error), although I am not sure
> about this unless there is a long term plan to consider allowing `.T`
> to mean the same as `.mT`.
> 
> 
> That said, it is not correct to say there is any incompatibility!  This
> is an opinionated recommendation at most and if it was more it would
> probably be a mistake there.
> 
> - Sebastian
> 
> 
> 
>> 
>> For non-2-dimensional arrays, the replacement for `arr.T` can be
>> either: Array API compatible, namely `np.permute_dims(arr,
>> range(arr.ndim)[::-1])`, or shorter, NumPy specific:
>> `arr.transpose()`.
>> 
>> The new discrepancy between `arr.T` and `arr.transpose()` is
>> justified, as `T` is defined by the Array API, where `transpose`
>> isn't and should retain the existing behavior.
>> 
>> Please share your thoughts!
>> 
>> Best regards,
>> Mateusz