[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Fri, 2023-12-22 at 18:01 -0500, Marten van Kerkwijk wrote:
> Hi Martin,
>
> I agree it is a long-standing issue, and I was reminded of it by your
> comment. I have a draft PR at https://github.com/numpy/numpy/pull/25476
> that does not change the old behaviour, but allows you to pass in a
> start-stop array which behaves more sensibly (exact API TBD).
>
> Please have a look!

That looks nice. I don't have a clear feeling on the order of items; if
we think of it in terms of `(start, stop)`, there was also the idea
voiced to simply add another name, in which case you would allow start
and stop to be separate arrays.

Of course, if we go with your `slice(start, stop)` idea, that also
works, although passing them as separate parameters seems nice too.

Adding another name (if we can think of one, at least) seems pretty
good to me, since I suspect we would add docs to suggest not using
`reduceat`.

One small thing about the PR: I would like to distinguish between
`default` and `initial`. I.e. the default value is used only for empty
reductions, while the initial value should always be used (unless you
pass both, which we don't do for normal reductions, though). I suppose
the machinery isn't quite set up to do both side by side.

- Sebastian

> Marten
>
> Martin Ling writes:
>
> > Hi folks,
> >
> > I don't follow numpy development in much detail these days, but I
> > see that there is a 2.0 release planned soon.
> >
> > Would this be an opportunity to change the behaviour of 'reduceat'?
> >
> > This issue has been open in some form since 2006!
> > https://github.com/numpy/numpy/issues/834
> >
> > The current behaviour was originally inherited from Numeric, and
> > makes reduceat often unusable in practice, even where it should be
> > the perfect, concise, efficient solution. But it has been impossible
> > to change it without breaking compatibility with existing code.
> >
> > As a result, horrible hacks are needed instead, e.g. my answer here:
> > https://stackoverflow.com/questions/57694003
> >
> > Is this something that could finally be fixed in 2.0?
> >
> > Martin
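P.S. For anyone who has not run into it, the `reduceat` quirk Martin
describes is easy to demonstrate (a minimal sketch of the *current*
behaviour, for illustration only):

    import numpy as np

    a = np.arange(8)  # [0, 1, 2, 3, 4, 5, 6, 7]

    # Each index is a start; the next index is the stop, and the last
    # slice runs to the end of the array:
    np.add.reduceat(a, [0, 4, 6])  # -> [ 6,  9, 13]

    # But an empty slice is impossible: if indices[i] >= indices[i+1],
    # the result is a[indices[i]] rather than the identity (0 for add):
    np.add.reduceat(a, [2, 2, 6])  # -> [ 2, 14, 13]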
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
Hi Sebastian,

> That looks nice. I don't have a clear feeling on the order of items;
> if we think of it in terms of `(start, stop)`, there was also the
> idea voiced to simply add another name, in which case you would allow
> start and stop to be separate arrays.

Yes, one could add another method. Or perhaps even add a new argument
to `.reduce` instead (say `slices`). But this seemed the simplest
route...

> Of course, if we go with your `slice(start, stop)` idea, that also
> works, although passing them as separate parameters seems nice too.
>
> Adding another name (if we can think of one, at least) seems pretty
> good to me, since I suspect we would add docs to suggest not using
> `reduceat`.

If we wanted to, even with the present PR it would be possible to (very
slowly) deprecate the use of a list of single integers. But I'm trying
to go with just making the existing method more useful.

> One small thing about the PR: I would like to distinguish between
> `default` and `initial`. I.e. the default value is used only for
> empty reductions, while the initial value should always be used
> (unless you pass both, which we don't do for normal reductions,
> though). I suppose the machinery isn't quite set up to do both side
> by side.

I just followed what is done for reduce, where a default could also
have made sense given that `where` can exclude all inputs along a given
row. I'm not convinced it would be necessary to have both, though it
would not be hard to add.

All the best,

Marten
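P.S. To make the discussion concrete, the call I have in mind would
look roughly like the sketch below. The exact API is still TBD, so
treat the pair-of-indices spelling as purely hypothetical; the list
comprehension shows the intended result with current NumPy:

    import numpy as np

    a = np.arange(8)

    # Hypothetical: explicit (start, stop) pairs instead of bare
    # indices, so that empty slices become expressible.
    indices = [(0, 4), (4, 4), (4, 8)]
    # np.add.reduceat(a, indices)  # would give [6, 0, 22]

    # The intended result, computed slice by slice today:
    result = np.array([np.add.reduce(a[start:stop])
                       for start, stop in indices])
    print(result)  # [ 6  0 22]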
[Numpy-discussion] Re: Fixing definition of reduceat for Numpy 2.0?
On Sat, 2023-12-23 at 09:56 -0500, Marten van Kerkwijk wrote:
> Hi Sebastian,
>
> > That looks nice. I don't have a clear feeling on the order of
> > items; if we think of it in terms of `(start, stop)`, there was
> > also the idea voiced to simply add another name, in which case you
> > would allow start and stop to be separate arrays.
>
> Yes, one could add another method. Or perhaps even add a new argument
> to `.reduce` instead (say `slices`). But this seemed the simplest
> route...

Yeah, I don't mind this; it doesn't stop us from adopting a better idea
either. Adding to `.reduce` could be fine, but overall I actually think
a new name, or re-using `reduceat`, is nicer than overloading `.reduce`
further -- even `reduce_slices()` would do.

> > I suppose the machinery isn't quite set up to do both side by side.
>
> I just followed what is done for reduce, where a default could also
> have made sense given that `where` can exclude all inputs along a
> given row. I'm not convinced it would be necessary to have both,
> though it would not be hard to add.

Sorry, I misread the code: you do use `initial` the same way as in
reductions; I had thought it wasn't used when there were multiple
elements. I.e. it is used for non-empty slices also.

There is still a little annoyance when `initial=` isn't passed, since
the default and initial values can differ (this is the case for object
add, for example: the default is `0`, but it is not used as the initial
value for non-empty reductions).

Anyway, it's a small detail to some degree, even if it may be finicky
to get right. At the moment it seems passing `dtype=object` somehow
changes the result also.

- Sebastian

> All the best,
>
> Marten
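P.S. The default/initial distinction for object add, with plain
reductions in current NumPy (a small illustration):

    import numpy as np

    # For an empty reduction, the *default* (the identity, 0) is
    # returned:
    np.add.reduce(np.array([], dtype=object))  # -> 0

    # That default is not folded into non-empty object reductions,
    # which is why summing strings works at all:
    np.add.reduce(np.array(["a", "b"], dtype=object))  # -> 'ab'

    # An explicit *initial* value, by contrast, is always used:
    np.add.reduce(np.array([1, 2], dtype=object), initial=10)  # -> 13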
[Numpy-discussion] Re: Change definition of complex sign (and use it in copysign)
To me this sounds like a reasonable change. It does seem like there is
a return value which is more sensible than the alternatives. And the
fact that sympy is already doing this indicates that the same
conclusion has been reached more than once. I am not dealing much with
complex numbers at the moment, but I can see this being of non-trivial
convenience when I do.

Regards,
DG

> On 22 Dec 2023, at 15:48, Oscar Benjamin wrote:
>
> On Fri, 22 Dec 2023 at 13:25, wrote:
> >
> > Anyway, to me the main question would be whether this would break
> > any workflows (though it is hard to see how it could, given that
> > the previous definition was really rather useless...).
>
> SymPy already defines sign(z) as z/abs(z) (with sign(0) = 0), as
> proposed here.
>
> Checking this, I see that the current mismatch causes a bug when
> SymPy's lambdify function is used to evaluate the sign function with
> NumPy:
>
> In [12]: sign(z).subs(z, 1+1j)
> Out[12]: 0.707106781186548 + 0.707106781186548⋅ⅈ
>
> In [13]: lambdify(z, sign(z))(1+1j)  # uses numpy
> Out[13]: (1+0j)
>
> The proposed change to NumPy's sign function would fix this bug.
>
> --
> Oscar
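P.S. For concreteness, the proposed semantics can be written in a few
lines (a sketch of z / |z| with sign(0) = 0, not the actual
implementation):

    import numpy as np

    def complex_sign(z):
        # Sketch only: z / |z|, with sign(0) = 0.
        z = np.asarray(z)
        abs_z = np.abs(z)
        # Guard the division so z == 0 maps cleanly to 0.
        return np.where(abs_z == 0, 0,
                        z / np.where(abs_z == 0, 1, abs_z))

    print(complex_sign(1 + 1j))  # (0.7071067811865475+0.7071067811865475j)
    print(complex_sign(0j))      # 0j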
[Numpy-discussion] ENH: ignore_nan parameter in np.average
Hi there,

I would love some input on this issue:
https://github.com/numpy/numpy/issues/21375

I have created a draft PR which I believe resolves it:
https://github.com/numpy/numpy/pull/25474

I decided to go with adding `ignore_nan` to `np.average` rather than
adding `weights` to `np.nanmean`, as it seemed logical and was the path
of least resistance. Please let me know what you think!

-awestm
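P.S. Roughly, the intended usage (parameter name as in the draft PR;
details may still change):

    import numpy as np

    a = np.array([1.0, np.nan, 3.0])
    w = np.array([1.0, 1.0, 2.0])

    # Proposed (draft PR, subject to change):
    # np.average(a, weights=w, ignore_nan=True)  # -> 2.333...

    # Today's equivalent, masking NaNs by hand:
    mask = ~np.isnan(a)
    print(np.average(a[mask], weights=w[mask]))  # 2.3333333333333335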
[Numpy-discussion] Re: ENH: ignore_nan parameter in np.average
I think the logical choice would be to allow a weighted average in the
nan-functions, not to change `np.average`. In particular, I think it is
bad API to add options for such exclusions; we have another open issue
about infinity in the data, which leads to NaN even with weight 0
(because `0 * inf = nan`; see
https://github.com/numpy/numpy/issues/25362). Should that get its own
option too?

It also seems a case that is *very* easily solved by the user with a
wrapper, or, as noted in the issue, by creating a mask and using
`np.ma.average`. Or, indeed, by just calculating
`np.nansum(data * weight, axis=...) / np.nansum(weight, axis=...)`
(a fuller sketch of such a wrapper is at the end of this message).

Anyway, my 2¢ would be to think about whether we actually need this
addition and, if we decide we do, to see how it would fit with the
`nan*` functions. This might mean introducing `np.nanaverage`, since
the Array API is fairly clear that `mean()` should be unweighted.
Indeed, if written such that the NaN skipping is done after the data
and weights are multiplied, that would solve
https://github.com/numpy/numpy/issues/25362 too.

-- Marten

p.s. My other reason for hesitation is that generally, when you get a
NaN, something is actually wrong...

> I would love some input on this issue:
> https://github.com/numpy/numpy/issues/21375
>
> I have created a draft PR which I believe resolves it:
> https://github.com/numpy/numpy/pull/25474
>
> I decided to go with adding `ignore_nan` to `np.average` rather than
> adding `weights` to `np.nanmean`, as it seemed logical and was the
> path of least resistance. Please let me know what you think!
>
> -awestm
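One possible spelling of such a wrapper (`nanaverage` is a hypothetical
name, and note that for a proper average the denominator also has to
drop the weights of the excluded terms):

    import numpy as np

    def nanaverage(data, weights, axis=None):
        # NaN skipping is done *after* multiplying, so a NaN produced
        # by 0 * inf (gh-25362) is dropped along with ordinary NaNs.
        prod = data * weights
        good = ~np.isnan(prod)
        return (np.nansum(prod, axis=axis)
                / np.sum(np.where(good, weights, 0), axis=axis))

    a = np.array([1.0, np.nan, 3.0])
    w = np.array([1.0, 1.0, 2.0])
    print(nanaverage(a, w))  # 2.3333333333333335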