[Numpy-discussion] Re: Percentile/Quantile "interpolation" refactor

2021-11-04 Thread Sebastian Berg
On Wed, 2021-10-13 at 10:25 -0500, Sebastian Berg wrote:
> Hi all,
> 
> after a long time Abel has helped us and refactored the quantile and
> percentile functions' `interpolation` keyword.
> 

This PR is now merged and will be included in the upcoming 1.22 release.
Please don't hesitate to speak up if you have any concerns about it; all
notes from the old email remain unchanged.

There is a good chance that the documentation could use a bit of
revising so input would be greatly appreciated!

The one thing that I definitely plan to do before the next release is
to rename the `interpolation` keyword argument to `method`. `method`
seems a much clearer name, and the rename forces users who do not use
the default to consider switching to a more standard method.
(Only the default version is really described in literature.)
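
A keyword rename of this kind is often handled with a small compatibility
shim. The following is only a minimal sketch under stated assumptions, not
NumPy's implementation: `percentile_compat` and `_percentile` are
hypothetical names, and only three of the method strings are supported.

```python
import math
import warnings


def _percentile(a, q, method):
    """Tiny stand-in percentile (hypothetical; supports three methods only)."""
    x = sorted(a)
    pos = (len(x) - 1) * q / 100.0  # 0-based fractional position
    lo, hi = math.floor(pos), math.ceil(pos)
    if method == "lower":
        return x[lo]
    if method == "higher":
        return x[hi]
    if method == "linear":
        return x[lo] + (pos - lo) * (x[hi] - x[lo])
    raise ValueError(f"unknown method: {method!r}")


def percentile_compat(a, q, *, method="linear", interpolation=None):
    """Accept the old `interpolation` keyword but steer callers to `method`."""
    if interpolation is not None:
        if method != "linear":
            raise TypeError("pass `method` only, not `interpolation` as well")
        warnings.warn(
            "`interpolation` is deprecated; use `method` instead",
            DeprecationWarning,
            stacklevel=2,
        )
        method = interpolation
    return _percentile(a, q, method)
```

Callers that still pass `interpolation="lower"` keep working but see a
DeprecationWarning, which is the nudge described above.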

Cheers,

Sebastian


> This was long overdue since NumPy implements three (the non-default)
> interpolation methods that appear to be very much non-standard.  On
> the other hand, NumPy currently has no unbiased methods (i.e.
> population estimate).
> 
> There are two main questions right now with respect to the API:
> first, which names to use for the methods, and second, how to deal
> with "outliers".
> 
> 
> The PR
> https://github.com/numpy/numpy/pull/19857#issuecomment-939852134
> adds the methods and currently gives them the following names (sorted
> by the R methods); the names will be used as string identifiers:
> 
> 1. inverted cdf
> 2. averaged inverted cdf
> 3. closest observation
> 4. interpolated inverted cdf
> 5. hazen  (name from wolfram)
> 6. weibull  (name from wolfram)
> 7. linear  (default!  Better name deferred)
> 8. median unbiased
> 9. normal unbiased
> 
> In addition, there are the four we currently have:
> 
> * lower
> * higher
> * nearest
> * midpoint
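
For readers comparing the names above, methods 4 to 9 can be written with
one formula that differs only in two plotting-position parameters. This is
a minimal pure-Python sketch, assuming the standard Hyndman and Fan
(alpha, beta) values for R types 4 to 9; `quantile` and `PARAMS` are
hypothetical names and this is not the code from the PR.

```python
import math

# (alpha, beta) per method; keys mirror the proposed string identifiers.
PARAMS = {
    "interpolated inverted cdf": (0.0, 1.0),  # R type 4
    "hazen": (0.5, 0.5),                      # R type 5
    "weibull": (0.0, 0.0),                    # R type 6
    "linear": (1.0, 1.0),                     # R type 7
    "median unbiased": (1 / 3, 1 / 3),        # R type 8
    "normal unbiased": (3 / 8, 3 / 8),        # R type 9
}


def quantile(data, p, method="linear"):
    """Sample quantile for 0 <= p <= 1 (illustrative helper only)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    x = sorted(data)
    n = len(x)
    if n == 0:
        raise ValueError("data must not be empty")
    alpha, beta = PARAMS[method]
    # 1-based virtual position in the sorted sample.
    h = n * p + alpha + p * (1.0 - alpha - beta)
    # Positions beyond the sample are clipped to the first/last observation.
    h = min(max(h, 1.0), float(n))
    j = math.floor(h)
    g = h - j  # fractional part used for linear interpolation
    lo = x[j - 1]
    hi = x[min(j, n - 1)]
    return lo + g * (hi - lo)
```

With the data 1, 2, ..., 10 the result equals the virtual position itself,
so `quantile(range(1, 11), 0.25, "weibull")` and the same call with
`"linear"` show directly how far apart two methods place the first quartile.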
> 
> Numbers 5 and 6 are named "exclusive" and "inclusive" by Python in
> their `method` keyword argument.  While I like the name `method=` and
> may want to move to it, I am not sure I like "inclusive" and
> "exclusive".
> The current plan was to defer the kwarg rename into a followup,
> although it should be discussed before the next release.
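
For reference, the Python naming mentioned here belongs to the
standard-library function `statistics.quantiles`, whose `method` keyword
accepts "exclusive" (its default) and "inclusive". A small check on ten
points, so the two can be compared against the list above; the sample data
is made up for illustration.

```python
import statistics

data = list(range(1, 11))  # 1, 2, ..., 10

# Quartile cut points under each of the two stdlib methods.
exclusive = statistics.quantiles(data, n=4, method="exclusive")
inclusive = statistics.quantiles(data, n=4, method="inclusive")

print("exclusive:", exclusive)
print("inclusive:", inclusive)
```

On this sample the "exclusive" cut points sit at 1-based positions
(n + 1) * p and the "inclusive" ones at (n - 1) * p + 1, which may help
when matching them to the numbered methods.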
> 
> 
> The second main question is how to deal with outliers (this does not
> affect the default method 7, which finds the sample quantiles and not
> a population estimate).  Wikipedia says this:
> 
>     Packages differ in how they estimate quantiles beyond the lowest
>     and highest values in the sample, i.e. p < 1/N and p > (N − 1)/N.
>     Choices include returning an error value, computing linear
>     extrapolation, or assuming a constant value.
> 
> The current choice is clipping (assuming a constant value), but this
> could be modified.
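
The three choices in the quoted passage can be compared side by side. This
is a minimal sketch, assuming Weibull plotting positions h = (n + 1) * p;
`weibull_quantile` and its `tail` parameter are hypothetical and do not
correspond to any NumPy keyword.

```python
import math


def weibull_quantile(data, p, tail="clip"):
    """Weibull-position quantile with a selectable tail policy (sketch)."""
    if tail not in ("clip", "extrapolate", "nan"):
        raise ValueError(f"unknown tail policy: {tail!r}")
    x = sorted(data)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    h = (n + 1) * p  # 1-based position; may fall outside [1, n] in the tails
    if h < 1.0 or h > n:
        if tail == "nan":
            return math.nan  # "returning an error value"
        if tail == "clip":
            h = min(max(h, 1.0), float(n))  # "assuming a constant value"
        # tail == "extrapolate": keep h and extend the end segment linearly
    j = min(max(math.floor(h), 1), n - 1)  # segment between x_j and x_(j+1)
    g = h - j  # lies outside [0, 1] only when extrapolating
    return x[j - 1] + g * (x[j] - x[j - 1])
```

For the sample 10, 20, 30, 40 and p = 0.1, clipping returns the minimum
(10) while linear extrapolation returns 5, a value below every observation,
which is one reason a package might prefer the constant-value choice.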
> 
> 
> Any feedback is appreciated!  Otherwise, this will probably move
> forward in the current state for the next release.
> 
> Cheers,
> 
> Sebastian
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: sebast...@sipsolutions.net





[Numpy-discussion] NumPy 1.21.4 release.

2021-11-04 Thread Charles R Harris
Hi All,

On behalf of the NumPy team I am pleased to announce the release of NumPy
1.21.4. NumPy 1.21.4 is a maintenance release that fixes a few bugs
discovered after 1.21.3. The most important fix here is for the NumPy
header files to make them work for both x86_64 and M1 hardware when
included in the Mac universal2 wheels. Previously, the header files only
worked for M1 and this caused problems for folks building x86_64
extensions. This problem was not seen before Python 3.10 because there were
thin wheels for x86_64 that had precedence. This release also provides thin
x86_64 Mac wheels for Python 3.10. Note that there are *no* manylinux1
wheels because the needed CentOS 5 repos have gone missing. Make sure you upgrade
your pip version if you have problems downloading the correct wheels.

The Python versions supported in this release are 3.7-3.10. If you want to
compile your own version using gcc-11 you will need to use gcc-11.2+ to
avoid problems. Wheels can be downloaded from PyPI; source archives,
release notes, and wheel hashes are available on GitHub. Linux users will
need pip >= 19.3 in order to install manylinux2010 and manylinux2014
wheels. A recent version of pip is needed to install the universal2
macOS wheels.


*Contributors*

A total of 7 people contributed to this release.  People with a "+" by their
names contributed a patch for the first time.

   - Bas van Beek
   - Charles Harris
   - Isuru Fernando
   - Matthew Brett
   - Sayed Adel
   - Sebastian Berg
   - 傅立业(Chris Fu) +



*Pull requests merged*

A total of 9 pull requests were merged for this release.

   - #20278: BUG: Fix shadowed reference of `dtype` in type stub
   - #20293: BUG: Fix headers for universal2 builds
   - #20294: BUG: ``VOID_nonzero`` could sometimes mutate alignment flag
   - #20295: BUG: Do not use nonzero fastpath on unaligned arrays
   - #20296: BUG: Distutils patch to allow for 2 as a minor version (!)
   - #20297: BUG, SIMD: Fix 64-bit/8-bit integer division by a scalar
   - #20298: BUG, SIMD: Workaround broadcasting SIMD 64-bit integers on
   MSVC...
   - #20300: REL: Prepare for the NumPy 1.21.4 release.
   - #20302: TST: Fix a ``Arrayterator`` typing test failure

Cheers,

Charles Harris