[Numpy-discussion] Re: ANN: NumExpr 2.8.6 Released

2023-09-13 Thread Jerome Kieffer
On Tue, 12 Sep 2023 14:56:08 -0700
Robert McLeod  wrote:

> Hi everyone,
> 
> NumExpr 2.8.6 is a release to deal with issues related to downstream
> `pandas`
> where the sanitization blacklist was hitting private variables used in their
> evaluate. In addition the sanitization was hitting on scientific notation.

I believe this sanitization is a bit too strong, since Avogadro's
number is no longer accepted by numexpr:
numexpr.evaluate("6.02e23")
nor is the charge of the electron:
numexpr.evaluate("1.6e-19")
...

I would call this a bug !
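For illustration only (this is not numexpr's actual sanitization code): a number-matching pattern that forgets the optional exponent part will reject scientific notation, which is presumably the kind of oversight behind this bug:

```python
import re

# A naive "number" pattern without an exponent part rejects "6.02e23":
broken_number = re.compile(r"^\d+(\.\d*)?$")
assert not broken_number.match("6.02e23")

# Adding an optional exponent accepts scientific notation again:
fixed_number = re.compile(r"^\d+(\.\d*)?([eE][+-]?\d+)?$")
assert fixed_number.match("6.02e23")
assert fixed_number.match("1.6e-19")
```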

> For those who do not wish to have sanitization on by default, it can
> be changed
> by setting an environment variable, `NUMEXPR_SANITIZE=0`.

Noted ...

-- 
Jérôme Kieffer
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Change in numpy.percentile

2023-10-08 Thread Jerome Kieffer
On Fri, 06 Oct 2023 19:17:22 -
norbertpiotraduc...@gmail.com wrote:

> Hi, 
> I have an idea for a change to numpy.percentile. numpy.percentile and 
> numpy.nanpercentile are essentially the same feature; the only difference is 
> that numpy.nanpercentile ignores NaN values. Wouldn't it be easier if 
> numpy.percentile took an argument specifying whether NaN values should be 
> considered? It would certainly be easier for people who are starting their 
> adventure with the library.

I do like this idea. +1

-- 
Jérôme Kieffer


[Numpy-discussion] Re: Improved 2DFFT Approach

2024-03-12 Thread Jerome Kieffer
On Tue, 12 Mar 2024 11:34:40 -
via NumPy-Discussion  wrote:

> https://github.com/2D-FFT-Project/2d-fft/blob/main/notebooks/comparisons.ipynb

Hi,

Since you are using a notebook to perform the benchmark, I would advise you to
use:
```
timing = %timeit -o some_function(*args)
```

because you are currently using `time.time()`, which is not very
precise. `time.perf_counter()` would be better, but `timeit` is best.
It also runs the code for longer, giving more stable values (it lets the
speed of the CPU stabilize).
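Outside a notebook, the standard-library `timeit` module gives the same benefit as the `%timeit` magic. A minimal sketch, with `some_function` as a stand-in workload (not from the linked notebook):

```python
import timeit

def some_function(n):
    # Stand-in workload for the function being benchmarked.
    return sum(i * i for i in range(n))

# repeat() runs the statement many times and reports per-run timings,
# which is far more robust than a single time.time() delta.
timings = timeit.repeat(lambda: some_function(1_000), number=50, repeat=5)
best = min(timings) / 50   # best per-call time, in seconds
```

Reporting the minimum over several repeats filters out interference from other processes, which a single `time.time()` measurement cannot do.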

That said, I did not measure your code to be faster than the numpy or scipy
implementations ... even though you are using some ill-advised practices
like "-O3" and "-ffast-math" by default.

This brings me back 20 years, to when I played with the Intel
compiler: it was much faster (~ +20% vs gcc) per iteration, but
surprisingly it also needed +50% more iterations to achieve
a given precision. I remember losing 6 months of work because of these
options. One should only activate those options on a limited piece of
code where they are known to be NOT harmful.

Cheers,
-- 
Jérôme Kieffer


Re: [Numpy-discussion] Generalized rectangle intersection. (Was: Array blitting)

2017-07-10 Thread Jerome Kieffer
On Sun, 9 Jul 2017 23:35:58 +0200
Mikhail V  wrote:

> disclaimer: I am not a past contributor to numpy and I don't know
> much about github, and what pull request means. So I just put the
> examples here.
> 
> So in short, the proposal idea is to add a library function which
> calculates the intersection
> area of two rectangles, generalized for any dimensions.

I use this kind of clipping as well, but in the case you are
suggesting the boxes look aligned to the axes, which is limiting for me.
The general case is much more complicated (and more interesting to me :)

Moreover, scikit-image may be a more suitable home for this algorithm.
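For the axis-aligned case the proposal describes, a minimal pure-Python sketch (the function name and the (lower_corner, upper_corner) box representation are my own choices, not an existing numpy API):

```python
def box_intersection_volume(a, b):
    """Intersection volume of two axis-aligned boxes in any dimension.

    Each box is given as (lower_corner, upper_corner), e.g. in 2D:
    ((x0, y0), (x1, y1)).  Returns 0.0 when the boxes do not overlap.
    """
    volume = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a[0], a[1], b[0], b[1]):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:
            return 0.0
        volume *= overlap
    return volume

# 2D: unit squares offset by (0.5, 0.5) overlap on a 0.5 x 0.5 patch.
assert box_intersection_volume(((0, 0), (1, 1)), ((0.5, 0.5), (1.5, 1.5))) == 0.25
# 3D, disjoint boxes:
assert box_intersection_volume(((0, 0, 0), (1, 1, 1)), ((2, 2, 2), (3, 3, 3))) == 0.0
```

The per-axis `min(hi) - max(lo)` trick is what generalizes to any dimension; rotated boxes (the general case mentioned above) need a different approach entirely.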

Cheers, 

-- 
Jérôme Kieffer
tel +33 476 882 445
___
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Direct GPU support on NumPy

2018-01-02 Thread Jerome Kieffer
On Tue, 02 Jan 2018 15:37:16 +
Yasunori Endo  wrote:

> If the reason is just about human resources,
> I'd like to try implementing GPU support on my NumPy fork.
> My goal is to create standard NumPy interface which supports
> both CUDA and OpenCL, and more devices if available.

I think this initiative already exists ... something which merges the
approaches of CUDA and OpenCL, but I have no idea of the momentum behind
it.

> Are there other reason not to support GPU on NumPy?

Yes. Matlab has such support, and the performance gain is on the order
of 2x, versus 10x when addressing the GPU directly: all the time is spent
sending data back & forth. Numba is indeed a good candidate, but it is limited
to PTX assembly (i.e. CUDA, hence Nvidia hardware).

Cheers,

Jerome


Re: [Numpy-discussion] Changing the return type of np.histogramdd

2018-04-10 Thread Jerome Kieffer
> Either we should:
> 
>1. Just change it, and hope no one is broken by it
>2. Add a dtype argument:
>   - If dtype=None, behave like np.histogram
>   - If dtype is not specified, emit a future warning recommending to
>   use dtype=None or dtype=float
>   - In future, change the default to None
>3. Create a new better-named function histogram_nd, which can also be
>created without the mistake that is
>https://github.com/numpy/numpy/issues/10864.
> 
> Thoughts?

I like option 2.

By the way, we (@ESRF) have re-developed histogram and
histogram_nd many times in various projects, in order to get better
consistency on the one hand and better performance on the other
(re-written in C or C++).

I have noticed a substantial performance gain in numpy over the last years,
but I did not check consistency. The issue is that every bin should be
an interval open on the right-hand side, which causes stability issues,
since the smallest value greater than the max depends on the
input dtype.
For example, the smallest value greater than 10 is 11
for integers, but about 10.000001 in float32 and 10.000000000000002 in float64.
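This dtype dependence can be checked with `math.nextafter` (float64) and a small bit-twiddling helper for float32. The helper is illustrative only; with numpy one would call `np.nextafter` with the appropriate dtype:

```python
import math
import struct

def next_up_float32(x):
    """Smallest float32 strictly greater than x (for x > 0), via its bit pattern."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    (y,) = struct.unpack('<f', struct.pack('<I', bits + 1))
    return y

# The right-hand edge of the last histogram bin must sit "just above" the max,
# and how far "just above" is depends on the dtype:
print(next_up_float32(10.0))            # ~10.000001
print(math.nextafter(10.0, math.inf))   # 10.000000000000002
```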

Cheers,
-- 
Jérôme Kieffer



Re: [Numpy-discussion] Adding a hex version like PY_VERSION_HEX

2018-10-05 Thread Jerome Kieffer
On Fri, 5 Oct 2018 11:31:20 +0300
Matti Picus  wrote:

> In PR 12074 https://github.com/numpy/numpy/pull/12074 I propose adding a 
> function `version.get_numpy_version_as_hex()` which returns a hex value 
> to represent the current NumPy version MAJOR.MINOR.MICRO where
> 
> v = hex(MAJOR << 24 | MINOR << 16 | MICRO)

+1

We use it in our code and it is good practice, much better than comparing version strings, where "0.9.0" > "0.10.0" !

We added some support for dev, alpha, beta, RC and final versions in 
https://github.com/silx-kit/silx/blob/master/version.py
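A minimal sketch of the hex-version scheme from the PR (the function name here is made up for illustration):

```python
def version_hex(major, minor, micro):
    """Pack a MAJOR.MINOR.MICRO version into a single comparable integer."""
    return major << 24 | minor << 16 | micro

# String comparison gets version ordering wrong; integer comparison does not:
assert "0.9.0" > "0.10.0"                            # lexicographic, misleading
assert version_hex(0, 9, 0) < version_hex(0, 10, 0)  # correct ordering
print(hex(version_hex(1, 21, 3)))                    # 0x1150003
```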

Cheers,
-- 
Jérôme Kieffer


Re: [Numpy-discussion] NEP 38 - Universal SIMD intrinsics

2020-02-13 Thread Jerome Kieffer
On Wed, 12 Feb 2020 19:36:10 +
"Devulapalli, Raghuveer"  wrote:


> j. _mm512_sqrt_ps/pd (I could be wrong on this one, but from the little 
> google search I did, it seems like power ISA doesn’t have a vectorized sqrt 
> instruction)

Hi,
Starting with Power7 (we are now at Power9), sqrt is available in both single
and double precision:

https://www.ibm.com/support/knowledgecenter/SSGH2K_12.1.0/com.ibm.xlc121.aix.doc/compiler_ref/vec_sqrt.html

Cheers,

-- 
Jérôme Kieffer
tel +33 476 882 445


Re: [Numpy-discussion] Comment published in Nature Astronomy about The ecological impact of computing with Python

2020-11-24 Thread Jerome Kieffer
Hi Pierre,

I agree with your point of view: the author wants to demonstrate that C++
and Fortran are better than Python... and, environmentally speaking, he
has some evidence.

We develop with Python, Cython, Numpy, and OpenCL, and what annoys me
most is the compilation time needed during the development of those
statically typed, ahead-of-time compiled extensions (C++, C, Fortran).

Clearly the author wants his article to go viral, and in a sense he
managed :). But he did not mention Julia / Numba and other JIT-compiled
languages (including Matlab?) that probably outperform
C++ / Fortran once development time and test time are taken into account.
Besides this, the OpenMP parallelism (implicitly advertised) is far from
scaling well on multi-socket systems, and other programming paradigms
are needed to extract the best performance from supercomputers.

Cheers,

Jerome 



Re: [Numpy-discussion] Showing by examples how Python-Numpy can be efficient even for computationally intensive tasks

2020-11-26 Thread Jerome Kieffer
On Thu, 26 Nov 2020 22:14:40 +0100 (CET)
PIERRE AUGIER  wrote:

> I changed the email subject because I'd like to focus less on CO2 (a very 
> interesting subject, but not my focus here) and more on computing...
> 

Hi Pierre,

We may turn the problem around: one should focus more on
the algorithm than on the programming language.

I would like to share one example with you, where we published how to
speed up crystallographic computation written in Python:
https://onlinelibrary.wiley.com/iucr/doi/10.1107/S1600576719008471

One referee asked us to validate against equivalent C and Fortran code.
The C code was as fast as Pythran or Cython, and Fortran was still faster
(the std of the Fortran-compiled runtime was much smaller, which allows
Fortran to be faster by 3 std !)

But I consider the difference to be marginal at this level !

If one considers "Moore's law", i.e. the time needed for
"performance" to double in different aspects of computing, one gets
18 to 24 months for the number of transistors in a processor, 18 years
for compilers, and 2 years (on average) for the development of new
algorithms. In this sense one should focus more on the algorithm used.

Table 1 of the article is especially interesting: pure Python is
10x slower than proper Numpy code, and parallel Pythran is 50x faster
than Numpy (on the given computer), but using the proper algorithm, i.e.
FFT in this case, is 13000x faster !

So I believe that Python, with its expressivity, helps a lot in
understanding the algorithm and hence in designing faster code.
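As a toy demonstration of "the algorithm beats the language": both functions below compute the same discrete Fourier transform in pure Python, yet one is O(n^2) and the other O(n log n). This is a sketch of the general idea, not the article's code:

```python
import cmath

def dft_naive(x):
    """Direct O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def fft(x):
    """Radix-2 Cooley-Tukey FFT, O(n log n); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) * odd[k] for k in range(n // 2)]
    return ([even[k] + twiddle[k] for k in range(n // 2)] +
            [even[k] - twiddle[k] for k in range(n // 2)])

# Both give the same transform, up to floating-point error:
x = [1.0, 2.0, 0.0, -1.0, 1.5, 0.5, -0.5, 2.5]
assert all(abs(a - b) < 1e-9 for a, b in zip(fft(x), dft_naive(x)))
```

The asymptotic gap (n^2 vs n log n) dwarfs any constant-factor gain a compiled language can offer, which is the point of Table 1.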

Cheers,

Jerome


Re: [Numpy-discussion] Numpy 1.20.1 availability

2021-03-15 Thread Jerome Kieffer
On Sun, 14 Mar 2021 10:14:13 +
Peter Cock  wrote:

> I'm impressed to see 17 million conda-forge numpy downloads, vs
> 'just' 2.5 million downloads of the default channel's package:

I doubt the download figures from conda are correct ...

A couple of days after my software package entered "conda-forge", its
metric was already 2 orders of magnitude larger than any other distribution
route: pip, debian packages, ... Since I know the approximate size of the
community, I have some doubts about the figures.

I suspect downloads for CI are all counted and none are cached, ...

Cheers,

Jerome


Re: [Numpy-discussion] Floating point precision expectations in NumPy

2021-08-18 Thread Jerome Kieffer
I strongly agree with you, Gregor:
* Best precision should remain the default. I lost months finding
  the compiler option (in ICC) that switched to LA mode and broke all
  my calculations.
* I wonder how SVML behaves on non-Intel platforms? Sleef
  provides the same approach, but it also works on Power and ARM
  platforms (and is designed to be extended...).

Cheers,

Jerome


On Wed, 28 Jul 2021 12:13:44 +0200
Gregor Thalhammer  wrote:

> > Am 28.07.2021 um 01:50 schrieb Sebastian Berg :
> > 
> > Hi all,
> > 
> > there is a proposal to add some Intel specific fast math routine to
> > NumPy:
> > 
> >https://github.com/numpy/numpy/pull/19478  
> 
> Many years ago I wrote a package
> https://github.com/geggo/uvml
> that makes the VML, a fast implementation of transcendental math functions, 
> available for numpy. Don’t know if it still compiles.
> It uses Intel VML, designed for processing arrays, not the SVML intrinsics. 
> By this it is less machine dependent (optimized implementations are selected 
> automatically depending on the availability of, e.g., SSE, AVX, or AVX512), 
> just link to a library. It compiles as an external module, can be activated 
> at runtime. 
> 
> Different precision models can be selected at runtime (globally). I think 
> Intel advocates using the LA (low accuracy) mode as a good compromise 
> between performance and accuracy. Different people have strongly diverging 
> opinions about what to expect.
> 
> The speedups possibly gained by these approaches often vaporize in 
> non-benchmark applications, as for those functions performance is often 
> limited by memory bandwidth, unless all your data stays in CPU cache. By 
> default I would go for high accuracy mode, with option to switch to low 
> accuracy if one urgently needs the better performance. But then one should 
> use different approaches for speeding up numpy.
> 
> Gregor
> 
> 
> > 
> > part of numerical algorithms is that there is always a speed vs.
> > precision trade-off, giving a more precise result is slower.
> > 
> > So there is a question what the general precision expectation should be
> > in NumPy.  And how much is it acceptable to diverge in the
> > precision/speed trade-off depending on CPU/system?
> > 
> > I doubt we can formulate very clear rules here, but any input on what
> > precision you would expect or trade-offs seem acceptable would be
> > appreciated!
> > 
> > 
> > Some more details
> > -
> > 
> > This is mainly interesting e.g. for functions like logarithms,
> > trigonometric functions, or cubic roots.
> > 
> > Some basic functions (multiplication, addition) are correct as per IEEE
> > standard and give the best possible result, but these are typically
> > only correct within very small numerical errors.
> > 
> > This is typically measured as "ULP":
> > 
> > https://en.wikipedia.org/wiki/Unit_in_the_last_place
> > 
> > where 0.5 ULP would be the best possible result.
> > 
> > 
> > Merging the PR may mean relaxing the current precision slightly in some
> > places.  In general Intel advertises 4 ULP of precision (although the
> > actual precision for most functions seems better).
> > 
> > 
> > Here are two tables, one from glibc and one for the Intel functions:
> > 
> > https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html
> > (Mainly the LA column) 
> > https://software.intel.com/content/www/us/en/develop/documentation/onemkl-vmperfdata/top/real-functions/measured-accuracy-of-all-real-vm-functions.html
> > 
> > 
> > Different implementation give different accuracy, but formulating some
> > guidelines/expectation (or referencing them) would be useful guidance. 
> > 
> > For basic 
> > 


-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: Intel's Short Vector Math Library has been merged as a git submodule

2021-10-11 Thread Jerome Kieffer
On Mon, 11 Oct 2021 18:04:58 +0300
Matti Picus  wrote:

> As SciPy already found out, some downstream libraries may need to tweak 
> their tolerances for some functions as a result of this PR. We wanted to 
> put it in early enough in the release cycle so that we can back it out 
> fully or partially if the accuracy degradation is too large, so please 
> speak up if you notice anything strange.

Thanks for the advance warning... now we need to find some computers to
test those versions. Do you know if it works "the same" with AVX2?
Most computers have AVX2, while for now you need the latest servers
to test AVX512.

Cheers,

Jerome


[Numpy-discussion] Re: deprecating numpy.distutils

2022-01-17 Thread Jerome Kieffer
Hi,

I should mention that many of our software projects (fabio, pyFAI, ...) rely
on silx for the nested setup.py part, and not all of them are mentioned in
this list. In the absence of an alternative solution, we cap the version of
setuptools for now, but we are willing to help in finding a solution (our
complete build stack would otherwise collapse).

The parallel build feature is very much appreciated, because the
serial compilation of our projects takes dozens of minutes, and our
computers are HPC nodes with plenty of cores...

I find it funny to first deprecate ... and then look for an
alternative solution.

If I understand properly, your idea is to port the nested setup.py part
from numpy.distutils to setuptools? Are the setuptools people already in the
loop?

Cheers,

Jerome

On Mon, 17 Jan 2022 10:57:40 +0100
Ralf Gommers  wrote:

> On Sun, Jan 16, 2022 at 10:40 PM Sandro Tosi  wrote:
> 
> > > 4. Write a docs page with migration info on how to deal with the  
> > deprecation, and update it before the 1.23.0 release.
> >
> > i think this will be extremely important. Just to give you the current
> > POV of debian projects using numpy.distutils:
> >
> > -
> > https://codesearch.debian.net/search?q=numpy.distutils+-package%3Anumpy&literal=0&perpkg=1
> > (there are 74 of them)
> > -
> > https://codesearch.debian.net/search?q=from+numpy+import+.*distutils&literal=0
> > (only one "from numpy import distutils")
> >
> > hope this could help you come up with a smooth deprecation process :)
> >  
> 
> Thanks Sandro, that is helpful. I had a look at all those 15 pages of
> search results:
> 
> PyPy: just a few patches for numpy <1.11.2 which can be dropped
> Silx: uses Configuration (i.e., nested setup.py's)
> Electrum: just a few patches for numpy <1.11.2 which can be dropped
> Setuptools: just a few patches for numpy <1.11.2 which can be dropped
> python-hdf4: nothing special, can be switched to setuptools
> pybigwig: uses `get_info('npymath')` to (optionally) build against numpy -
> this may need to be added to setuptools
> python-pot: uses private numpy.distutils compiler support; can be switched
> to setuptools or vendored
> petsc4py: small setup.py, but uses Fortran support and Configuration -
> should probably switch build system
> basemap: uses Extension, can just switch to setuptools
> openexr: just queries numpy include dirs in an incorrect fashion, should
> use `numpy.get_include`
> rocketcea: nothing special, can just switch to setuptools
> brian: no usage, just a string `numpy.distutils` in docs
> libvigraimpex: just queries numpy include dirs in an incorrect fashion,
> should use `numpy.get_include`
> gpaw: some weird monkeypatching in shell scripts of numpy's BLAS/LAPACK
> support, can probably just be dropped
> pyferret: uses Extension, can just switch to setuptools
> adolc: just queries numpy include dirs in an incorrect fashion, should use
> `numpy.get_include`
> theano: uses BLAS support, they should probably vendor `system_info.py` (or
> switch build systems)
> scikit-image: uses Configuration
> scikit-ffm: uses Configuration, but only two setup.py files - easy to
> consolidate in a single setup.py and switch to setuptools
> pywavelets: no longer uses numpy.distutils
> pybind11: just a code comment, no numpy.distutils usage
> sfepy: uses Configuration
> nibabel: only one usage in a test case, just drop it
> versioneer: just a mention in the docs
> cypari: just a patch for numpy < 1.11.2, can be dropped
> blender: just a patch that can be dropped
> scikit-learn: uses Configuration and build_ext modification for parallel
> compilation
> nipype: minor usage of Configuration, can be dropped
> guiqwt: uses Extension but no longer has Fortran code, so can switch to
> setuptools
> 
> 
> The summary of this is: if Configuration is added to setuptools, 90% of
> packages are fine switching to setuptools. Scikit-learn is planning to
> follow SciPy and move to Meson. Theano is the only real consumer of
> numpy.distutils BLAS/LAPACK support, which they should vendor if they don't
> want to switch build systems. There are no other serious issues here.
> 
> I'm sure there are some more packages out there relying on Fortran support
> in particular, they are best off moving to Meson or CMake.
> 
> Cheers,
> Ralf


[Numpy-discussion] Re: Performance mystery

2022-01-19 Thread Jerome Kieffer
On Wed, 19 Jan 2022 19:48:32 +0100
Francesc Alted  wrote:

> What puzzles me is that the timeit loops access `z` data 3*1
> times, which is plenty of time for doing the allocation (it should
> require just a single iteration).

Hi all,

Very interesting discussion ... IIRC, timeit does some clever tricks
like disabling the garbage collector, so on the first run of timeit I
suspect it does 1 malloc (without an associated free, and thus without the
ability to recycle any of those buffers).
On the second run, those buffers were allocated previously and kept by
Python for re-use, which could explain the faster run.

This does not really explain the observed difference between operating systems.

Cheers,

Jerome


[Numpy-discussion] Future of numpy.distutils

2022-06-10 Thread Jerome Kieffer
Dear Numpy developers,

We are developing a set of scientific tools
(https://github.com/silx-kit) and all our build infrastructure is based
on `numpy.distutils`, which is apparently going to disappear in the
coming years. Besides us, the `scipy` project was using it ...

Ralf Gommers has ported `scipy` to build with `meson-python`, and there
are apparently some sharp edges remaining, especially on Windows and
macOS. I wonder if you can comment on the sustainability of this
approach, or whether you would advise another build tool for us.

Thanks for your insights

Cheers,

Jerome


[Numpy-discussion] Re: Future of numpy.distutils

2022-06-10 Thread Jerome Kieffer
Thanks a lot for your answers,

I can confirm that migrating a small project (still including some
Cython) took less than a day to bootstrap.
Not everything works, but the structure is there and it kind of works.

On Fri, 10 Jun 2022 13:42:44 +0200
Ralf Gommers  wrote:

> On Fri, Jun 10, 2022 at 10:53 AM Matthew Brett 
> wrote:
> 
> > Hi,
> >
> > On Fri, Jun 10, 2022 at 9:41 AM Jerome Kieffer 
> > wrote:  
> > >
> > > Dear Numpy developpers,
> > >
> > > We are developing a set of scientific tools
> > > (https://github.com/silx-kit) and all our build infrastructure is based
> > > on `numpy.distutils` which apparently is going to disappear in the
> > > coming years. Beside us, the `scipy` project was using it ...
> > >
> > > Ralf Gommers has ported `scipy` to build with `meson-python` and there
> > > are apparently some sharp edges remaining, especially under windows and
> > > macos. I wonder if you can comment on the sustainability of this
> > > approach or if you would advice us another build tool.  
> >
> > I am sure that Ralf will say more, but I did some of the work getting
> > the macOS and Windows builds working with Meson.   I am not sure what
> > sharp edges you are thinking of.  As you can imagine, the macOS
> > implementation was pretty straightforward, the Windows one less so,
> > but for the usual reasons, of differences between the MSVC compiler
> > and gcc toolchains.  But even there, it turned out that the modern
> > Windows gcc toolchains were up to the task, so we had to make
> > relatively few changes.   And, as I'm sure you know, Scipy has
> > relatively complex build requirements.
> >  
> 
> There are issues left, tracked under
> https://github.com/scipy/scipy/issues?q=is%3Aopen+is%3Aissue+label%3AMeson.
> None of those are fundamental, they are "just work". Some are not urgent
> (like gcov integration), other will be fixed for the SciPy 1.9.0 release
> over the next month or so. I have advised other projects who may be
> interested to wait till mid-July, because by then we should have SciPy
> 1.9.0 out and have surfaced any remaining potential issues. But you can
> start now, there won't be anything unsolvable.

I took a lot of inspiration from your work on scipy ...

> Most of the difficulties are unrelated to Meson (which is vastly superior
> to distutils/setuptools) itself, but more to:
> 1. Fortran on Windows
> 2. Cython not playing well with out-of-place builds
> 3. The `pyproject.toml` based integration (PEP 517, 518, 621) with Pip &
> co, and that whole design being a little immature
>
> You don't seem to have Fortran code in your project as far as I can tell,
> which relieves you of the biggest headache. Cython support will improve, I
> think the main thing we need is https://github.com/cython/cython/pull/4548
> (a fairly simple patch).

Indeed, no Fortran on our side, only C/C++/OpenCL/CUDA, that's enough !

> NumPy itself should move to Meson as well. The main thing in
> `numpy.distutils` that NumPy uses but SciPy doesn't - and hence needs work
> - is SIMD support. I expect we need some improvements in Meson itself,
> what's in https://mesonbuild.com/Simd-module.html#unstable-simd-module is
> not enough. But also there, I don't expect any hard blockers.
> 

We are lucky, since the SIMD stuff is handled at a low level by the
JIT compilation for GPUs.

> > So my guess is that you won't have much trouble getting the Mac and
> > Windows builds working once you've ported your Linux builds to Meson.
> >  I'd be happy to help with any problems you do run into.
> >  
> 
> +1 same here

Thanks a lot for paving the road.


Cheers,

Jerome
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Future of numpy.distutils

2022-06-13 Thread Jerome Kieffer
Hi Evgeni,

Thanks for your input; apparently your project uses `meson-pep517`,
while scipy uses `meson-python`, for interfacing meson with the Python
side of the build.

For now, I am not settled on one or the other, but in the
Python community there should be one, and only one, obvious way to do
things ...

Meson does its job properly: the binary extensions are properly
compiled, but neither for my code nor for scipy are those shared libraries
shipped within the wheel.
There are several warnings like this (and many more when building scipy):

WARNING Using heuristics to map files to wheel, this may result in incorrect 
locations
WARNING File could not be mapped to an equivalent wheel directory: 
/tmp/build-via-sdist-jsg87qg9/fabio-0.15.0-a0/.mesonpy-5bwmoite/install/usr/lib/python3/dist-packages/fabio/ext/cf_io.cpython-39-x86_64-linux-gnu.so
 ({moduledir_shared}/cf_io.cpython-39-x86_64-linux-gnu.so)

Is this bug fixed in any "unreleased" version of `meson-python`, which
would explain why it works for some people but not for me?
The simplest is probably to open a bug report there.

Cheers,

Jerome


[Numpy-discussion] Re: Feature request: function to get minimum and maximum values simultaneously (as a tuple)

2022-07-01 Thread Jerome Kieffer
Hi,

There is an implementation in silx.math.combo.minmax():
https://github.com/silx-kit/silx/blob/master/src/silx/math/combo.pyx#L266
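The linked silx implementation is Cython; the idea reduces to a single pass over the data that updates both bounds at once, sketched here in pure Python (the function name and signature are my own, not silx's):

```python
def minmax(values):
    """Return (min, max) of an iterable in a single pass."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax() of an empty iterable") from None
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi

assert minmax([3, -1, 7, 0]) == (-1, 7)
```

A single pass reads the data once instead of twice, which is where the gain comes from for arrays too large for the CPU cache.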

Cheers

Jerome

On Thu, 30 Jun 2022 22:50:12 +0200
"Andras Deak"  wrote:

> On Thu, Jun 30, 2022, at 22:23, Ewout ter Hoeven wrote:
> > A function to get the minimum and maximum values of an array 
> > simultaneously could be very useful, from both a convenience and 
> > performance point of view. Especially when arrays get larger the 
> > performance benefit could be significant, and even more if the array 
> > doesn't fit in L2/L3 cache or even memory.  
> 
> Hi,
> 
> There's an open issue asking for this feature: 
> https://github.com/numpy/numpy/issues/9836
> 
> András
> 
> >
> > There are many cases where not just the minimum or the maximum of an 
> > array is required, but both. Think of clipping an array, getting its 
> > range, checking for outliers, normalizing, making a plot like a 
> > histogram, etc.
> >
> > This function could be called aminmax() for example, and also be called 
> > like ndarray.minmax(). It should return a tuple (min, max) with the 
> > minimum and maximum values of the array, identical to calling 
> > (ndarray.min(), ndarray.max()).
> >
> > With such a function, numpy.ptp() and the special cases of 
> > numpy.quantile(a, q=[0,1]) and numpy.percentile(a, q=[0,100]) could 
> > also potentially be speeded up, among others.
> >
> > Potentially argmin and argmax could get the same treatment, being 
> > called argminmax().
> >
> > There is also a very extensive post on Stack Overflow (a bit old 
> > already) with discussion and benchmarks: 
> > https://stackoverflow.com/questions/12200580/numpy-function-for-simultaneous-max-and-min


[Numpy-discussion] Re: Passing output array to bincount

2022-10-22 Thread Jerome Kieffer
On Thu, 20 Oct 2022 23:26:37 -
ntess...@pm.me wrote:

> As far as I can see, there is no equivalent numpy functionality. In fact, as 
> far as I'm aware, there isn't any fast alternative outside of 
> C/Cython/numba/..

We have cumulative histograms in silx ... and found them useful. Maybe it would
be worth having this in numpy.
https://github.com/silx-kit/silx/blob/master/src/silx/math/chistogramnd.pyx#L110
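As an aside, a partial workaround with existing NumPy is `np.add.at`, an unbuffered in-place accumulation that, unlike `bincount`, writes into a caller-provided output array (slower than `bincount`, but it reuses the buffer):

```python
import numpy as np

values = np.array([0, 1, 1, 3, 3, 3])
out = np.zeros(5, dtype=np.intp)   # pre-allocated output buffer

# Unbuffered in-place add: repeated indices accumulate correctly,
# unlike plain fancy-index assignment out[values] += 1.
np.add.at(out, values, 1)

print(out)                               # [1 2 0 3 0]
print(np.bincount(values, minlength=5))  # [1 2 0 3 0] -- same counts
```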

Cheers,

Jerome


[Numpy-discussion] Re: 1D ndarray to java double[]

2023-01-01 Thread Jerome Kieffer
On Sat, 31 Dec 2022 23:45:54 -0800
Bill Ross  wrote:

> How best to write a 1D ndarray as a block of doubles, for reading in
> java as double[] or a stream of double? 
> 
> Maybe the performance of simple looping over doubles in python.write()
> and java.read() is fine, but maybe there are representational diffs?
> Maybe there's a better solution for the use case?  

Java is known to be big-endian ... but your CPU is probably little-endian.
Numpy has the tools to represent an array of double BE. 
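For instance, a minimal sketch (assuming the Java side reads the stream with `DataInputStream.readDouble`) is to cast to the big-endian float64 dtype `'>f8'` before writing:

```python
import struct
import numpy as np

a = np.array([1.0, 2.0, 3.0])

# '>f8' is big-endian IEEE 754 double, i.e. the format Java's
# DataInput/DataOutput use for double.
payload = a.astype('>f8').tobytes()

# Round-trip check with the stdlib: '>3d' also means 3 big-endian doubles.
assert struct.unpack('>3d', payload) == (1.0, 2.0, 3.0)

# To a file that Java can stream with DataInputStream.readDouble():
# a.astype('>f8').tofile('matrix.bin')
```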

> Use case: I get the ndarray from keras, and it represents a 2D distance
> matrix. I want to find the top-50 matches for each item, per row and
> column. I'm looking at moving the top-50 task to java for its superior
> parallel threading. (Java doesn't fork processes with a copy of the
> array, which is ~5% of memory; rather one gets 1 process with e.g. 1475%
> CPU.) 

What about numba or cython then?

Happy new year

Jerome


[Numpy-discussion] Re: 1D ndarray to java double[]

2023-01-01 Thread Jerome Kieffer
On Sun, 01 Jan 2023 05:31:55 -0800
Bill Ross  wrote:

> Thanks! 
> 
> > Java is known to be big-endian ... your CPU is probably little-endian.   
> 
> $ lscpu | grep -i endian
> Byte Order: Little Endian 
> 
> > Numpy has the tools to represent an array of double BE.   
> 
> Is there a lower-level ndarray method that writes an array that could be
> used this way? 

One example:

numpy.array([1,2,3], dtype=">d").tobytes()
b'?\xf0\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x00@\x08\x00\x00\x00\x00\x00\x00'

numpy.array([1,2,3], dtype="<d").tobytes()
b'\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@'


[Numpy-discussion] Re: Precision changes to sin/cos in the next release?

2023-05-30 Thread Jerome Kieffer
Hi Sebastian,

I had a quick look at the PR and it looks like you re-implemented the
sin/cos functions using SIMD.
I wonder how it compares with SLEEF (a header-only library with
CPU-architecture-agnostic SIMD implementations of transcendental
functions, with precision validation). SLEEF is close to the Intel SVML
library in spirit but extended to multiple architectures (tested on
PowerPC and ARM, for example).
This is just curiosity ...

Like Juan, I am afraid of this change, since my code, which depends on
numpy's sin/cos for rotations, is likely to see a large change of
behavior.
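One way to probe the behavior of a given NumPy build is to check how close `sin` gets to 1.0 around pi/2 (the observed maximum depends on the build and CPU, so this is a probe, not a pass/fail test):

```python
import numpy as np

# Sample densely around pi/2, where sin is essentially 1.0.
x = np.linspace(np.pi / 2 - 1e-3, np.pi / 2 + 1e-3, 100_001)
s = np.sin(x)

eps = np.finfo(np.float64).eps
print("max sin      :", s.max())
print("ULPs below 1 :", (1.0 - s.max()) / eps)  # 0.0 on builds that reach 1.0
```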

Cheers,

Jerome

On Wed, 31 May 2023 07:55:34 +0200
Sebastian Berg  wrote:

> Hi all,
> 
> there was recently a PR to NumPy to improve the performance of sin/cos
> on most platforms (on my laptop it seems to be about 5x on simple
> inputs).
> This changes the error bounds on platforms that were not previously
> accelerated (most users):
> 
> https://github.com/numpy/numpy/pull/23399
> 
> The new error is <4 ULP, similar to what it was before, but previously
> only on high-end Intel CPUs, which most users would not have noticed.
> And unfortunately, it is a bit unclear whether this is too disruptive
> or not.
> 
> The main surprise is probably that the range of both does not include 1
> (and -1) exactly with this, and quite a lot of downstream packages
> noticed this and needed test adaptations.
> 
> Now, most of these are harmless: users shouldn't expect exact results
> from floating point math and test tolerances need adjustment.  OTOH,
> sin/cos are practically 1/-1 on a wide range of inputs (they are
> basically constant) so it is surprising that they deviate from it and
> never reach 1/-1 exactly.
> 
> Since quite a few downstream libs noticed this, and NumPy users cannot
> explicitly opt in to a different performance/precision trade-off, the
> question is whether it would be better to revert for now
> and hope for a better solution.
> 
> I doubt we can decide on a very clear cut yes/no, but I am very
> interested what everyone thinks whether this precision trade-off is too
> surprising to users?
> 
> Cheers,
> 
> Sebastian
> 
> 


-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: Precision changes to sin/cos in the next release?

2023-05-31 Thread Jerome Kieffer
On Wed, 31 May 2023 15:59:45 +0300
Matti Picus  wrote:

> There is a discussion about a runtime context variable/manager that 
> would extend errorstate to have a precision flag as well in 
> https://github.com/numpy/numpy/issues/23362.

I like this idea ...

-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-01 Thread Jerome Kieffer
On Wed, 31 May 2023 19:15:12 -
"Ronald van Elburg"  wrote:

> I created a solution  for ENH: Computing std/var and mean at the same time, 
> issue #23741. The solution can be found here: 
> https://github.com/soundappraisal/numpy/tree/stdmean-dev-001 

I like the idea ... so often one calculates both of them and the second needs 
the first.
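For reference, the classic one-pass approach is Welford's algorithm (a sketch of the idea only; the linked branch presumably plugs into NumPy's reduction machinery at the C level rather than looping in Python):

```python
import numpy as np

def mean_std(a):
    """One-pass mean and (population) standard deviation via Welford.

    Assumes a non-empty input; the running mean and the sum of squared
    deviations (m2) are updated together in a single sweep.
    """
    mean = 0.0
    m2 = 0.0
    for n, x in enumerate(np.asarray(a, dtype=float).ravel(), start=1):
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, np.sqrt(m2 / n)

a = np.array([1.0, 2.0, 3.0, 4.0])
m, s = mean_std(a)
print(np.isclose(m, a.mean()), np.isclose(s, a.std()))  # True True
```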

-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-02 Thread Jerome Kieffer
On Fri, 02 Jun 2023 11:47:14 -
"Ronald van Elburg"  wrote:

> Aha, the unnecessary copy mentioned in the  
> https://dbs.ifi.uni-heidelberg.de/files/Team/eschubert/publications/SSDBM18-covariance-authorcopy.pdf.
>  paper is a copy of the input. Here it is about discarding a valuable output 
> (the mean) and then calculating that result separately.

I have been working a lot with this publication and found it very
interesting.
Nevertheless, I believe there is a bug when dealing with weighted
averages (Eq 22) ... but we can discuss it offline. None of the
authors answered my comments.

Since the PR is about unweighted means/std, the math presented there is (very 
likely) correct.

Cheers,

-- 
Jérôme Kieffer
tel +33 476 882 445


[Numpy-discussion] Re: mean_std function returning both mean and std

2023-06-03 Thread Jerome Kieffer
On Fri, 02 Jun 2023 21:42:51 -
"Ronald van Elburg"  wrote:

> I had a closer look at the paper. When I have more brain and time I
> may check the mathematics. The focus is however more on streaming
> data, which is an application with completely different demands. I
> think that here we can not afford to sample the data, which is an
> option in streaming database systems.

To be more precise, the "bug" I spotted is not in the math per se, but
in some of the definitions, which prevents some simplification later on ...

In Table 1, I believe the weights should be squared when calculating the
(co-)variance VWp, and one should normalize with the sum of omega squared
(Eq 13) ...
Later on, the simplification leading to Eq 21 and 22 no longer takes
place :(


The demonstration that the provided formula is wrong is simple:
multiplying all weights by any (very large or very small) number changes the
variance considerably, while it should not.
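That invariance is easy to check numerically (a sanity check of the scaling property itself, not of the paper's Eq 22):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1000)
w = rng.uniform(0.1, 1.0, size=1000)

def weighted_var(x, w):
    """Weighted (population) variance, normalized by the sum of weights."""
    mean = np.average(x, weights=w)
    return np.average((x - mean) ** 2, weights=w)

# Rescaling all weights by a huge constant must leave the variance unchanged,
# because the scale cancels in both the weighted mean and the normalization.
v1 = weighted_var(x, w)
v2 = weighted_var(x, 1e9 * w)
print(np.isclose(v1, v2))  # True
```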

Cheers,

Jerome


[Numpy-discussion] Re: Bumping CPU baseline to x86-64-v2

2025-05-18 Thread Jerome Kieffer
Hi,

First I would like to highlight that "X86_V2" is very specific to the
implementation in numpy: basically, it implies support for all 128-bit
SIMD extensions, i.e. SSE1, 2, 3 & 4, but apparently does not require
any 256-bit SIMD (AVX).

Ralf, the stats you are using come from end-user hardware, which gets
replaced much faster than servers, while numpy is very likely to be
used on older hardware, especially for CI, where older hardware is still
fine and often recycled. For example, we are still renting servers that
are X86_V2 but not X86_V3 for performing the CI/CD of our projects.

Cheers,

Jerome

On Sat, 17 May 2025 20:55:00 +0200
Ralf Gommers via NumPy-Discussion  wrote:

> On Sat, May 17, 2025 at 3:00 PM  wrote:
> 
> > Hi Sayed,
> >
> > I'm a bit confused: does your suggested change mean that pre-2009
> > processors won't work at all, or that no use will be made of the (little)
> > acceleration that they provided?  The latter seems fine, but not working at
> > all seems rather bad.  
> 
> 
> I don't think it's that bad. It's always a tradeoff between performance,
> binary size, effort, and compatibility. We've had this discussion before,
> e.g. when moving to SSE3 as the baseline, and decided that we should not
> drop support for a feature if less than 99.5% of CPUs in the wild has
> support, but if it's above we can do it. Previous bumps have not given any
> significant pushback from our user base.
> 
> From https://github.com/numpy/numpy/issues/27851, half a year ago:
> 
> *The most widely used data source for determining what hardware is out
> there is, I believe,
> https://store.steampowered.com/hwsurvey/?platform=combined. That
> currently says that SSE3 is at 100%, SSE4.1 at 99.78% and SSE4.2 at 99.70%.
> Meaning that if we bump the baseline up to SSE4.2, we'd only be dropping
> support for ~0.3% of systems with really old CPUs.*
> 
> SSE 4.2 support as of today increased to 99.78%, so the number of CPUs we'd
> drop support for moved from 0.3% to 0.22% in 6 months. That means we're
> still being very conservative after this bump, IIRC more so than when we
> dropped support for CPUs without SSE3.
> 
> Cheers,
> Ralf
> 
> 
> 
> > Though I'd think that for any old processors one can just fall back to the
> > standard `libm` implementations.
> >
> > All the best,
> >
> > Marten


-- 
Jérôme Kieffer


[Numpy-discussion] Re: Bumping CPU baseline to x86-64-v2

2025-06-13 Thread Jerome Kieffer
Hi Matti,

Sorry for the delay ...

In one of the projects I am working on, we use Avoton-based servers
(Intel C2350) for CI/CD, which can be rented today (2025) for less than 5€/month
at online.net (a French provider). Switching to a more recent generation
of processor (E3 1245v5) would imply a budget of at least 30€/month, which is
much more expensive.

Concerning the energy cost, I believe Online has optimized their costs
well, and if they still offer this kind of server at such a low price, it
probably means this processor is still doing its job (which I can
confirm) and that the manufacturing cost has already been paid off.

It is like the switch to electric cars: even if all car owners
(miraculously) switched to electric cars, the climate issue would not
(miraculously) be solved.

Cheers,
-- 
Jérôme Kieffer


On Sun, 18 May 2025 13:00:37 +0300
matti picus via NumPy-Discussion  wrote:

> Interesting. Could you give some more information that might convince NumPy
> to continue supporting these old machines? Renting implies you do not own
> them and are paying for the service. Are the energy/speed tradeoffs worth
> continuing with them, rather than asking the hosting service for a more
> modern machine? Do they use Numpy2.x in the CI/CD pipeline?



[Numpy-discussion] Re: Bumping CPU baseline to x86-64-v2

2025-06-14 Thread Jerome Kieffer
Hi Ralf,

There is no problem here except the idea that "any hardware more than
10 years old is deprecated and we can discontinue its support", which is
something Intel would love, but not me (nor anybody who has realized we
are living in a finite world).

Cheers,

Jerome

PS: processor clock frequencies have plateaued since about 2003, so the
benefit of replacing hardware is often limited.

On Fri, 13 Jun 2025 16:56:31 +0200
Ralf Gommers via NumPy-Discussion  wrote:

> On Fri, Jun 13, 2025 at 11:07 AM Jerome Kieffer 
> wrote:
> 
> > Hi Matti,
> >
> > Sorry for the delay ...
> >
> > In one of my project I am working on, we use based Avoton server
> > (Intel C2350) for CI/CD which can be rented today (2025) for less than
> > 5€/month
> > at online.net (a french provider). Switching to more recent generation
> > of processor (E3 1245v5) would imply at least 30€/month budget which is
> > much more expensive.
> >  
> 
> That CPU was released in Q3 2013 and does support the new x86-64-v2
> baseline we are proposing (as you already said above). So there is no
> problem here, is there?
> 
> Cheers,
> Ralf


-- 
Jérôme Kieffer