Re: [Numpy-discussion] Windows wheels, built, but should we deploy?

2016-03-04 Thread David Cournapeau
On Fri, Mar 4, 2016 at 4:42 AM, Matthew Brett 
wrote:

> Hi,
>
> Summary:
>
> I propose that we upload Windows wheels to pypi.  The wheels are
> likely to be stable and relatively easy to maintain, but will have
> slower performance than other versions of numpy linked against faster
> BLAS / LAPACK libraries.
>
> Background:
>
> There's a long discussion going on at github issue #5479 [1], where
> the old problem of Windows wheels for numpy came up.
>
> For those of you not following this issue, the current situation for
> community-built numpy Windows binaries is dire:
>
> * We have not so far provided windows wheels on pypi, so `pip install
> numpy` on Windows will bring you a world of pain;
> * Until recently we did provide .exe "superpack" installers on
> sourceforge, but these became increasingly difficult to build and we
> gave up building them as of the latest (1.10.4) release.
>
> Despite this, popularity of Windows wheels on pypi is high.   A few
> weeks ago, Donald Stufft ran a query for the binary wheels most often
> downloaded from pypi, for any platform [2] . The top five most
> downloaded were (n_downloads, name):
>
> 6646,
> numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
> 5445, cryptography-1.2.1-cp27-none-win_amd64.whl
> 5243, matplotlib-1.4.0-cp34-none-win32.whl
> 5241, scikit_learn-0.15.1-cp34-none-win32.whl
> 4573, pandas-0.17.1-cp27-none-win_amd64.whl
>
> So a) the OSX numpy wheel is very popular and b) despite the fact that
> we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
> and pandas, which depend on numpy, are the 3rd, 4th and 5th most
> downloaded wheels as of a few weeks ago.
>
> So, there seems to be a large appetite for numpy wheels.
>
> Current proposal:
>
> I have now built numpy wheels, using the ATLAS blas / lapack library -
> the build is automatic and reproducible [3].
>
> I chose ATLAS to build against, rather than, say OpenBLAS, because
> we've had some significant worries in the past about the reliability
> of OpenBLAS, and I thought it better to err on the side of
> correctness.
>
> However, these builds are relatively slow for matrix multiply and
> other linear algebra routines compared to numpy built against OpenBLAS
> or MKL (which we cannot use because of its license) [4].  In my very
> crude array test of a dot product and matrix inversion, the ATLAS
> wheels were 2-3 times slower than MKL.  Other benchmarks on Julia
> found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
> much bigger difference on 64-bit (for an earlier version of ATLAS than
> we are currently using) [5].
>
> So, our numpy wheels are likely to be stable and give correct results,
> but will be somewhat slow for linear algebra.
>

I would not worry too much about this: at worst, this gives us back the
situation we were in with the so-called superpack installers, which have
been successful in the past at spreading numpy use on Windows.

My main worry is whether this locks us into ATLAS for a long time because
of packages depending on numpy's BLAS/LAPACK (scipy, scikit-learn). I am
not sure how much this is the case.

David


>
> I propose that we upload these ATLAS wheels to pypi.  The upside is
> that this gives our Windows users a much better experience with pip,
> and allows other developers to build Windows wheels that depend on
> numpy.  The downside is that these will not be optimized for
> performance on modern processors.  In order to signal that, I propose
> adding the following text to the numpy pypi front page:
>
> ```
> All numpy wheels distributed from pypi are BSD licensed.
>
> Windows wheels are linked against the ATLAS BLAS / LAPACK library,
> restricted to SSE2 instructions, so may not give optimal linear
> algebra performance for your machine. See
> http://docs.scipy.org/doc/numpy/user/install.html for alternatives.
> ```
>
> In a way this is very similar to our previous situation, in that the
> superpack installers also used ATLAS - in fact an older version of
> ATLAS.
>
> Once we are up and running with numpy wheels, we can consider whether
> we should switch to other BLAS libraries, such as OpenBLAS or BLIS
> (see [6]).
>
> I'm posting here hoping for your feedback...
>
> Cheers,
>
> Matthew
>
>
> [1] https://github.com/numpy/numpy/issues/5479
> [2] https://gist.github.com/dstufft/1dda9a9f87ee7121e0ee
> [3] https://ci.appveyor.com/project/matthew-brett/np-wheel-builder
> [4] http://mingwpy.github.io/blas_lapack.html#intel-math-kernel-library
> [5] https://github.com/numpy/numpy/issues/5479#issuecomment-185033668
> [6] https://github.com/numpy/numpy/issues/7372


[Numpy-discussion] Ufunc identity for bitwise reduction of object arrays.

2016-03-04 Thread Charles R Harris
Hi All,

There is currently some discussion on whether or not object arrays
should have an identity for bitwise reductions. Currently, object arrays do
not use the identity for non-empty arrays, so this would only affect
reductions on empty arrays. Currently bitwise_or, bitwise_xor, and
bitwise_and will return (bool_) 0, (bool_) 0, and (int) -1 respectively in
that case. Note that non-object arrays work as they should; the question is
only about object arrays.
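
A minimal sketch of the case in question (the commented results are the
current behavior described above; they may differ in other numpy versions):

```
import numpy as np

empty_obj = np.array([], dtype=object)

# Reductions over an *empty* object array -- the only case affected:
print(np.bitwise_or.reduce(empty_obj))   # currently (bool_) 0
print(np.bitwise_xor.reduce(empty_obj))  # currently (bool_) 0
print(np.bitwise_and.reduce(empty_obj))  # currently (int) -1

# Non-empty object arrays never use the ufunc identity, so they are
# unaffected either way:
print(np.bitwise_and.reduce(np.array([12, 10], dtype=object)))  # 8
```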

Thoughts?

Chuck


Re: [Numpy-discussion] Windows wheels, built, but should we deploy?

2016-03-04 Thread Matthew Brett
On Fri, Mar 4, 2016 at 12:29 AM, David Cournapeau  wrote:
>
>
> On Fri, Mar 4, 2016 at 4:42 AM, Matthew Brett 
> wrote:
>>
>> Hi,
>>
>> Summary:
>>
>> I propose that we upload Windows wheels to pypi.  The wheels are
>> likely to be stable and relatively easy to maintain, but will have
>> slower performance than other versions of numpy linked against faster
>> BLAS / LAPACK libraries.
>>
>> Background:
>>
>> There's a long discussion going on at github issue #5479 [1], where
>> the old problem of Windows wheels for numpy came up.
>>
>> For those of you not following this issue, the current situation for
>> community-built numpy Windows binaries is dire:
>>
>> * We have not so far provided windows wheels on pypi, so `pip install
>> numpy` on Windows will bring you a world of pain;
>> * Until recently we did provide .exe "superpack" installers on
>> sourceforge, but these became increasingly difficult to build and we
>> gave up building them as of the latest (1.10.4) release.
>>
>> Despite this, popularity of Windows wheels on pypi is high.   A few
>> weeks ago, Donald Stufft ran a query for the binary wheels most often
>> downloaded from pypi, for any platform [2] . The top five most
>> downloaded were (n_downloads, name):
>>
>> 6646,
>> numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
>> 5445, cryptography-1.2.1-cp27-none-win_amd64.whl
>> 5243, matplotlib-1.4.0-cp34-none-win32.whl
>> 5241, scikit_learn-0.15.1-cp34-none-win32.whl
>> 4573, pandas-0.17.1-cp27-none-win_amd64.whl
>>
>> So a) the OSX numpy wheel is very popular and b) despite the fact that
>> we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
>> and pandas, which depend on numpy, are the 3rd, 4th and 5th most
>> downloaded wheels as of a few weeks ago.
>>
>> So, there seems to be a large appetite for numpy wheels.
>>
>> Current proposal:
>>
>> I have now built numpy wheels, using the ATLAS blas / lapack library -
>> the build is automatic and reproducible [3].
>>
>> I chose ATLAS to build against, rather than, say OpenBLAS, because
>> we've had some significant worries in the past about the reliability
>> of OpenBLAS, and I thought it better to err on the side of
>> correctness.
>>
>> However, these builds are relatively slow for matrix multiply and
>> other linear algebra routines compared to numpy built against OpenBLAS
>> or MKL (which we cannot use because of its license) [4].  In my very
>> crude array test of a dot product and matrix inversion, the ATLAS
>> wheels were 2-3 times slower than MKL.  Other benchmarks on Julia
>> found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
>> much bigger difference on 64-bit (for an earlier version of ATLAS than
>> we are currently using) [5].
>>
>> So, our numpy wheels are likely to be stable and give correct results,
>> but will be somewhat slow for linear algebra.
>
>
> I would not worry too much about this: at worst, this gives us back the
> situation we were in with the so-called superpack installers, which have
> been successful in the past at spreading numpy use on Windows.
>
> My main worry is whether this locks us into ATLAS for a long time because
> of packages depending on numpy's BLAS/LAPACK (scipy, scikit-learn). I am
> not sure how much this is the case.

You mean the situation where other packages try to find the BLAS /
LAPACK library and link against that?  My impression was that neither
scipy nor scikit-learn does that at the moment, but I'm happy to be
corrected.
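
For concreteness, the kind of coupling I have in mind is a downstream
setup.py asking numpy's build machinery for a BLAS / LAPACK to build
against, roughly as in the sketch below; whether scipy or scikit-learn
actually tie themselves to numpy's own BLAS this way on Windows is exactly
what I'm unsure of:

```
# Sketch only: how a downstream package's setup.py can ask
# numpy.distutils for BLAS / LAPACK build information.  This reports
# whatever the build machine is configured with (ATLAS, OpenBLAS, MKL,
# ...), via site.cfg or environment variables.
from numpy.distutils.system_info import get_info

blas_info = get_info('blas_opt')
lapack_info = get_info('lapack_opt')

print(blas_info.get('libraries', []))
print(lapack_info.get('libraries', []))
```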

You'd know better than me about this, but my understanding is that
BLAS / LAPACK has a standard interface that should allow code to run
the same way, regardless of which BLAS / LAPACK library it is linking
to.  So, even if another package is trying to link against the numpy
BLAS, swapping the numpy BLAS library shouldn't cause a problem
(unless the package is trying to link to ATLAS-specific stuff, which
seems a bit unlikely).

Is that right?

Cheers,

Matthew


Re: [Numpy-discussion] Windows wheels, built, but should we deploy?

2016-03-04 Thread Chris Barker
+1 -- thanks for doing all this work.

There is a HUGE amount you can do with numpy that doesn't give a whit about
how fast .dot() et al. are. If you really do need that to be as fast as
possible, you can plug in a faster build later.

This is great.

Just as one example -- I teach a general python class every year -- I do
only one session on numpy/scipy. If I can expect my students to be able to
simply pip install the core scipy stack, this will be SO much easier.

-CHB


On Thu, Mar 3, 2016 at 8:42 PM, Matthew Brett 
wrote:

> Hi,
>
> Summary:
>
> I propose that we upload Windows wheels to pypi.  The wheels are
> likely to be stable and relatively easy to maintain, but will have
> slower performance than other versions of numpy linked against faster
> BLAS / LAPACK libraries.
>
> Background:
>
> There's a long discussion going on at github issue #5479 [1], where
> the old problem of Windows wheels for numpy came up.
>
> For those of you not following this issue, the current situation for
> community-built numpy Windows binaries is dire:
>
> * We have not so far provided windows wheels on pypi, so `pip install
> numpy` on Windows will bring you a world of pain;
> * Until recently we did provide .exe "superpack" installers on
> sourceforge, but these became increasingly difficult to build and we
> gave up building them as of the latest (1.10.4) release.
>
> Despite this, popularity of Windows wheels on pypi is high.   A few
> weeks ago, Donald Stufft ran a query for the binary wheels most often
> downloaded from pypi, for any platform [2] . The top five most
> downloaded were (n_downloads, name):
>
> 6646,
> numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
> 5445, cryptography-1.2.1-cp27-none-win_amd64.whl
> 5243, matplotlib-1.4.0-cp34-none-win32.whl
> 5241, scikit_learn-0.15.1-cp34-none-win32.whl
> 4573, pandas-0.17.1-cp27-none-win_amd64.whl
>
> So a) the OSX numpy wheel is very popular and b) despite the fact that
> we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
> and pandas, which depend on numpy, are the 3rd, 4th and 5th most
> downloaded wheels as of a few weeks ago.
>
> So, there seems to be a large appetite for numpy wheels.
>
> Current proposal:
>
> I have now built numpy wheels, using the ATLAS blas / lapack library -
> the build is automatic and reproducible [3].
>
> I chose ATLAS to build against, rather than, say OpenBLAS, because
> we've had some significant worries in the past about the reliability
> of OpenBLAS, and I thought it better to err on the side of
> correctness.
>
> However, these builds are relatively slow for matrix multiply and
> other linear algebra routines compared to numpy built against OpenBLAS
> or MKL (which we cannot use because of its license) [4].  In my very
> crude array test of a dot product and matrix inversion, the ATLAS
> wheels were 2-3 times slower than MKL.  Other benchmarks on Julia
> found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
> much bigger difference on 64-bit (for an earlier version of ATLAS than
> we are currently using) [5].
>
> So, our numpy wheels are likely to be stable and give correct results,
> but will be somewhat slow for linear algebra.
>
> I propose that we upload these ATLAS wheels to pypi.  The upside is
> that this gives our Windows users a much better experience with pip,
> and allows other developers to build Windows wheels that depend on
> numpy.  The downside is that these will not be optimized for
> performance on modern processors.  In order to signal that, I propose
> adding the following text to the numpy pypi front page:
>
> ```
> All numpy wheels distributed from pypi are BSD licensed.
>
> Windows wheels are linked against the ATLAS BLAS / LAPACK library,
> restricted to SSE2 instructions, so may not give optimal linear
> algebra performance for your machine. See
> http://docs.scipy.org/doc/numpy/user/install.html for alternatives.
> ```
>
> In a way this is very similar to our previous situation, in that the
> superpack installers also used ATLAS - in fact an older version of
> ATLAS.
>
> Once we are up and running with numpy wheels, we can consider whether
> we should switch to other BLAS libraries, such as OpenBLAS or BLIS
> (see [6]).
>
> I'm posting here hoping for your feedback...
>
> Cheers,
>
> Matthew
>
>
> [1] https://github.com/numpy/numpy/issues/5479
> [2] https://gist.github.com/dstufft/1dda9a9f87ee7121e0ee
> [3] https://ci.appveyor.com/project/matthew-brett/np-wheel-builder
> [4] http://mingwpy.github.io/blas_lapack.html#intel-math-kernel-library
> [5] https://github.com/numpy/numpy/issues/5479#issuecomment-185033668
> [6] https://github.com/numpy/numpy/issues/7372



-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA

Re: [Numpy-discussion] GSoC?

2016-03-04 Thread Pauli Virtanen
Thu, 11 Feb 2016 00:02:52 +0100, Ralf Gommers wrote:
[clip]
> OK first version:
> https://github.com/scipy/scipy/wiki/GSoC-2016-project-ideas I kept some
> of the ideas from last year, but removed all potential mentors as the
> same people may not be available this year - please re-add yourselves
> where needed.
> 
> And to everyone who has a good idea, and preferably is willing to mentor
> for that idea: please add it to that page.

I probably don't have bandwidth for mentoring, but as the Numpy 
suggestions seem to be mostly "hard" problems, we can add another 
one:

## Dealing with overlapping input/output data

Numpy operations where output arrays overlap with 
input arrays can produce unexpected results.
A simple example is
```
x = np.arange(100*100).reshape(100,100)
x += x.T  # <- undefined result!
```
The task is to change Numpy so that the results
here become the same as if the input arrays
overlapping with the output were separate (here: `x += x.T.copy()`).
The challenge here lies in doing this without sacrificing 
too much performance or memory efficiency.

Initial steps toward solving this problem were taken in
https://github.com/numpy/numpy/pull/6166
where the simplest available algorithm for detecting
if arrays overlap was added. However, this is not yet
utilized in ufuncs. An initial attempt to sketch what 
should be done is at https://github.com/numpy/numpy/issues/6272
and issues referenced therein.
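
As a starting point, the overlap check from that pull request is already
usable from Python; below is a minimal sketch of detecting the problem and
falling back to a copy (assuming a numpy new enough to expose
`np.shares_memory`), which a ufunc-level fix would do more cheaply:

```
import numpy as np

def safe_iadd(out, other):
    # Sketch only: behave as if `other` were copied up front.
    # np.shares_memory is the exact check added in the PR above;
    # np.may_share_memory is the cheap bounds-based variant.
    if np.shares_memory(out, other):
        other = other.copy()
    out += other
    return out

x = np.arange(100 * 100, dtype=float).reshape(100, 100)
y = x.copy()
safe_iadd(x, x.T)            # well-defined
y += y.T.copy()              # reference result
print(np.array_equal(x, y))  # True
```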



Re: [Numpy-discussion] Windows wheels, built, but should we deploy?

2016-03-04 Thread josef.pktd
On Fri, Mar 4, 2016 at 1:38 PM, Matthew Brett 
wrote:

> On Fri, Mar 4, 2016 at 12:29 AM, David Cournapeau 
> wrote:
> >
> >
> > On Fri, Mar 4, 2016 at 4:42 AM, Matthew Brett 
> > wrote:
> >>
> >> Hi,
> >>
> >> Summary:
> >>
> >> I propose that we upload Windows wheels to pypi.  The wheels are
> >> likely to be stable and relatively easy to maintain, but will have
> >> slower performance than other versions of numpy linked against faster
> >> BLAS / LAPACK libraries.
> >>
> >> Background:
> >>
> >> There's a long discussion going on at github issue #5479 [1], where
> >> the old problem of Windows wheels for numpy came up.
> >>
> >> For those of you not following this issue, the current situation for
> >> community-built numpy Windows binaries is dire:
> >>
> >> * We have not so far provided windows wheels on pypi, so `pip install
> >> numpy` on Windows will bring you a world of pain;
> >> * Until recently we did provide .exe "superpack" installers on
> >> sourceforge, but these became increasingly difficult to build and we
> >> gave up building them as of the latest (1.10.4) release.
> >>
> >> Despite this, popularity of Windows wheels on pypi is high.   A few
> >> weeks ago, Donald Stufft ran a query for the binary wheels most often
> >> downloaded from pypi, for any platform [2] . The top five most
> >> downloaded were (n_downloads, name):
> >>
> >> 6646,
> >>
> numpy-1.10.4-cp27-none-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl
> >> 5445, cryptography-1.2.1-cp27-none-win_amd64.whl
> >> 5243, matplotlib-1.4.0-cp34-none-win32.whl
> >> 5241, scikit_learn-0.15.1-cp34-none-win32.whl
> >> 4573, pandas-0.17.1-cp27-none-win_amd64.whl
> >>
> >> So a) the OSX numpy wheel is very popular and b) despite the fact that
> >> we don't provide a numpy wheel for Windows, matplotlib, scikit_learn
> >> and pandas, which depend on numpy, are the 3rd, 4th and 5th most
> >> downloaded wheels as of a few weeks ago.
> >>
> >> So, there seems to be a large appetite for numpy wheels.
> >>
> >> Current proposal:
> >>
> >> I have now built numpy wheels, using the ATLAS blas / lapack library -
> >> the build is automatic and reproducible [3].
> >>
> >> I chose ATLAS to build against, rather than, say OpenBLAS, because
> >> we've had some significant worries in the past about the reliability
> >> of OpenBLAS, and I thought it better to err on the side of
> >> correctness.
> >>
> >> However, these builds are relatively slow for matrix multiply and
> >> other linear algebra routines compared to numpy built against OpenBLAS
> >> or MKL (which we cannot use because of its license) [4].  In my very
> >> crude array test of a dot product and matrix inversion, the ATLAS
> >> wheels were 2-3 times slower than MKL.  Other benchmarks on Julia
> >> found about the same result for ATLAS vs OpenBLAS on 32-bit, but a
> >> much bigger difference on 64-bit (for an earlier version of ATLAS than
> >> we are currently using) [5].
> >>
> >> So, our numpy wheels are likely to be stable and give correct results,
> >> but will be somewhat slow for linear algebra.
> >
> >
> > I would not worry too much about this: at worst, this gives us back the
> > situation we were in with the so-called superpack installers, which have
> > been successful in the past at spreading numpy use on Windows.
> >
> > My main worry is whether this locks us into ATLAS for a long time
> > because of packages depending on numpy's BLAS/LAPACK (scipy,
> > scikit-learn). I am not sure how much this is the case.
>
> You mean the situation where other packages try to find the BLAS /
> LAPACK library and link against that?  My impression was that neither
> scipy nor scikit-learn does that at the moment, but I'm happy to be
> corrected.
>
> You'd know better than me about this, but my understanding is that
> BLAS / LAPACK has a standard interface that should allow code to run
> the same way, regardless of which BLAS / LAPACK library it is linking
> to.  So, even if another package is trying to link against the numpy
> BLAS, swapping the numpy BLAS library shouldn't cause a problem
> (unless the package is trying to link to ATLAS-specific stuff, which
> seems a bit unlikely).
>
> Is that right?
>


AFAIK, numpy doesn't provide access to BLAS/LAPACK; scipy does. statsmodels
links to the installed BLAS/LAPACK in cython code through scipy. So far we
haven't seen problems with different versions. I think scipy development
does very well at isolating linalg library version-specific parts from the
user interface.
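
(For reference, the scipy route is the scipy.linalg.blas /
scipy.linalg.cython_blas wrappers; a minimal Python-level sketch, which
works the same whichever BLAS library scipy happens to be linked against:)

```
import numpy as np
from scipy.linalg import get_blas_funcs

a = np.random.rand(4, 3)
b = np.random.rand(3, 5)

# Look up the BLAS routine matching these arrays (dgemm here); the call
# is identical whether scipy is linked to ATLAS, OpenBLAS or MKL.
gemm, = get_blas_funcs(('gemm',), (a, b))
c = gemm(1.0, a, b)

print(np.allclose(c, a.dot(b)))  # True
```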

AFAIU, the main problem will be linking to inconsistent Fortran libraries
in downstream packages that use Fortran.
E.g., AFAIU it won't work to pip install an ATLAS-based numpy and then
install an MKL-based scipy from Gohlke.

I don't know if there is a useful error message, or if this just results in
puzzled users.

Josef



>
> Cheers,
>
> Matthew

Re: [Numpy-discussion] Windows wheels, built, but should we deploy?

2016-03-04 Thread Nathaniel Smith
On Fri, Mar 4, 2016 at 7:30 PM,   wrote:
[...]
> AFAIK, numpy doesn't provide access to BLAS/LAPACK; scipy does. statsmodels
> links to the installed BLAS/LAPACK in cython code through scipy. So far
> we haven't seen problems with different versions. I think scipy development
> does very well at isolating linalg library version-specific parts from the
> user interface.

Yeah, it should be invisible to users of both numpy and scipy which
BLAS/LAPACK is in use under the hood.

>
> AFAIU, the main problem will be linking to inconsistent Fortran libraries
> in downstream packages that use Fortran.
> E.g., AFAIU it won't work to pip install an ATLAS-based numpy and then
> install an MKL-based scipy from Gohlke.

The specific scenario you describe will be a problem, but not for the
reason you state -- the problem is that (IIUC) the Gohlke scipy build
has some specific hacks where it "knows" that it can find a copy of
MKL buried at a particular location inside the numpy package (and the
Gohlke numpy build has a specific hack to put a copy of MKL there). So
the Gohlke scipy requires the Gohlke numpy, but this is due to patches
that Christoph applies to his builds.

AFAIK, outside of downstream packages that poke around the inside of
numpy like this, there should be no way for downstream packages to
know or care which BLAS/LAPACK implementation numpy is using (except
for speed, bugs, etc.).
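
(The supported way to see which BLAS/LAPACK a given numpy was built
against is its build-time configuration, e.g. as in this sketch; the
`np.__config__` attribute names vary between builds, so treat it as a
best-effort check:)

```
import numpy as np

# Print the build-time BLAS / LAPACK configuration of this numpy:
np.show_config()

# Roughly the same information, programmatically (attribute names are
# build-dependent, hence the getattr fallback):
blas_info = getattr(np.__config__, 'blas_opt_info', {})
print(blas_info.get('libraries', 'unknown'))
```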

-n

-- 
Nathaniel J. Smith -- https://vorpus.org