Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Antony Lee
re: no reason why...
This has nothing to do with Python2/Python3 (I personally stopped using
Python2 at least 3 years ago.)  Let me put it this way instead: if
Python3's "range" (or Python2's "xrange") was not a builtin type but a type
provided by numpy, I don't think it would be controversial at all to
provide an `__array__` special method to efficiently convert it to a
ndarray.  It would be the same if `np.array` used a
`functools.singledispatch` dispatcher rather than an `__array__` special
method (which is obviously not possible for chronological reasons).
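
A minimal sketch of what I mean (MyRange standing in, hypothetically, for
such a numpy-provided type):

import numpy as np

class MyRange:
    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step

    def __array__(self, dtype=None):
        # np.array(obj) picks this up and gets the ndarray directly,
        # instead of iterating over the object element by element
        return np.arange(self.start, self.stop, self.step, dtype=dtype)

np.array(MyRange(0, 5))  # -> array([0, 1, 2, 3, 4])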

re: iterable vs iterator: check for the presence of the __next__ special
method (or isinstance(x, Iterator) for an iterator, vs. isinstance(x,
Iterable) and not isinstance(x, Iterator) for a non-iterator iterable)
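
For example, something along these lines (using the ABCs from
collections.abc):

from collections.abc import Iterable, Iterator

def is_iterator(x):
    # iterators carry __next__ (every Iterator is also an Iterable)
    return isinstance(x, Iterator)

def is_plain_iterable(x):
    # iterable data structures (list, range, dict, ...) support iter()
    # but have no __next__ of their own
    return isinstance(x, Iterable) and not isinstance(x, Iterator)

assert is_plain_iterable(range(10))  # range is a sequence, not an iterator
assert is_iterator(iter(range(10)))  # iter() returns a range_iterator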

Antony

2016-02-13 18:48 GMT-08:00 :

>
>
> On Sat, Feb 13, 2016 at 9:43 PM,  wrote:
>
>>
>>
>> On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee 
>> wrote:
>>
>>> Compare (on Python3 -- for Python2, read "xrange" instead of "range"):
>>>
>>> In [2]: %timeit np.array(range(1000000), np.int64)
>>> 10 loops, best of 3: 156 ms per loop
>>>
>>> In [3]: %timeit np.arange(1000000, dtype=np.int64)
>>> 1000 loops, best of 3: 853 µs per loop
>>>
>>>
>>> Note that while iterating over a range is not very fast, it is still
>>> much better than the array creation:
>>>
>>> In [4]: from collections import deque
>>>
>>> In [5]: %timeit deque(range(1000000), 1)
>>> 10 loops, best of 3: 25.5 ms per loop
>>>
>>>
>>> On one hand, special cases are awful. On the other hand, the range
>>> builtin is probably important enough to deserve a special case to make this
>>> construction faster. Or not? I initially opened this as
>>> https://github.com/numpy/numpy/issues/7233 but it was suggested there
>>> that this should be discussed on the ML first.
>>>
>>> (The real issue which prompted this suggestion: I was building sparse
>>> matrices using scipy.sparse.csc_matrix with some indices specified using
>>> range, and that construction step turned out to take a significant portion
>>> of the time because of the calls to np.array).
>>>
>>
>>
>> IMO: I don't see a reason why this should be supported. There is
>> np.arange after all for this use case, and np.fromiter.
>> range and the other guys are iterators, and in several cases we can use
>> larange = list(range(...)) as a shortcut to get a python list, for python
>> 2/3 compatibility.
>>
>> I think this might be partially a learning effect in the python 2 to 3
>> transition. After using almost only python 3 for maybe a year, I don't
>> think it's difficult to remember the differences when writing code that is
>> py 2.7 and py 3.x compatible.
>>
>>
>> It's just **another** thing to watch out for if milliseconds matter in
>> your application.
>>
>
>
> side question: Is there a simple way to distinguish an iterator or
> generator from an iterable data structure?
>
> Josef


Re: [Numpy-discussion] ANN: numpydoc 0.6.0 released

2016-02-14 Thread Ralf Gommers
On Sat, Feb 13, 2016 at 10:38 PM,  wrote:

>
>
> On Sat, Feb 13, 2016 at 10:03 AM, Ralf Gommers 
> wrote:
>
>> Hi all,
>>
>> I'm pleased to announce the release of numpydoc 0.6.0. The main new
>> feature is support for the Yields section in numpy-style docstrings. This
>> is described in
>> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
>>
>> Numpydoc can be installed from PyPI:
>> https://pypi.python.org/pypi/numpydoc
>>
>
>
> Thanks,
>
> BTW: the status section in the howto still refers to the documentation
> editor, which has been retired AFAIK.
>

Thanks Josef. I sent a PR to remove that text.

Ralf


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread josef.pktd
On Sun, Feb 14, 2016 at 3:21 AM, Antony Lee  wrote:

> re: no reason why...
> This has nothing to do with Python2/Python3 (I personally stopped using
> Python2 at least 3 years ago.)  Let me put it this way instead: if
> Python3's "range" (or Python2's "xrange") was not a builtin type but a type
> provided by numpy, I don't think it would be controversial at all to
> provide an `__array__` special method to efficiently convert it to a
> ndarray.  It would be the same if `np.array` used a
> `functools.singledispatch` dispatcher rather than an `__array__` special
> method (which is obviously not possible for chronological reasons).
>
>
But numpy does provide arange.
What's the reason to use an iterator instead of np.arange?



> re: iterable vs iterator: check for the presence of the __next__ special
> method (or isinstance(x, Iterator) for an iterator, vs. isinstance(x,
> Iterable) and not isinstance(x, Iterator) for a non-iterator iterable)
>

AFAIR and from spot checking the mailing list, in the past the argument was
that it's too complicated to mix array/asarray creation with fromiter
building of arrays.

(I have no idea if array could cheaply delegate to fromiter.)
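
For reference, np.fromiter needs the dtype up front, and ideally a count so
it doesn't have to keep growing its buffer, e.g.

import numpy as np

a = np.fromiter(range(1000000), dtype=np.int64, count=1000000)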


Josef


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Ralf Gommers
On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee  wrote:

> re: no reason why...
> This has nothing to do with Python2/Python3 (I personally stopped using
> Python2 at least 3 years ago.)  Let me put it this way instead: if
> Python3's "range" (or Python2's "xrange") was not a builtin type but a type
> provided by numpy, I don't think it would be controversial at all to
> provide an `__array__` special method to efficiently convert it to a
> ndarray.  It would be the same if `np.array` used a
> `functools.singledispatch` dispatcher rather than an `__array__` special
> method (which is obviously not possible for chronological reasons).
>
> re: iterable vs iterator: check for the presence of the __next__ special
> method (or isinstance(x, Iterator) for an iterator, vs. isinstance(x,
> Iterable) and not isinstance(x, Iterator) for a non-iterator iterable)
>

I think it's good to do something about this, but it's not clear what the
exact proposal is. I could imagine one or both of:

  - special-case the range() object in array (and asarray/asanyarray?) such
that array(range(N)) becomes as fast as arange(N).
  - special-case all iterators, such that array(range(N)) becomes as fast
as deque(range(N))

or yet something else?
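
At the Python level the first option could look roughly like this (a
hypothetical wrapper, just to sketch the idea):

import numpy as np

def array_with_range_fast_path(obj, dtype=None):
    # a range is fully described by start/stop/step, so arange can
    # build the array directly instead of iterating over the object
    if isinstance(obj, range):
        return np.arange(obj.start, obj.stop, obj.step, dtype=dtype)
    return np.array(obj, dtype=dtype)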

Ralf


Re: [Numpy-discussion] Modulus (remainder) function corner cases

2016-02-14 Thread Nils Becker
2016-02-13 17:42 GMT+01:00 Charles R Harris :

> The Fortran modulo function, which is the same basic function as in my
>> branch, does not specify any bounds on the result for floating numbers, but
>> gives only the formula,  modulus(a, b) = a - b*floor(a/b), which has the
>> advantage of being simple and well defined ;)
>>
>
>
In the light of the libm-discussion I spent some time looking at floating
point functions and their accuracy. I would vote in favor of keeping an
implementation that uses the fmod-function of the system library and bends
it to adhere to the python convention (sign of divisor). There is probably
a reason why the fmod-implementation is not as simple as "a - b*floor(a/b)"
[1].
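
The bending itself is cheap; a sketch of the idea (essentially what CPython
does for float.__mod__):

import math

def py_mod(a, b):
    r = math.fmod(a, b)  # fmod: result takes the sign of the dividend
    if r != 0.0:
        if (r < 0.0) != (b < 0.0):
            r += b  # shift into the divisor's sign, as python's % requires
    else:
        r = math.copysign(0.0, b)  # signed zero also follows the divisor
    return r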

One obvious problem with the simple expression arises when a/b = 0.0 in
floating point. E.g.

In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y # this uses libm's fmod on my system
Out[46]: -10000000000.0 # == y, correctly rounded result in round-to-nearest
In [47]: x - y*np.floor(x/y) # this here is the naive expression
Out[47]: 9.9998886718268301e-321  # == x, wrong sign

There are other problems (a/b = inf in floating point). As I do not
understand the implementation of fmod (for example in openlibm) in detail I
cannot give a full list of corner cases.

Unfortunately, I did not follow the (many different) bug reports on this
issue originally and am confused why there was a need to change the
implementation in the first place. numpy's "%" operator seems to work quite
well on my system. Therefore, this mail may be rather unproductive as I am
missing some information.

Concerning your original question: Many elementary functions lose their
mathematical properties when they are calculated correctly rounded in
floating point numbers [2]. We do not fix this for other functions; I would
not fix it here.

Cheers
Nils

[1] https://github.com/JuliaLang/openlibm/blob/master/src/e_fmod.c
[2] np.exp(np.nextafter(1.0, 0.0)) < np.e -> False (Monotonicity lost in
exp(x)).


Re: [Numpy-discussion] Modulus (remainder) function corner cases

2016-02-14 Thread Charles R Harris
On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker 
wrote:

> [...]
>
> One obvious problem with the simple expression arises when a/b = 0.0 in
> floating point. E.g.
>
> In [43]: np.__version__
> Out[43]: '1.10.4'
> In [44]: x = np.float64(1e-320)
> In [45]: y = np.float64(-1e10)
> In [46]: x % y # this uses libm's fmod on my system
> Out[46]: -10000000000.0 # == y, correctly rounded result in
> round-to-nearest
> In [47]: x - y*np.floor(x/y) # this here is the naive expression
> Out[47]: 9.9998886718268301e-321  # == x, wrong sign
>

But more accurate ;) Currently, this is actually clipped.

In [3]: remainder(x,y)
Out[3]: -0.0

In [4]: x - y*floor(x/y)
Out[4]: 9.9998886718268301e-321

In [5]: fmod(x,y)
Out[5]: 9.9998886718268301e-321



> There are other problems (a/b = inf in floating point). As I do not
> understand the implementation of fmod (for example in openlibm) in detail I
> cannot give a full list of corner cases.
>

?



Chuck


Re: [Numpy-discussion] Modulus (remainder) function corner cases

2016-02-14 Thread Charles R Harris
On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker 
> wrote:
>
>> [...]
>> One obvious problem with the simple expression arises when a/b = 0.0 in
>> floating point. E.g.
>>
>> In [43]: np.__version__
>> Out[43]: '1.10.4'
>> In [44]: x = np.float64(1e-320)
>> In [45]: y = np.float64(-1e10)
>> In [46]: x % y # this uses libm's fmod on my system
>>
>
I'm not too worried about denormals. However, this might be considered a
bug in the floor function

In [16]: floor(-1e-330)
Out[16]: -0.0

Chuck


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Antony Lee
I was thinking (1) (special-case range()); however (2) may be more
generally applicable and useful.

Antony

2016-02-14 6:36 GMT-08:00 Ralf Gommers :

>
>
> On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee 
> wrote:
>
>> [...]
>
> I think it's good to do something about this, but it's not clear what the
> exact proposal is. I could imagine one or both of:
>
>   - special-case the range() object in array (and asarray/asanyarray?)
> such that array(range(N)) becomes as fast as arange(N).
>   - special-case all iterators, such that array(range(N)) becomes as fast
> as deque(range(N))
>
> or yet something else?
>
> Ralf


Re: [Numpy-discussion] Modulus (remainder) function corner cases

2016-02-14 Thread Charles R Harris
On Sun, Feb 14, 2016 at 1:11 PM, Charles R Harris  wrote:

>
> [...]
>
> I'm not too worried about denormals. However, this might be considered a
> bug in the floor function
>
> In [16]: floor(-1e-330)
> Out[16]: -0.0
>
>
However, I do note that some languages offer two versions of modulus, one
floor-based and the other trunc-based (effectively fmod). What I wanted was
to keep the remainder consistent with the floor function in the C library.
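
For concreteness, numpy exposes both conventions:

import numpy as np

np.remainder(-7.0, 3.0)  # -> 2.0, floor-based: sign follows the divisor
np.fmod(-7.0, 3.0)       # -> -1.0, trunc-based: sign follows the dividend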

Chuck


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Charles R Harris
On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers 
wrote:

>
> [...]
>
> I think it's good to do something about this, but it's not clear what the
> exact proposal is. I could imagine one or both of:
>
>   - special-case the range() object in array (and asarray/asanyarray?)
> such that array(range(N)) becomes as fast as arange(N).
>   - special-case all iterators, such that array(range(N)) becomes as fast
> as deque(range(N))
>

I think the last wouldn't help much, as numpy would still need to determine
dimensions and type.  I assume that is one of the reasons sparse itself
doesn't do that.

Chuck


[Numpy-discussion] Numexpr-3.0 proposal

2016-02-14 Thread Robert McLeod
Hello everyone,

I've done some work on making a new version of Numexpr that would fix some
of the limitations of the original virtual machine with regard to data
types and operation/function count. Basically I re-wrote the Python and C
sides to use 4-byte words, instead of null-terminated strings, for
operations and passing types.  This means the number of operations and
types isn't significantly limited anymore.

Francesc Alted suggested I should come here and get some advice from the
community. I wrote a short proposal on the Wiki here:

https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview

One can see my branch here:

https://github.com/robbmcleod/numexpr/tree/numexpr-3.0

If anyone has any comments they'd be welcome. Questions from my side for
the group:

1.) Numpy casting: I downloaded the Numpy source and after browsing it
seems the best approach is probably to just use
numpy.core.numerictypes.find_common_type?
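
A quick check of its behavior, if I'm reading the docs right (scalar types
only dominate when they are of a higher kind than the array types):

import numpy as np

np.find_common_type([np.int32], [np.float64])    # -> dtype('float64')
np.find_common_type([np.float32], [np.float64])  # -> dtype('float32')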

2.) Can anyone foresee any issues with casting built-in Python types (i.e.
float and integer) to their OS-dependent numpy equivalents? Numpy already
seems to do this.
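
A quick illustration of the current behavior; the integer case is the
platform-dependent one, since np.int_ maps to a C long:

import numpy as np

np.array(3).dtype    # int64 on most 64-bit Unix, int32 on 64-bit Windows
np.array(3.0).dtype  # float64 everywhere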

3.) Is anyone enabling the Intel VML library? There are a number of
comments in the code that suggest it's not accelerating the code. It also
seems to cause problems with bundling numexpr with cx_freeze.

4.) I took a stab at converting from distutils to setuptools but this seems
challenging with numpy as a dependency. I wonder if anyone has tried
monkey-patching so that setup.py build_ext uses distutils and then passes the
interpreter.pyd/so as a data file, or some other such chicanery?

(I was going to ask about attaching a debugger, but I just noticed:
https://wiki.python.org/moin/DebuggingWithGdb   )

Ciao,

Robert

-- 
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch 
robbmcl...@gmail.com


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Ralf Gommers
On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers 
> wrote:
>
>>
>>
>> [...]
>>
>> I think it's good to do something about this, but it's not clear what the
>> exact proposal is. I could imagine one or both of:
>>
>>   - special-case the range() object in array (and asarray/asanyarray?)
>> such that array(range(N)) becomes as fast as arange(N).
>>   - special-case all iterators, such that array(range(N)) becomes as fast
>> as deque(range(N))
>>
>
> I think the last wouldn't help much, as numpy would still need to
> determine dimensions and type.  I assume that is one of the reason sparse
> itself doesn't do that.
>

Not orders of magnitude, but this shows that there's something to optimize
for iterators:

In [1]: %timeit np.array(range(100000))
100 loops, best of 3: 14.9 ms per loop

In [2]: %timeit np.array(list(range(100000)))
100 loops, best of 3: 9.68 ms per loop

Ralf


Re: [Numpy-discussion] Numexpr-3.0 proposal

2016-02-14 Thread Ralf Gommers
On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod 
wrote:

>
> 4.) I took a stab at converting from distutils to setuptools but this
> seems challenging with numpy as a dependency. I wonder if anyone has tried
> monkey-patching so that setup.py build_ext uses distutils and then passes the
> interpreter.pyd/so as a data file, or some other such chicanery?
>

Not sure what you mean, since numexpr already uses setuptools:
https://github.com/pydata/numexpr/blob/master/setup.py#L22. What is the
real goal you're trying to achieve?

This monkeypatching is a bad idea:
https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both
setuptools and numpy.distutils already do that, and that's already one too
many, so you definitely don't want to add a third place. You can use the
-j (--parallel) flag to numpy.distutils instead, see
http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds

Ralf


Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster

2016-02-14 Thread Antony Lee
I wonder whether numpy is using the "old" iteration protocol (repeatedly
calling x[i] for increasing i until IndexError is raised?)  A quick
timing shows that it is indeed slower.
... actually it's not even clear to me what qualifies as a sequence for
`np.array`:

class C:
    def __iter__(self):
        return iter(range(10))  # [0 ... 9] under the new iteration protocol
    def __getitem__(self, i):
        raise IndexError  # [] under the old iteration protocol

np.array(C())
===> array(<__main__.C object at 0x7f3f2128>, dtype=object)


So how can np.array(range(...)) even work?
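
(One hint: unlike C above, range is a genuine sequence, with __len__ and a
__getitem__ that only raises IndexError out of bounds,

r = range(5)
len(r)  # 5
r[3]    # 3

so np.array can presumably go through the sequence protocol rather than
either iteration protocol.)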
