Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
re: no reason why...

This has nothing to do with Python 2/Python 3 (I personally stopped using Python 2 at least 3 years ago). Let me put it this way instead: if Python 3's "range" (or Python 2's "xrange") were not a builtin type but a type provided by numpy, I don't think it would be controversial at all to provide an `__array__` special method to efficiently convert it to an ndarray. It would be the same if `np.array` used a `functools.singledispatch` dispatcher rather than an `__array__` special method (which is obviously not possible for chronological reasons).

re: iterable vs iterator: check for the presence of the `__next__` special method (or use isinstance(x, Iterator) for iterators, and isinstance(x, Iterable) and not isinstance(x, Iterator) for iterable data structures that are not themselves iterators).

Antony

2016-02-13 18:48 GMT-08:00 :
>
> On Sat, Feb 13, 2016 at 9:43 PM, wrote:
>>
>> On Sat, Feb 13, 2016 at 8:57 PM, Antony Lee wrote:
>>>
>>> Compare (on Python 3 -- for Python 2, read "xrange" instead of "range"):
>>>
>>> In [2]: %timeit np.array(range(1000000), np.int64)
>>> 10 loops, best of 3: 156 ms per loop
>>>
>>> In [3]: %timeit np.arange(1000000, dtype=np.int64)
>>> 1000 loops, best of 3: 853 µs per loop
>>>
>>> Note that while iterating over a range is not very fast, it is still
>>> much better than the array creation:
>>>
>>> In [4]: from collections import deque
>>>
>>> In [5]: %timeit deque(range(1000000), 1)
>>> 10 loops, best of 3: 25.5 ms per loop
>>>
>>> On one hand, special cases are awful. On the other hand, the range
>>> builtin is probably important enough to deserve a special case to make
>>> this construction faster. Or not? I initially opened this as
>>> https://github.com/numpy/numpy/issues/7233 but it was suggested there
>>> that this should be discussed on the ML first.
>>>
>>> (The real issue which prompted this suggestion: I was building sparse
>>> matrices using scipy.sparse.csc_matrix with some indices specified
>>> using range, and that construction step turned out to take a
>>> significant portion of the time because of the calls to np.array.)
>>
>> IMO: I don't see a reason why this should be supported. There is
>> np.arange after all for this use case, and np.fromiter. range and the
>> other guys are iterators, and in several cases we can use
>> larange = list(range(...)) as a shortcut to get a Python list for
>> Python 2/3 compatibility.
>>
>> I think this might be partially a learning effect in the Python 2 to 3
>> transition. After using almost only Python 3 for maybe a year, I don't
>> think it's difficult to remember the differences when writing code that
>> is py 2.7 and py 3.x compatible.
>>
>> It's just **another** thing to watch out for if milliseconds matter in
>> your application.
>>
>> Josef
>
> side question: Is there a simple way to distinguish an iterator or
> generator from an iterable data structure?
>
> Josef
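Spelling out that check with the standard-library ABCs (a quick sketch; the helper name `kind` is just illustrative):

    from collections.abc import Iterable, Iterator

    def kind(x):
        # Every Iterator is also an Iterable, so test Iterator first.
        if isinstance(x, Iterator):
            return "iterator"   # has __next__, consumed once
        if isinstance(x, Iterable):
            return "iterable"   # has __iter__ only, can be re-iterated
        return "neither"

    kind(iter(range(3)))       # 'iterator'
    kind(range(3))             # 'iterable' -- a range is not an iterator
    kind(i for i in range(3))  # 'iterator' -- generators are iterators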
Re: [Numpy-discussion] ANN: numpydoc 0.6.0 released
On Sat, Feb 13, 2016 at 10:38 PM, wrote:
>
> On Sat, Feb 13, 2016 at 10:03 AM, Ralf Gommers wrote:
>>
>> Hi all,
>>
>> I'm pleased to announce the release of numpydoc 0.6.0. The main new
>> feature is support for the Yields section in numpy-style docstrings.
>> This is described in
>> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
>>
>> Numpydoc can be installed from PyPI:
>> https://pypi.python.org/pypi/numpydoc
>
> Thanks,
>
> BTW: the status section in the howto still refers to the documentation
> editor, which has been retired AFAIK.

Thanks Josef. I sent a PR to remove that text.

Ralf
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
On Sun, Feb 14, 2016 at 3:21 AM, Antony Lee wrote:
>
> re: no reason why...
> This has nothing to do with Python 2/Python 3 [...] if Python 3's "range"
> (or Python 2's "xrange") were not a builtin type but a type provided by
> numpy, I don't think it would be controversial at all to provide an
> `__array__` special method to efficiently convert it to an ndarray.

But numpy does provide arange. What's the reason to not use np.arange and use an iterator instead?

> re: iterable vs iterator: check for the presence of the __next__ special
> method [...]

AFAIR and from spot-checking the mailing list, in the past the argument was that it's too complicated to mix array/asarray creation with fromiter building of arrays. (I have no idea if array could cheaply delegate to fromiter.)

Josef
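As a point of reference for the fromiter route mentioned above: np.fromiter consumes an iterator directly, but it needs the dtype up front (and benefits from a count so it can preallocate) -- a minimal sketch:

    import numpy as np

    n = 1000000
    # No intermediate list is built; dtype is mandatory, and count lets
    # numpy preallocate the output instead of growing it while consuming.
    a = np.fromiter(range(n), dtype=np.int64, count=n)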
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
On Sun, Feb 14, 2016 at 9:21 AM, Antony Lee wrote:
>
> re: no reason why...
> This has nothing to do with Python 2/Python 3 [...]
>
> re: iterable vs iterator: check for the presence of the __next__ special
> method (or isinstance(x, Iterator) vs. isinstance(x, Iterable) and not
> isinstance(x, Iterator))

I think it's good to do something about this, but it's not clear what the exact proposal is. I could imagine one or both of:

- special-case the range() object in array (and asarray/asanyarray?) such that array(range(N)) becomes as fast as arange(N) -- a sketch of this option follows below.
- special-case all iterators, such that array(range(N)) becomes as fast as deque(range(N))

or yet something else?

Ralf
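A minimal sketch of that first option, written outside numpy for illustration (the wrapper name array_fast is hypothetical; a real version would live inside np.array itself):

    import numpy as np

    def array_fast(obj, dtype=None):
        # Python 3 range objects expose .start/.stop/.step, so the
        # translation to arange is exact.
        if isinstance(obj, range):
            return np.arange(obj.start, obj.stop, obj.step, dtype=dtype)
        return np.array(obj, dtype=dtype)

    array_fast(range(5))  # array([0, 1, 2, 3, 4]), via the arange fast path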
Re: [Numpy-discussion] Modulus (remainder) function corner cases
2016-02-13 17:42 GMT+01:00 Charles R Harris:
>
> The Fortran modulo function, which is the same basic function as in my
> branch, does not specify any bounds on the result for floating numbers,
> but gives only the formula, modulus(a, b) = a - b*floor(a/b), which has
> the advantage of being simple and well defined ;)

In the light of the libm discussion I spent some time looking at floating-point functions and their accuracy. I would vote in favor of keeping an implementation that uses the fmod function of the system library and bends it to adhere to the Python convention (sign of the divisor). There is probably a reason why the fmod implementation is not as simple as "a - b*floor(a/b)" [1].

One obvious problem with the simple expression arises when a/b = 0.0 in floating point. E.g.

In [43]: np.__version__
Out[43]: '1.10.4'
In [44]: x = np.float64(1e-320)
In [45]: y = np.float64(-1e10)
In [46]: x % y  # this uses libm's fmod on my system
Out[46]: -10000000000.0  # == y, correctly rounded result in round-to-nearest
In [47]: x - y*np.floor(x/y)  # this here is the naive expression
Out[47]: 9.9998886718268301e-321  # == x, wrong sign

There are other problems (a/b = inf in floating point). As I do not understand the implementation of fmod (for example in openlibm) in detail, I cannot give a full list of corner cases.

Unfortunately, I did not follow the (many different) bug reports on this issue originally, and am confused why there was a need to change the implementation in the first place. numpy's "%" operator seems to work quite well on my system. Therefore, this mail may be rather unproductive, as I am missing some information.

Concerning your original question: many elementary functions lose their mathematical properties when they are calculated correctly rounded in floating-point numbers [2]. We do not fix this for other functions; I would not fix it here.

Cheers
Nils

[1] https://github.com/JuliaLang/openlibm/blob/master/src/e_fmod.c
[2] np.exp(np.nextafter(1.0, 0.0)) < np.e -> False (monotonicity lost in exp(x)).
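For reference, "bending" fmod to the Python convention (sign of the divisor) is roughly what CPython's own float modulo does -- a sketch that ignores the signed-zero and inf corner cases:

    import math

    def pymod(a, b):
        # fmod gives the result the sign of the dividend (truncated
        # division); shift by one period when the result and the divisor
        # disagree in sign.
        r = math.fmod(a, b)
        if r != 0.0 and (r < 0.0) != (b < 0.0):
            r += b
        return r

    pymod(1e-320, -1e10)  # -10000000000.0 (== y above), matching '%'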
Re: [Numpy-discussion] Modulus (remainder) function corner cases
On Sun, Feb 14, 2016 at 12:35 PM, Nils Becker wrote:
>
> One obvious problem with the simple expression arises when a/b = 0.0 in
> floating point. E.g.
>
> In [44]: x = np.float64(1e-320)
> In [45]: y = np.float64(-1e10)
> In [46]: x % y  # this uses libm's fmod on my system
> Out[46]: -10000000000.0  # == y, correctly rounded result in round-to-nearest
> In [47]: x - y*np.floor(x/y)  # this here is the naive expression
> Out[47]: 9.9998886718268301e-321  # == x, wrong sign

But more accurate ;) Currently, this is actually clipped.

In [3]: remainder(x,y)
Out[3]: -0.0

In [4]: x - y*floor(x/y)
Out[4]: 9.9998886718268301e-321

In [5]: fmod(x,y)
Out[5]: 9.9998886718268301e-321

> There are other problems (a/b = inf in floating point). As I do not
> understand the implementation of fmod (for example in openlibm) in
> detail, I cannot give a full list of corner cases.

?

Chuck
Re: [Numpy-discussion] Modulus (remainder) function corner cases
On Sun, Feb 14, 2016 at 12:54 PM, Charles R Harris wrote:
>
>> In [44]: x = np.float64(1e-320)
>> In [45]: y = np.float64(-1e10)
>> In [46]: x % y  # this uses libm's fmod on my system

I'm not too worried about denormals. However, this might be considered a bug in the floor function:

In [16]: floor(-1e-330)
Out[16]: -0.0

Chuck
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
I was thinking (1) (special-case range()); however (2) may be more generally applicable and useful.

Antony

2016-02-14 6:36 GMT-08:00 Ralf Gommers:
>
> I think it's good to do something about this, but it's not clear what
> the exact proposal is. I could imagine one or both of:
>
> - special-case the range() object in array (and asarray/asanyarray?)
>   such that array(range(N)) becomes as fast as arange(N).
> - special-case all iterators, such that array(range(N)) becomes as fast
>   as deque(range(N))
>
> or yet something else?
>
> Ralf
Re: [Numpy-discussion] Modulus (remainder) function corner cases
On Sun, Feb 14, 2016 at 1:11 PM, Charles R Harris wrote:
>
> I'm not too worried about denormals. However, this might be considered a
> bug in the floor function:
>
> In [16]: floor(-1e-330)
> Out[16]: -0.0

However, I do note that some languages offer two versions of modulus, one floor based and the other trunc based (effectively fmod). What I wanted is to keep the remainder consistent with the floor function in the C library.

Chuck
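The two conventions differ exactly when the operands have opposite signs, e.g.:

    import math

    math.fmod(-7.0, 3.0)  # -1.0 -- trunc-based: sign of the dividend
    -7.0 % 3.0            #  2.0 -- floor-based: sign of the divisor
    math.fmod(7.0, -3.0)  #  1.0
    7.0 % -3.0            # -2.0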
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
On Sun, Feb 14, 2016 at 7:36 AM, Ralf Gommers wrote:
>
> I think it's good to do something about this, but it's not clear what
> the exact proposal is. I could imagine one or both of:
>
> - special-case the range() object in array (and asarray/asanyarray?)
>   such that array(range(N)) becomes as fast as arange(N).
> - special-case all iterators, such that array(range(N)) becomes as fast
>   as deque(range(N))

I think the last wouldn't help much, as numpy would still need to determine dimensions and type. I assume that is one of the reasons sparse itself doesn't do that.

Chuck
[Numpy-discussion] Numexpr-3.0 proposal
Hello everyone,

I've done some work on making a new version of Numexpr that would fix some of the limitations of the original virtual machine with regards to data types and operation/function count. Basically I re-wrote the Python and C sides to use 4-byte words, instead of null-terminated strings, for operations and passing types. This means the number of operations and types isn't significantly limited anymore.

Francesc Alted suggested I should come here and get some advice from the community. I wrote a short proposal on the Wiki here:

https://github.com/pydata/numexpr/wiki/Numexpr-3.0-Branch-Overview

One can see my branch here:

https://github.com/robbmcleod/numexpr/tree/numexpr-3.0

If anyone has any comments they'd be welcome. Questions from my side for the group:

1.) Numpy casting: I downloaded the Numpy source and after browsing it seems the best approach is probably to just use numpy.core.numerictypes.find_common_type?

2.) Can anyone foresee any issues with casting built-in Python types (i.e. float and integer) to their OS-dependent numpy equivalents? Numpy already seems to do this.

3.) Is anyone enabling the Intel VML library? There are a number of comments in the code that suggest it's not accelerating the code. It also seems to cause problems with bundling numexpr with cx_freeze.

4.) I took a stab at converting from distutils to setuptools but this seems challenging with numpy as a dependency. I wonder if anyone has tried monkey-patching so that setup.py build_ext uses distutils and then passes the interpreter.pyd/so as a data file, or some other such chicanery?

(I was going to ask about attaching a debugger, but I just noticed: https://wiki.python.org/moin/DebuggingWithGdb)

Ciao,

Robert

--
Robert McLeod, Ph.D.
Center for Cellular Imaging and Nano Analytics (C-CINA)
Biozentrum der Universität Basel
Mattenstrasse 26, 4058 Basel
Work: +41.061.387.3225
robert.mcl...@unibas.ch
robert.mcl...@bsse.ethz.ch
robbmcl...@gmail.com
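Regarding question 1: find_common_type takes array types and scalar types as separate arguments, so scalars do not needlessly upcast arrays -- a quick illustration (behavior as I understand it from the 1.x docs; worth verifying on the numpy version you target):

    import numpy as np

    # int64 and float32 arrays promote to float64:
    np.find_common_type([np.int64, np.float32], [])  # dtype('float64')

    # a Python float scalar mixed with a float32 array stays float32,
    # because the scalar's kind does not exceed the array's kind:
    np.find_common_type([np.float32], [float])       # dtype('float32')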
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
On Sun, Feb 14, 2016 at 10:36 PM, Charles R Harris wrote:
>
> I think the last wouldn't help much, as numpy would still need to
> determine dimensions and type. I assume that is one of the reasons
> sparse itself doesn't do that.

Not orders of magnitude, but this shows that there's something to optimize for iterators:

In [1]: %timeit np.array(range(100000))
100 loops, best of 3: 14.9 ms per loop

In [2]: %timeit np.array(list(range(100000)))
100 loops, best of 3: 9.68 ms per loop

Ralf
Re: [Numpy-discussion] Numexpr-3.0 proposal
On Sun, Feb 14, 2016 at 11:19 PM, Robert McLeod wrote:
>
> 4.) I took a stab at converting from distutils to setuptools but this
> seems challenging with numpy as a dependency. I wonder if anyone has
> tried monkey-patching so that setup.py build_ext uses distutils and then
> passes the interpreter.pyd/so as a data file, or some other such
> chicanery?

Not sure what you mean, since numexpr already uses setuptools: https://github.com/pydata/numexpr/blob/master/setup.py#L22. What is the real goal you're trying to achieve?

This monkeypatching is a bad idea: https://github.com/robbmcleod/numexpr/blob/numexpr-3.0/setup.py#L19. Both setuptools and numpy.distutils already do that, and that's already one too many, so you definitely don't want to add a third place.

You can use the -j (--parallel) flag to numpy.distutils instead, see http://docs.scipy.org/doc/numpy-dev/user/building.html#parallel-builds

Ralf
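For example, per that doc, a parallel build with a numpy.distutils-based setup.py looks like:

    python setup.py build -j 4 install --prefix $HOME/.local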
Re: [Numpy-discussion] Suggestion: special-case np.array(range(...)) to be faster
I wonder whether numpy is using the "old" iteration protocol (repeatedly calling x[i] for increasing i until IndexError is raised)? A quick timing shows that it is indeed slower.

... actually it's not even clear to me what qualifies as a sequence for `np.array`:

    class C:
        def __iter__(self):
            return iter(range(10))  # [0, ..., 9] under the new iteration protocol
        def __getitem__(self, i):
            raise IndexError  # [] under the old iteration protocol

np.array(C())
===> array(<__main__.C object at 0x7f3f2128>, dtype=object)

So how can np.array(range(...)) even work?

2016-02-14 22:21 GMT-08:00 Ralf Gommers:
>
> Not orders of magnitude, but this shows that there's something to
> optimize for iterators:
>
> In [1]: %timeit np.array(range(100000))
> 100 loops, best of 3: 14.9 ms per loop
>
> In [2]: %timeit np.array(list(range(100000)))
> 100 loops, best of 3: 9.68 ms per loop
>
> Ralf
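For what it's worth, the old protocol is easy to see in isolation with a __getitem__-only class (a sketch; OldStyle is just an illustrative name):

    class OldStyle:
        def __getitem__(self, i):
            if i >= 3:
                raise IndexError  # terminates the legacy iteration
            return i * 10

    # iter() falls back to the sequence protocol when __iter__ is absent:
    list(OldStyle())  # [0, 10, 20]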