[Numpy-discussion] int32 twice in sctypes?
Hi,

I was a bit confused by this on 32-bit Linux:

In [30]: sctypes['int']
Out[30]: [, , , , ]

Is it easy to explain the two entries for int32 here? I notice there is only one int32 entry for the same test on my 64-bit system.

Thanks a lot,
Matthew

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
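A likely explanation (my assumption, not stated in the thread) is that on 32-bit Linux both C `int` and C `long` are 32 bits wide, so two distinct C-derived scalar types land in the int32 slot; on 64-bit Linux C `long` is 64 bits and the duplicate disappears. A quick sketch of the underlying widths:

```python
import numpy as np

# On a 32-bit platform both C int and C long are 32 bits, so the
# int-based and long-based scalar types can both map to int32.
# C int is 4 bytes on all mainstream platforms:
print(np.dtype(np.intc).itemsize)      # 4
# C long long is 8 bytes everywhere:
print(np.dtype(np.longlong).itemsize)  # 8
```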
[Numpy-discussion] dtype attributes of scalar types
Hi,

Would it be easy and / or sensible for - say - int32.itemsize to return the same as dtype(int32).itemsize, rather than the attribute descriptor that it returns at the moment?

Best,
Matthew
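As a workaround in the meantime (a sketch of behavior, not a proposed API), going through the dtype gives the expected numbers:

```python
import numpy as np

# itemsize on the scalar *class* is just a class attribute descriptor;
# the dtype object carries the real per-item metadata.
print(np.dtype(np.int32).itemsize)    # 4
print(np.dtype(np.float64).itemsize)  # 8
```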
[Numpy-discussion] atol in allclose
Hi,

Sorry to keep cluttering the list, but I was a bit surprised by this behavior of allclose:

In [25]: allclose([1.0], [1.0], rtol=0)
Out[25]: True

In [26]: allclose([1.0], [1.0], rtol=0, atol=0)
Out[26]: False

The docstring seems to imply that atol will not be used in this comparison - or did I misunderstand?

Thanks,
Matthew
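For reference, the docstring's comparison rule can be sketched directly; under that rule rtol=0, atol=0 reduces to exact equality, so [1.0] vs [1.0] should pass (the False above looks like a bug in the version under discussion, and current numpy returns True):

```python
import numpy as np

def allclose_rule(a, b, rtol=1.e-5, atol=1.e-8):
    # The documented elementwise test: |a - b| <= atol + rtol * |b|
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return bool(np.all(np.abs(a - b) <= atol + rtol * np.abs(b)))

print(allclose_rule([1.0], [1.0], rtol=0, atol=0))         # True: exact match
print(allclose_rule([1.0], [1.0 + 1e-6], rtol=0, atol=0))  # False
```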
Re: [Numpy-discussion] Latest Array-Interface PEP
Hi,

> > wxPython -- Robin Dunn
> > PIL -- Fredrik Lundh
> > PyOpenGL -- Who?
> > PyObjC -- would it be useful there? (Ronald Oussoren)
> > MatplotLib (but maybe it's already married to numpy...)
> > PyGtk ?
> >
> It's a good start, but there is also
> PyMedia, PyVoxel, any video-library interface writers, any
> audio-library interface writers.

Is there already, or could there be, some sort of consortium of these that agree on the features in the PEP?

Best,
Matthew
[Numpy-discussion] Fwd: Numpy x Matlab: some synthetic benchmarks
Hi,

I don't know if people remember this thread, but my memory was that there might be some interest in including numpy / matlab benchmark code somewhere in the distribution, to check on performance and allow matlab people to do a direct comparison. Is this still of interest? If so, what should be the next step?

Thanks a lot,
Matthew

-- Forwarded message --
From: Paulo Jose da Silva e Silva <[EMAIL PROTECTED]>
Date: Jan 18, 2006 11:44 AM
Subject: [Numpy-discussion] Numpy x Matlab: some synthetic benchmarks
To: numpy-discussion

Hello,

Travis asked me to benchmark numpy versus matlab in some basic linear algebra operations. Here are the results for matrices/vectors of dimensions 5, 50 and 500:

Operation    x'*y    x*y'    A*x     A*B      A'*x    Half    2in2

Dimension 5
Array        0.94    0.7     0.22    0.28     1.12    0.98    1.1
Matrix       7.06    1.57    0.66    0.79     1.6     3.11    4.56
Matlab       1.88    0.44    0.41    0.35     0.37    1.2     0.98

Dimension 50
Array        9.74    3.09    0.56    18.12    13.93   4.2     4.33
Matrix       81.99   3.81    1.04    19.13    14.58   6.3     7.88
Matlab       16.98   1.94    1.07    17.86    0.73    1.57    1.77

Dimension 500
Array        1.2     8.97    2.03    166.59   20.34   3.99    4.31
Matrix       17.95   9.09    2.07    166.62   20.67   4.11    4.45
Matlab       2.09    6.07    2.17    169.45   2.1     2.56    3.06

Obs: The operation Half is actually A*x using only the lower half of the matrix and vector. The operation 2in2 is A*x using only the even indexes. Of course there are many repetitions of the same operation: 10 for dim 5 and 50 and 1000 for dim 500. For the inner product the number of repetitions is multiplied by the dimension (it is very fast).

The software is numpy svn version 1926, Matlab 6.5.0.180913a Release 13 (Jun 18 2002). Both are using the *same* BLAS and LAPACK (ATLAS for sse).

As you can see, numpy array looks very competitive. The matrix class in numpy has too much overhead for small dimension though. This overhead is very small for medium size arrays.
Looking at the results above (especially the small dimension ones; for higher dimensions the main computations are being performed by the same BLAS) I believe we can say:

1) Numpy array is faster on usual operations except outer product (I believe the reason is that the dot function uses the regular matrix multiplication to compute outer products, instead of using a special function. This can "easily" be changed). In particular numpy was faster in matrix times vector operations, which is the most usual in numerical linear algebra.

2) Any operation that involves a transpose suffers a very big penalty in numpy. Compare A'*x and A*x: it is 10 times slower. In contrast Matlab deals with transpose quite well. Travis is already aware of this and it can probably be solved.

3) When using subarrays, numpy is slower. The difference seems acceptable. Travis, can this be improved?

Best,
Paulo

Obs: Later on (in a couple of days) I may present less synthetic benchmarks (a QR factorization and a Modified Cholesky).
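A sketch of how the A*x vs A'*x comparison from point 2 could be re-run today (timing numbers will vary with BLAS and hardware; this only reproduces the shape of the experiment, not the 2006 results):

```python
import timeit
import numpy as np

# Same sizes as the largest benchmark above: 500x500 matrix, length-500 vector.
n = 500
A = np.random.random((n, n))
x = np.random.random(n)

# Time matrix-vector products with and without a transpose.
t_ax = timeit.timeit(lambda: A.dot(x), number=200)
t_atx = timeit.timeit(lambda: A.T.dot(x), number=200)
print("A*x:  %.4f s" % t_ax)
print("A'*x: %.4f s" % t_atx)
```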
Re: [Numpy-discussion] [Matplotlib-users] [matplotlib-devel] Unifying numpy, scipy, and matplotlib docstring formats
Hi,

> import plab
>
> plab.plot() # etc.
>
> and interactive use could do from plab import *.

Yes... It's a hard call of course. I am a long-term matlab user, and switched to python relatively recently. I do see the attraction of persuading people that you can get something very similar to matlab easily. The downside about making numpy / python like matlab is that you soon realize that you really have to think about your problems differently, and write code in a different way. I know that's obvious, but the variables as pointers, mutable / immutable types, zero-based indexing, arrays vs matrices are all (fruitful) stumbling blocks. Then there is the very large change of thinking in an OO way, pulling in other large packages for doing other tasks, writing well-structured code with tests - all the features that python gives you for an industrial-strength code base. And the more pylab looks like matlab, the more surprised and confused people will be when they switch. So I would argue that getting as close to matlab as possible should not be the unqualified goal here - it is a real change, with real pain, but great benefits.

Best,
Matthew
Re: [Numpy-discussion] in place random generation
> > My problem is not space, but time.
> > I am creating a small array over and over,
> > and this is turning out to be a bottleneck.

How about making one large random number array and taking small views?

Matthew
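A minimal sketch of the suggestion (the sizes here are made up for illustration):

```python
import numpy as np

# Allocate one big block of random numbers up front...
block = np.random.random((10000, 5))  # 10000 iterations of a length-5 array

total = 0.0
for i in range(10000):
    # ...and take row views: no per-iteration allocation or copy.
    row = block[i]
    total += row.sum()
```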
Re: [Numpy-discussion] in place random generation
Oh dear, sorry, I should have read your email more carefully.

Matthew

On 3/8/07, Daniel Mahler <[EMAIL PROTECTED]> wrote:
> On 3/8/07, Matthew Brett <[EMAIL PROTECTED]> wrote:
> > > My problem is not space, but time.
> > > I am creating a small array over and over,
> > > and this is turning out to be a bottleneck.
> >
> > How about making one large random number array and taking small views?
>
> How is that different from:
>
> "Allocating all the arrays as one n+1 dim and grabbing rows from it
> is faster than allocating the small arrays individually.
> I am iterating too many times to allocate everything at once though.
> I can just do a nested loop where I create manageably large arrays in
> the outer and grab the rows in the inner, but I wanted something
> cleaner."
>
> later in the post
>
> Daniel
Re: [Numpy-discussion] matlab vs. python question
Hi,

On 4/25/07, Neal Becker <[EMAIL PROTECTED]> wrote:
> I'm interested in this comparison (not in starting yet another flame fest).
> I actually know nothing about matlab, but almost all my peers use it. One
> of the things I recall reading on this subject is that matlab doesn't
> support OO style programming. I happened to look on the matlab vendor's
> website, and found that it does have classes. OTOH, I've seen at least
> some matlab code, and never saw anyone use these features.

Actually, I think I know why you've never seen anyone use matlab objects: they are implemented in a rather strange way that makes them painful to use. I know because I've used them a lot myself, and I still have the scars.

Matthew
Re: [Numpy-discussion] matlab vs. python question
Well - these threads always go on for a long time, but... I've used matlab heavily for 10 years. I found that I had to use perl and C fairly heavily to get things done that matlab could not do well. Now that I've switched to numpy, scipy, matplotlib, there is really nothing I miss in matlab. We would not attempt what we are doing now: http://neuroimaging.scipy.org/ in matlab - it's just not the right tool for a large-scale programming effort. I agree that matlab has many attractions as a teaching tool and for small numeric processing scripts, but if you are writing a large to medium-sized application, I really don't think there is any comparison...

Matthew
Re: [Numpy-discussion] matlab vs. python question
Hi,

> However, I would disagree that Python with all its tools is going to
> replace Matlab well for everything. For large projects, for advanced
> programmers and for non-standard things such as complex database
> handling (in my case) it is definitely a clear winner. However, I would
> be wary of getting Matlab completely out of the picture because I
> find it is still a much better tool for algorithm testing, scripting
> and other "use for a week" functions.

Thanks - that seems very fair. I think you're right that the level of help for matlab is far superior, and easier to get to, than numpy / python. Can you say anything more about the features of matlab that you miss for the 'use for a week' functions? It might help guide our thoughts on potential improvements...

Thanks a lot,
Matthew
Re: [Numpy-discussion] numpy endian question
On 4/27/07, Christopher Barker <[EMAIL PROTECTED]> wrote:
> Is there really no single method to call on an ndarray that asks: "what
> endian are you"?
>
> I know not every two-liner should be made into a convenience method, but
> this seems like a good candidate to me.

+1

I came across this source of minor confusion and excess code lines when writing the matlab io module for scipy. dtype.isbigendian?

Matthew
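For the record, the dtype does expose this, if less conveniently than a single isbigendian attribute would; a sketch of what is available:

```python
import numpy as np

a = np.arange(3, dtype=np.float64)

# dtype.byteorder is '=' (native), '<' (little), '>' (big) or '|' (n/a);
# dtype.isnative collapses that to a single bool.
print(a.dtype.byteorder)  # '='
print(a.dtype.isnative)   # True for a freshly created array

swapped = a.dtype.newbyteorder()  # same type, opposite byte order
print(swapped.isnative)           # False
```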
Re: [Numpy-discussion] [SciPy-user] Getting started wiki page
> I would very much like to link the Getting Started wiki page (
> http://scipy.org/Getting_Started ) to the front page. But I am not sure
> it is of good enough quality so far. Could people please have a look and
> make comments, or edit the page.

Thank you for doing this. It's pitched very well.

Matthew
[Numpy-discussion] Median / mean functionality confusing?
Hi,

Does anyone else find this unexpected?

In [93]: import numpy as N
In [94]: a = N.arange(10).reshape(5,2)
In [95]: N.mean(a)
Out[95]: 4.5
In [96]: N.median(a)
Out[96]: array([4, 5])

i.e. shouldn't median have the same axis, dtype, default axis=None behavior as mean?

Best,
Matthew
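For what it's worth, current numpy versions adopted exactly this convention; a quick check of the behavior asked for:

```python
import numpy as np

a = np.arange(10).reshape(5, 2)

# median now matches mean: the default axis=None flattens first.
print(np.mean(a))            # 4.5
print(np.median(a))          # 4.5
print(np.median(a, axis=0))  # per-column medians: [4. 5.]
```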
Re: [Numpy-discussion] Median / mean functionality confusing?
> Regarding dtype, I disagree. Why do you want to force the result to be a
> float?

Fair comment - I really meant the axis, and axis=None difference.

Matthew
Re: [Numpy-discussion] Median / mean functionality confusing?
Can I resurrect this thread then by agreeing with Chris, and my original post, that it would be better if median had the same behavior as mean, accepting axis and dtype as inputs?

Best,
Matthew

On 5/24/07, Christopher Barker <[EMAIL PROTECTED]> wrote:
> Sven Schreiber wrote:
> >> (Zar, Jerrold H. 1984. Biostatistical Analysis. Prentice Hall.)
> >
> > Is that the seminal work on the topic ;-)
>
> Of course not, just a reference I have handy -- though I suppose there
> are any number of them on the web too.
>
> >> Of course, the median of an odd number of integers would be an integer.
> >
> > that's why I asked about _forcing_ to a float
>
> To complete the discussion:
>
> >>> a = N.arange(4)
> >>> type(N.median(a))
>
> >>> a = N.arange(4)
> >>> N.median(a)
> 1.5
> >>> type(N.median(a))
>
> >>> a = N.arange(5)
> >>> N.median(a)
> 2
> >>> type(N.median(a))
>
> So median converts to a float if it needs to, and keeps it an integer
> otherwise, which seems reasonable to me, though it would be nice to
> specify a dtype, so that you can make sure you always get a float if you
> want one.
>
> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R (206) 526-6959 voice
> 7600 Sand Point Way NE (206) 526-6329 fax
> Seattle, WA 98115 (206) 526-6317 main reception
>
> [EMAIL PROTECTED]
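For reference, current numpy settled the dtype half of this question by always returning a floating result for integer input, even for odd-length arrays (so only the question of an explicit dtype argument remains); a quick check:

```python
import numpy as np

# Even-length input: median is the mean of the two middle values.
print(np.median(np.arange(4)))  # 1.5

# Odd-length integer input: modern numpy still returns a float scalar.
m = np.median(np.arange(5))
print(m, m.dtype)  # 2.0 float64
```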
Re: [Numpy-discussion] byteswap() leaves dtype unchanged
Hi,

> > This doesn't make any sense. The numbers have changed but the dtype
> > is now incorrect. If you byteswap and correct the dtype the numbers
> > have still changed, but you now can actually use the object.
>
> By "numbers have still changed" I mean the underlying byte order is
> still different, but you can now use the object for mathematical
> operations. My point is that the metadata for the object should be
> correct after using its built-in methods.

I think the point is that you can have several different situations with byte ordering:

1) Your data and dtype endianness match, but you want the data swapped and the dtype to reflect this

2) Your data and dtype endianness don't match, and you want to swap the data so that they match the dtype

3) Your data and dtype endianness don't match, and you want to change the dtype so that it matches the data.

I guess situation 1 is the one you have, and the way to deal with this is, as you say, something like:

other_order_arr = one_order_arr.byteswap()
other_order_arr.dtype = other_order_arr.dtype.newbyteorder()

(note that newbyteorder returns a new dtype rather than modifying in place, so the result needs to be assigned). The byteswap method is obviously aimed at situation 2 (the data and dtype don't match in endianness, and you want to swap the data). I can see your point, I think, that situation 1 seems to be the more common and obvious, and coming at it from outside, you would have thought that a.byteswap would change both.

Best,
Matthew
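The three situations can be sketched concretely (a small illustration, not from the original thread):

```python
import numpy as np

x = np.array([1, 2, 3], dtype='<i4')  # data and dtype agree (little-endian)

# Situation 1: swap the bytes *and* relabel the dtype -- the logical
# values are unchanged, only the storage order flips.
y = x.byteswap().view(x.dtype.newbyteorder())
print(y.tolist())  # [1, 2, 3]

# byteswap() alone is situation 2: seen through the unchanged '<i4'
# dtype, the values really do change.
print(x.byteswap().tolist())  # [16777216, 33554432, 50331648]

# Situation 3: relabel only, leaving the bytes alone.
z = x.view(x.dtype.newbyteorder())
print(z.tolist())  # [16777216, 33554432, 50331648]
```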
Re: [Numpy-discussion] byteswap() leaves dtype unchanged
Hi,

> > I can see your point I think, that situation 1 seems to be the more
> > common and obvious, and coming at it from outside, you would have
> > thought that a.byteswap would change both.
>
> I think the reason that byteswap behaves the way it does is that for
> situation 1 you often don't actually need to do anything. Just
> calculate with the things (it'll be a bit slow); as soon as the first
> copy gets made you're back to native byte order. So for those times
> you need to do it in place it's not too much trouble to byteswap and
> adjust the byte order in the dtype (you'd need to inspect the byte
> order in the first place to know it was byteswapped...)

Thanks - good point. How about the following suggestion for the next release:

- rename byteswap to something like byteswapbuffer
- deprecate byteswap in favor of byteswapbuffer
- update the docstrings to make the distinction between situations clearer

I think that would reduce the clear element of surprise here.

Best,
Matthew
Re: [Numpy-discussion] build advice
Hi,

> That would get them all built as a cohesive set. Then I'd repeat the
> installs without PYTHONPATH:

Is that any different from:

cd ~/src
cd numpy
python setup.py build
cd ../scipy
python setup.py build
...
cd ../numpy
python setup.py install
cd ../scipy
python setup.py install

? Just wondering - I don't know distutils well.

Matthew
Re: [Numpy-discussion] build advice
Ah, yes, I was typing too fast, thinking too little.

On 5/31/07, John Hunter <[EMAIL PROTECTED]> wrote:
> On 5/31/07, Matthew Brett <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > > That would get them all built as a cohesive set. Then I'd repeat the
> > > installs without PYTHONPATH:
> >
> > Is that any different from:
> >
> > cd ~/src
> > cd numpy
> > python setup.py build
> > cd ../scipy
> > python setup.py build
>
> Well, the scipy and mpl builds need to see the new numpy build, I
> think that is the issue.
>
> JDH
Re: [Numpy-discussion] numpy arrays, data allocation and SIMD alignement
Hi,

> Following an ongoing discussion with S. Johnson, one of the developers
> of fftw3, I would be interested in what people think about adding
> infrastructure in numpy related to SIMD alignment (that is, 16-byte
> alignment for SSE/ALTIVEC; I don't know anything about other archs).
> The problem is that right now, it is difficult to get information on
> alignment in numpy (by alignment here, I mean something different from
> what is normally meant in the numpy context: whereas, in my
> understanding, NPY_ALIGNED refers to a pointer which is aligned wrt its
> type, here I am talking about arbitrary alignment).

Excellent idea if practical...

Matthew
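The distinction in the quoted message can be illustrated with a small sketch (an illustrative helper, not numpy API); it inspects the alignment of an existing array, it does not force allocation on a boundary:

```python
import numpy as np

def is_aligned(a, boundary=16):
    # Check whether the start of the data buffer sits on the given
    # boundary -- the "arbitrary alignment" sense discussed above,
    # not numpy's own NPY_ALIGNED (type-size) sense.
    return a.ctypes.data % boundary == 0

a = np.zeros(1024)
# numpy makes no 16-byte promise, so this may print either way:
print(is_aligned(a))
# Type alignment (the NPY_ALIGNED sense) does hold here:
print(is_aligned(a, boundary=a.itemsize))  # True
```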
Re: [Numpy-discussion] .transpose() of memmap array fails to close()
Hi,

Thanks for looking into this because we (neuroimaging.scipy.org) use mmaps a lot. I am away from my desk at the moment but please do keep us all informed, and we'll try and pitch in if we can...

Matthew

On 8/15/07, Glen W. Mabey <[EMAIL PROTECTED]> wrote:
> On Tue, Aug 14, 2007 at 12:23:26AM -0400, Anne Archibald wrote:
> > On 13/08/07, Glen W. Mabey <[EMAIL PROTECTED]> wrote:
> >
> > > As I have tried to think through what should be the appropriate
> > > behavior for the returned value of __getitem__, I have not been able to
> > > see an appropriate solution (let alone know how to implement it) to this
> > > issue.
> >
> > Is the problem one of finalization? That is, making sure the memory
> > map gets (flushed and) closed exactly once? In this case the
> > numpythonic solution is to have only the original mmap object do any
> > finalization; any slices contain a reference to it anyway, so they
> > cannot be kept after it is collected. If the problem is that you want
> > to do an explicit close/flush on a slice object, you could just always
> > apply the close/flush to the base object of the slice if it has one or
> > the slice itself if it doesn't.
>
> The immediate problem is that when a numpy.memmap instance is created as
> another view of the original array, then __del__ on that new view fails.
>
> flush()ing and closing aren't an issue for me, but they can't be
> performed at all on derived views right now. It seems to me that any
> derived view ought to be able to flush(), and ideally in my mind,
> close() would be called [automatically] only just before the reference
> count gets decremented to zero.
>
> That doesn't seem to match the numpythonic philosophy you described,
> Anne, but seems logical to me, while still allowing for both manual
> flush() and close() operations.
>
> Thanks for your response.
> Glen
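The view relationship under discussion can be sketched with a throwaway file (file name and sizes here are made up); views keep a reference to the parent memmap, so the mapping stays alive while any slice is in use, and flushing the parent covers writes made through views:

```python
import os
import tempfile
import numpy as np

# Create a small memmap backed by a scratch file.
fname = os.path.join(tempfile.mkdtemp(), 'demo.dat')
m = np.memmap(fname, dtype='float64', mode='w+', shape=(10,))

s = m[2:5]   # a view onto the same mapped buffer
s[:] = 1.0
m.flush()    # flushing the parent covers data written via the view

print(s.base is m)
print(m[:6].tolist())
```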
[Numpy-discussion] Error in allclose with inf values
Hi,

I noticed that allclose does not always behave correctly for arrays with infs. I've attached a test script for allclose, and here's an alternative implementation that I believe behaves correctly. Obviously the test script could be a test case in core/tests/test_numeric.py. I wonder if we should allow nans in the test arrays - possibly with an optional keyword arg like allownan. After all inf-inf is nan - but we allow that in the test.

Best,
Matthew

from numpy.core import *

def allclose(a, b, rtol=1.e-5, atol=1.e-8):
    """Returns True if all components of a and b are equal
    subject to given tolerances.

    The relative error rtol must be positive and << 1.0
    The absolute error atol usually comes into play for those
    elements of b that are very small or zero; it says how small
    a must be also.
    """
    x = array(a, copy=False)
    y = array(b, copy=False)
    xinf = isinf(x)
    if not all(xinf == isinf(y)):
        return False
    if not any(xinf):
        return all(less_equal(absolute(x - y), atol + rtol * absolute(y)))
    if not all(x[xinf] == y[xinf]):
        return False
    x = x[~xinf]
    y = y[~xinf]
    return all(less_equal(absolute(x - y), atol + rtol * absolute(y)))


from numpy.core import *
from numpy.testing import assert_array_equal

rtol = 1.e-5
atol = 1.e-8

x = [1, 0]
y = [1, 0]
assert allclose(x, y)

x = [inf, 0]
assert not allclose(x, y)

y = [1, inf]
# Currently raises AssertionError
assert not allclose(x, y)

x = [inf, inf]
# Currently raises AssertionError
assert not allclose(x, y)

y = [1, 0]
# Currently raises AttributeError
assert not allclose(x, y)

x = [-inf, 0]
y = [inf, 0]
assert not allclose(x, y)

x = [nan, 0]
y = [nan, 0]
# At least worth a comment in the docstring, possibly override
# with optional kwarg, such as allownan=True
assert not allclose(x, y)

x = [atol]
y = [0]
assert allclose(x, y)

x = [atol * 2]
assert not allclose(x, y)

x = [1]
y = [1 + rtol + atol]
assert allclose(x, y)

y = [1 + rtol + atol * 2]
assert not allclose(x, y)

x = array([100, 1000])
y = x + x * rtol
assert allclose(x, y)
assert allclose(y, x)

# The following is somewhat surprising at first blush.
# It is caused by use of y rather than x for scaling of rtol.
y = y + atol * 2
assert allclose(x, y)
assert not allclose(y, x)

# Multiple dimensions
x = arange(125).reshape((5, 5, 5))
y = x + x * rtol
assert allclose(y, x)

# Adding atol puts the test very close to rounding error
y = y + atol * 2
assert not allclose(y, x)

# Test non-modification of input arrays
x = array([inf, 1])
y = array([0, inf])
assert not allclose(x, y)
assert_array_equal(x, array([inf, 1]))
assert_array_equal(y, array([0, inf]))
Re: [Numpy-discussion] Error in allclose with inf values
Hi again,

> I noticed that allclose does not always behave correctly for arrays
> with infs.

Sorry, perhaps I should have been more specific; this is the behavior of allclose that I was referring to (documented in the tests I attached):

In [6]: N.allclose([N.inf, 1, 2], [10, 10, N.inf])
Out[6]: array([ True], dtype=bool)

In [7]: N.allclose([N.inf, N.inf, N.inf], [10, 10, N.inf])
Warning: invalid value encountered in subtract
Out[7]: True

In [9]: N.allclose([N.inf, N.inf], [10, 10])
---------------------------------------------------------------------------
exceptions.AttributeError                 Traceback (most recent call last)
/home/mb312/
/home/mb312/lib/python2.4/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol)
    843         d3 = (x[xinf] == y[yinf])
    844         d4 = (~xinf & ~yinf)
--> 845         if d3.size < 2:
    846             if d3.size == 0:
    847                 return False
AttributeError: 'bool' object has no attribute 'size'

Matthew
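For reference, current numpy handles these cases along the lines proposed in the thread: infs compare equal only to infs of the same sign in the same positions, and nan handling is controlled by an explicit keyword (named equal_nan rather than the allownan suggested above). A quick check:

```python
import numpy as np

print(np.allclose([np.inf, 1.0], [np.inf, 1.0]))  # True: matching infs
print(np.allclose([np.inf, 1.0], [10.0, 1.0]))    # False: inf vs finite
print(np.allclose([np.inf], [-np.inf]))           # False: sign matters

# nans are unequal by default; equal_nan=True opts in.
print(np.allclose([np.nan], [np.nan]))                  # False
print(np.allclose([np.nan], [np.nan], equal_nan=True))  # True
```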
Re: [Numpy-discussion] Bitting the bullet: using scons to build extensions inside distutils ?
Hi,

> Starting thinking over the whole distutils thing, I was thinking
> what people would think about using scons inside distutils to build
> extension.

In general this seems like an excellent idea. If we can contribute what we need to scons, that would greatly ease the burden of maintenance, and benefit both projects. The key problem will be support. At the moment Pearu maintains and owns numpy.distutils. Will we have the same level of commitment and support for this alternative, do you think? How easy would it be to throw up a prototype for the rest of us to look at and get a feel for what the benefits would be?

Matthew
Re: [Numpy-discussion] adopting Python Style Guide for classes
On 10/2/07, Christopher Barker <[EMAIL PROTECTED]> wrote:
> Jarrod Millman wrote:
> > I am hoping that most of you agree with the general principle of
> > bringing NumPy and SciPy into compliance with the standard naming
> > conventions.

Excellent plan - and I think it will make the code considerably more readable (and writeable).

> > 3. When we release NumPy 1.1, we will convert all (or almost all)
> > class names to CapWords.
>
> What's the backwards-compatible plan?
>
> - keep the old names as aliases?
> - raise deprecation warnings?

Both seem good. How about implementing both for the next minor release, with the ability to turn the deprecation warnings off?

> What about factory functions that kind of look like they might be
> classes -- numpy.array() comes to mind. Though maybe using CamelCase for
> the real classes will help folks understand the difference.

Sounds right to me - factory function as function, class as class.

> What is a "class" in this case -- with new-style classes, there is no
> distinction between types and classes, so I guess they are all classes,
> which means lots of things like:
>
> numpy.float32
>
> etc. etc. etc. are classes. Should they be CamelCase too?

I would vote for CamelCase in this case too.

Matthew
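The "alias plus deprecation warning" combination can be sketched in a few lines (the class and function names here are hypothetical, not proposed numpy names):

```python
import warnings

class ImageArray:
    """Stand-in for a class renamed to the CapWords convention."""

def make_deprecated_alias(new_cls, old_name):
    # Build a subclass that warns whenever the old name is instantiated.
    class _Alias(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                "%s is deprecated, use %s instead" % (old_name, new_cls.__name__),
                DeprecationWarning,
                stacklevel=2,
            )
            super().__init__(*args, **kwargs)
    _Alias.__name__ = old_name
    return _Alias

imagearray = make_deprecated_alias(ImageArray, "imagearray")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    obj = imagearray()  # old name still works, but warns

print(len(caught))  # 1
```

Turning the warnings off then needs nothing new: the standard warnings filter (`warnings.simplefilter("ignore", DeprecationWarning)`) already covers it.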
Re: [Numpy-discussion] Memory leak in ndarray
Hi,

> I seem to have tracked down a memory leak in the string conversion
> mechanism of numpy. It is demonstrated using the following code:
>
> import numpy as np
>
> a = np.array([1.0, 2.0, 3.0])
> while True:
>     b = str(a)

Would you not expect python rather than numpy to be dealing with the memory here though?

Matthew
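One way to probe a suspected leak like this from pure Python (an illustrative sketch, not how the original report was made) is to run the suspect operation many times and compare memory totals with tracemalloc:

```python
import tracemalloc
import numpy as np

def bytes_grown_by(repeats=10000):
    # Rough leak probe: run the suspect operation repeatedly and
    # report how much traced allocation survives the loop.
    a = np.array([1.0, 2.0, 3.0])
    str(a)  # warm up any one-time caches first
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        str(a)
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

# A genuine per-call leak would grow roughly linearly with repeats;
# a healthy implementation stays near zero.
print(bytes_grown_by())
```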
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi again (sorry),

On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire wrote:
> On the broader topic of recruitment... sure, cython has a lower barrier
> to entry than C++. But there are many, many more C++ developers and
> resources out there than cython resources. And it likely will stay
> that way for quite some time.

On the other hand, in the current development community around numpy, and among the subscribers to this mailing list, I suspect there is more Cython experience than C++ experience. Of course it might be that so-far undiscovered C++ developers are drawn to a C++ rewrite of Numpy. But is that really likely? I can see a C++ developer being drawn to a C++ performance library they would use in their C++ applications, but it's harder for me to imagine a C++ programmer being drawn to a Python library because the internals are C++.

Best,
Matthew
Re: [Numpy-discussion] The end of numpy as we know it ?
Hi,

On Sat, Feb 18, 2012 at 9:06 AM, Dag Sverre Seljebotn wrote:
> On 02/18/2012 08:52 AM, Benjamin Root wrote:
>>
>> On Saturday, February 18, 2012, Sturla Molden wrote:
>>
>> Den 18. feb. 2012 kl. 17:12 skrev Alan G Isaac:
>>
>> >
>> > How does "stream-lined" code written for maintainability
>> > (i.e., with helpful comments and tests) become *less*
>> > accessible to amateurs??
>>
>> I think you missed the irony.
>>
>> Sturla
>>
>> Took me a couple of reads. Must be too early in the morning for me.
>>
>> For those who need a clue, the last few lines seem to suggest that the
>> only way forward is to relicense numpy so that it could be sold. This
>> is obviously ridiculous and a give-away to the fact that everything else
>> in the email was sarcastic.
>
> No, it was a quotation from Travis' blog:
>
> http://technicaldiscovery.blogspot.com/

Took me a couple of reads too. But I understand now, I think. I think Josef was indeed being ironic, and using the quote as part of the irony.

> (I think people should just get a grip on themselves... worst case
> scenario *ever* (and I highly doubt it) is a fork, and even that may
> well be better than the status quo)

This is nicely put, but very depressing. You say: "people should just get a grip on themselves" and we might also say: "shut up and stop whining". But an environment like that is rich food for apathy, hostility and paranoia. Let's hope we're up to the challenge.

Best,
Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi.

On Sat, Feb 18, 2012 at 12:18 AM, Christopher Jordan-Squire wrote:
> On Fri, Feb 17, 2012 at 11:31 PM, Matthew Brett wrote:
>> Hi,
>>
>> On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire wrote:
>>> On Fri, Feb 17, 2012 at 8:30 PM, Sturla Molden wrote:
>>>>
>>>> Den 18. feb. 2012 kl. 05:01 skrev Jason Grout:
>>>>
>>>>> On 2/17/12 9:54 PM, Sturla Molden wrote:
>>>>>> We would have to write a C++ programming tutorial that is based on
>>>>>> Python knowledge instead of C knowledge.
>>>>>
>>>>> I personally would love such a thing. It's been a while since I did
>>>>> anything nontrivial on my own in C++.
>>>>
>>>> One example: How do we code multiple return values?
>>>>
>>>> In Python:
>>>> - Return a tuple.
>>>>
>>>> In C:
>>>> - Use pointers (evilness)
>>>>
>>>> In C++:
>>>> - Return a std::tuple, as you would in Python.
>>>> - Use references, as you would in Fortran or Pascal.
>>>> - Use pointers, as you would in C.
>>>>
>>>> C++ textbooks always pick the last...
>>>>
>>>> I would show the first and the second method, and perhaps
>>>> intentionally forget the last.
>>>>
>>>> Sturla
>>
>>> On the flip side, cython looked pretty... but I didn't get the
>>> performance gains I wanted, and had to spend a lot of time figuring
>>> out if it was cython, needing to add types, buggy support for numpy,
>>> or actually the algorithm.
>>
>> At the time, was the numpy support buggy? I personally haven't had
>> many problems with Cython and numpy.
>
> It's not that the support WAS buggy, it's that it wasn't clear to me
> what was going on and where my performance bottleneck was. Even after
> microbenchmarking with ipython, using timeit and prun, and using the
> cython code visualization tool. Ultimately I don't think it was
> cython, so perhaps my comment was a bit unfair. But it was
> unfortunately difficult to verify that.
Of course, as you say,
> diagnosing and solving such issues would become easier to resolve with
> more cython experience.
>
>>> The C files generated by cython were
>>> enormous and difficult to read. They really weren't meant for human
>>> consumption.
>>
>> Yes, it takes some practice to get used to what Cython will do, and
>> how to optimize the output.
>>
>>> As Sturla has said, regardless of the quality of the
>>> current product, it isn't stable.
>>
>> I've personally found it more or less rock solid. Could you say what
>> you mean by "it isn't stable"?
>>
>
> I just meant what Sturla said, nothing more:
>
> "Cython is still 0.16, it is still unfinished. We cannot base NumPy on
> an unfinished compiler."

Y'all mean, it has a zero at the beginning of the version number and it is still adding new features? Yes, that is correct, but it seems more reasonable to me to phrase that as 'active development' rather than 'unstable', because they take considerable care to be backwards compatible, have a large automated Cython test suite, and a major stress-tester in the Sage test suite.

Best,

Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi,

On Sat, Feb 18, 2012 at 12:35 PM, Charles R Harris wrote:
> [earlier quoted thread snipped]
>
> Matthew,
>
> No one in their right mind would build a large performance library using
> Cython, it just isn't the right tool. For what it was designed for -
> wrapping existing c code or writing small and simple things close to Python
> - it does very well, but it was never designed for making core C/C++
> libraries and in that role it just gets in the way.

I believe the proposal is to refactor the lowest levels in pure C and move some or most of the library superstructure to Cython.

Best,

Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi,

On Sat, Feb 18, 2012 at 12:45 PM, Charles R Harris wrote:
> [earlier quoted thread snipped]
>
>> I believe the proposal is to refactor the lowest levels in pure C and
>> move some or most of the library superstructure to Cython.
>
> Go for it.

My goal was to try and contribute to substantive discussion of the benefits / costs of the various approaches. It does require a realistic assessment of what is being proposed. It may be that such discussion is not fruitful, but then we all lose, I think.

Best,

Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
On Sat, Feb 18, 2012 at 1:40 PM, Charles R Harris wrote:
> On Sat, Feb 18, 2012 at 2:17 PM, David Cournapeau wrote:
>> [earlier quoted thread snipped; the remainder of this message is missing]
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi,

On Sat, Feb 18, 2012 at 1:57 PM, Travis Oliphant wrote:
> The C/C++ discussion is just getting started. Everyone should keep in mind
> that this is not something that is going to happening quickly. This will
> be a point of discussion throughout the year. I'm not a huge supporter of
> C++, but C++11 does look like it's made some nice progress, and as I think
> about making a core-set of NumPy into a library that can be called by
> multiple languages (and even multiple implementations of Python), tempered
> C++ seems like it might be an appropriate way to go.

Could you say more about this? Do you have any idea when the decision about C++ is likely to be made? At what point does it make most sense to make the argument for or against? Can you suggest a good way for us to be able to make more substantial arguments either way?

Can you say a little more about your impression of the previous Cython refactor and why it was not successful?

Thanks a lot,

Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi,

On Sat, Feb 18, 2012 at 2:03 PM, Robert Kern wrote:
> On Sat, Feb 18, 2012 at 21:51, Matthew Brett wrote:
>> On Sat, Feb 18, 2012 at 1:40 PM, Charles R Harris wrote:
>>> [earlier quoted thread snipped; the remainder of this message is missing]
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi,

On Sat, Feb 18, 2012 at 2:20 PM, Robert Kern wrote:
> On Sat, Feb 18, 2012 at 22:06, Matthew Brett wrote:
>> [earlier quoted thread snipped; the remainder of this message is missing]
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Sat, Feb 18, 2012 at 2:51 PM, Robert Kern wrote: > On Sat, Feb 18, 2012 at 22:29, Matthew Brett wrote: >> Hi, >> >> On Sat, Feb 18, 2012 at 2:20 PM, Robert Kern wrote: >>> On Sat, Feb 18, 2012 at 22:06, Matthew Brett >>> wrote: >>>> Hi, >>>> >>>> On Sat, Feb 18, 2012 at 2:03 PM, Robert Kern wrote: > >>>>> Your misunderstanding of what was being discussed. The proposal being >>>>> discussed is implementing the core of numpy in C++, wrapped in C to be >>>>> usable as a C library that other extensions can use, and then exposed >>>>> to Python in an unspecified way. Cython was raised as an alternative >>>>> for this core, but as Chuck points out, it doesn't really fit. Your >>>>> assertion that what was being discussed was putting the core in C and >>>>> using Cython to wrap it was simply a non-sequitur. Discussion of >>>>> alternatives is fine. You weren't doing that. >>>> >>>> You read David's email? Was he also being annoying? >>> >>> Not really, because he was responding on-topic to the bizarro-branch >>> of the conversation that you spawned about the merits of moving from >>> hand-written C extensions to a Cython-wrapped C library. Whatever >>> annoyance his email might inspire is your fault, not his. The >>> discussion was about whether to use C++ or Cython for the core. Chuck >>> argued that Cython was not a suitable implementation language for the >>> core. You responded that his objections to Cython didn't apply to what >>> you thought was being discussed, using Cython to wrap a pure-C >>> library. As Pauli (Wolfgang, not our Pauli) once phrased it, you were >>> "not even wrong". It's hard to respond coherently to someone who is >>> breaking the fundamental expectations of discourse. Even I had to >>> stare at the thread for a few minutes to figure out where things went >>> off the rails. >> >> I'm sorry but this seems to me to be aggressive, offensive, and unjust. 
>>
>> The discussion was, from the beginning, mainly about the relative
>> benefits of rewriting the core with C / Cython, or C++.
>>
>> I don't think anyone was proposing writing every line of the numpy
>> core in Cython. Ergo (sorry to use the debating term), the proposal
>> to use Cython was always to take some of the higher level code out of
>> C and leave some of it in C. It does indeed make the debate
>> ridiculous to oppose a proposal that no-one has made.
>>
>> Now I am sure it is obvious to you, that the proposal to refactor the
>> current C code into low-level C libraries, and higher level Cython
>> wrappers, is absurd and off the table. It isn't obvious to me. I
>> don't think I broke a fundamental rule of polite discourse to clarify
>> that is what I meant,
>
> It's not off the table, but it's not what this discussion was about.

I beg to differ - which was why I replied the way I did. As I see it, the two proposals being discussed were:

1) C++ rewrite of C core
2) Refactor current C core into C / Cython

I think you can see from David's reply that that was also his understanding. Of course you could use Cython to interface to the 'core' in C or the 'core' in C++, but the difference would be that some of the stuff in C++ for option 1) would be in Cython in option 2).

Now you might be saying that you believe the discussion was only ever about whether the non-Cython bits would be in C or C++. That would indeed make sense of your lack of interest in discussion of Cython. I think you'd be hard pressed to claim it was only me discussing Cython, though. Chuck was pointing out that it was completely ridiculous trying to implement the entire core in Cython. Yes it is. As no-one has proposed that, it seems to me only reasonable to point out what I meant, in the interests of productive discourse.

Best,

Matthew
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Sat, Feb 18, 2012 at 2:54 PM, Travis Oliphant wrote: > > On Feb 18, 2012, at 4:03 PM, Matthew Brett wrote: > >> Hi, >> >> On Sat, Feb 18, 2012 at 1:57 PM, Travis Oliphant wrote: >>> The C/C++ discussion is just getting started. Everyone should keep in mind >>> that this is not something that is going to happening quickly. This will >>> be a point of discussion throughout the year. I'm not a huge supporter of >>> C++, but C++11 does look like it's made some nice progress, and as I think >>> about making a core-set of NumPy into a library that can be called by >>> multiple languages (and even multiple implementations of Python), tempered >>> C++ seems like it might be an appropriate way to go. >> >> Could you say more about this? Do you have any idea when the decision >> about C++ is likely to be made? At what point does it make most sense >> to make the argument for or against? Can you suggest a good way for >> us to be able to make more substantial arguments either way? > > I think early arguments against are always appropriate --- if you believe > they have a chance of swaying Mark or Chuck who are the strongest supporters > of C++ at this point. I will be quite nervous about going crazy with C++. > It was suggested that I use C++ 7 years ago when I wrote NumPy. I didn't > go that route then largely because of compiler issues, ABI-concerns, and I > knew C better than C++ so I felt like it would have taken me longer to do > something in C++. I made the right decision for me. If you think my > C-code is horrible, you would have been completely offended by whatever C++ I > might have done at the time. > > But I basically agree with Chuck that there is a lot of C-code in NumPy and > template-based-code that is really trying to be C++ spelled differently. > > The decision will not be made until NumPy 2.0 work is farther along. 
> The most likely outcome is that Mark will develop something quite nice in
> C++ which he is already toying with, and we will either choose to use it in
> NumPy to build 2.0 on --- or not. I'm interested in sponsoring Mark and
> working as closely as I can with he and Chuck to see what emerges.

Would it be fair to say, then, that you are expecting the discussion about C++ will mainly arise after Mark has written the code? I can see that it will be easier to be specific at that point, but there must be a serious risk that it will be too late to seriously consider an alternative approach.

>> Can you say a little more about your impression of the previous Cython
>> refactor and why it was not successful?
>>
>
> Sure. This list actually deserves a long writeup about that. First, there
> wasn't a "Cython-refactor" of NumPy. There was a Cython-refactor of SciPy.
> I'm not sure of it's current status. I'm still very supportive of that
> sort of thing.

I think I missed that - is it on git somewhere?

> I don't know if Cython ever solved the "raising an exception in a
> Fortran-called call-back" issue. I used setjmp and longjmp in several
> places in SciPy originally in order to enable exceptions raised in a
> Python-callback that is wrapped in a C-function pointer and being handed to
> a Fortran-routine that asks for a function-pointer.
>
> What happend in NumPy, was that the code was re-factored to become a
> library. I don't think much NumPy code actually ended up in Cython (the
> random-number generators have been in Cython from the beginning).
>
> The biggest problem with merging the code was that Mark Wiebe got active at
> about that same time :-) He ended up changing several things in the
> code-base that made it difficult to merge-in the changes. Some of the
> bug-fixes and memory-leak patches, and tests did get into the code-base, but
> the essential creation of the NumPy library did not make it.
There was some > very good work done that I hope we can still take advantage of. > Another factor. the decision to make an extra layer of indirection makes > small arrays that much slower. I agree with Mark that in a core library we > need to go the other way with small arrays being completely allocated in the > data-structure itself (reducing the number of pointer de-references Does that imply there was a review of the refactor at some point to do things like benchmarking? Are there any sources to get started trying to understand the nature of the Numpy refactor and where it ran into trouble? Was it just the small arrays? > So, Cython did not play a major role on the NumPy side of things. It played > a very nice
Re: [Numpy-discussion] Proposed Roadmap Overview
On Sat, Feb 18, 2012 at 5:18 PM, Matthew Brett wrote:
> [earlier quoted thread snipped; the remainder of this message is missing]
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Sat, Feb 18, 2012 at 8:38 PM, Travis Oliphant wrote: > We will need to see examples of what Mark is talking about and clarify some > of the compiler issues. Certainly there is some risk that once code is > written that it will be tempting to just use it. Other approaches are > certainly worth exploring in the mean-time, but C++ has some strong > arguments for it. The worry as I understand it is that a C++ rewrite might make the numpy core effectively a read-only project for anyone but Mark. Do you have any feeling for whether that is likely? > I thought so, but I can't find it either. We should ask Jason McCampbell of > Enthought where the code is located. Here are the distributed eggs: > http://www.enthought.com/repo/.iron/ Should I email him? Happy to do that. > From my perspective having a standalone core NumPy is still a goal. The > primary advantages of having a NumPy library (call it NumLib for the sake of > argument) are > > 1) Ability for projects like PyPy, IronPython, and Jython to use it more > easily > 2) Ability for Ruby, Perl, Node.JS, and other new languages to use the code > for their technical computing projects. > 3) increasing the number of users who can help make it more solid > 4) being able to build the user-base (and corresponding performance with > eye-balls from Intel, NVidia, AMD, Microsoft, Google, etc. looking at the > code). > > The disadvantages I can think of: > 1) More users also means we might risk "lowest-common-denominator" problems > --- i.e. trying to be too much to too many may make it not useful for > anyone. Also, more users means more people with opinions that might be > difficult to reconcile. > 2) The work of doing the re-write is not small: probably at least 6 > person-months > 3) Not being able to rely on Python objects (dictionaries, lists, and tuples > are currently used in the code-base quite a bit --- though the re-factor did > show some examples of how to remove this usage). 
> 4) Handling of "Object" arrays requires some re-design. How would numpylib compare to libraries like eigen? How likely do you think it would be that unrelated projects would use numpylib rather than eigen or other numerical libraries? Do you think the choice of C++ rather than C will influence whether other projects will take it up? See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Sat, Feb 18, 2012 at 9:47 PM, Benjamin Root wrote: > > > On Saturday, February 18, 2012, Matthew Brett wrote: >> >> Hi, >> >> On Sat, Feb 18, 2012 at 8:38 PM, Travis Oliphant >> wrote: >> >> > We will need to see examples of what Mark is talking about and clarify >> > some >> > of the compiler issues. Certainly there is some risk that once code is >> > written that it will be tempting to just use it. Other approaches are >> > certainly worth exploring in the mean-time, but C++ has some strong >> > arguments for it. >> >> The worry as I understand it is that a C++ rewrite might make the >> numpy core effectively a read-only project for anyone but Mark. Do >> you have any feeling for whether that is likely? >> > > Dude, have you seen the .c files in numpy/core? They are already read-only > for pretty much everybody but Mark. I think the question is whether refactoring in C would be preferable to refactoring in C++. > All kidding aside, is your concern that when Mark starts this that no one > will be able to contribute until he is done? I can tell you right now that > won't be the case as I will be trying to flesh out issues with datetime64 > with him. No - can I refer you back to the emails from David in particular about the difficulties of sharing development in C++? I can find the links - but do you remember the ones I'm referring to? See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Sat, Feb 18, 2012 at 10:09 PM, Charles R Harris wrote: > > > On Sat, Feb 18, 2012 at 9:38 PM, Travis Oliphant > wrote: >> >> Sure. This list actually deserves a long writeup about that. First, >> there wasn't a "Cython-refactor" of NumPy. There was a Cython-refactor of >> SciPy. I'm not sure of its current status. I'm still very supportive of >> that sort of thing. >> >> >> I think I missed that - is it on git somewhere? >> >> >> I thought so, but I can't find it either. We should ask Jason McCampbell >> of Enthought where the code is located. Here are the distributed eggs: >> http://www.enthought.com/repo/.iron/ > > > Refactor is with the other numpy repos here. I think Travis is referring to the _scipy_ refactor here. I can't see that with the numpy repos, or with the scipy repos, but I may have missed it, See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How a transition to C++ could work
Hi, Thanks for this - it's very helpful. On Sat, Feb 18, 2012 at 11:18 PM, Mark Wiebe wrote: > The suggestion of transitioning the NumPy core code from C to C++ has > sparked a vigorous debate, and I thought I'd start a new thread to give my > perspective on some of the issues raised, and describe how such a transition > could occur. > > First, I'd like to reiterate the gcc rationale for their choice to switch: > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale > > In particular, these points deserve emphasis: > > The C subset of C++ is just as efficient as C. > C++ supports cleaner code in several significant cases. > C++ makes it easier to write cleaner interfaces by making it harder to break > interface boundaries. > C++ never requires uglier code. > > Some people have pointed out that the Python templating preprocessor used in > NumPy is suggestive of C++ templates. A nice advantage of using C++ > templates instead of this preprocessor is that third party tools to improve > software quality, like static analysis tools, will be able to run directly > on the NumPy source code. Additionally, IDEs like XCode and Visual C++ will > be able to provide the full suite of tab-completion/intellisense features > that programmers working in those environments are accustomed to. > > There are concerns about ABI/API interoperability and interactions with C++ > exceptions. I've dealt with these types of issues on enough platforms to > know that while they're important, they're a lot easier to handle than the > issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been that > providing a C API from a C++ library is no harder than providing a C API > from a C library. > > It's worth comparing the possibility of C++ versus the possibility of other > languages, and the ones that have been suggested for consideration are D, > Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target language > has to interact naturally with the CPython API. 
It needs to provide direct > access to all the various sizes of signed int, unsigned int, and float. It > needs to have mature compiler support wherever we want to deploy NumPy. > Taken together, these requirements eliminate a majority of these > possibilities. From these criteria, the only languages which seem to have a > clear possibility for the implementation of Numpy are C, C++, and D. On which criteria did you eliminate Cython? > The biggest question for any of these possibilities is how do you get the > code from its current state to a state which fully utilizes the target > language. C++, being nearly a superset of C, offers a strategy to gradually > absorb C++ features. Any of the other language choices requires a rewrite, > which would be quite disruptive. Because of all these reasons taken > together, I believe the only realistic language to use, other than sticking > with C, is C++. > > Finally, here's what I think is the best strategy for transitioning to C++. > First, let's consider what we do if 1.7 becomes an LTS release. > > 1) Immediately after branching for 1.7, we minimally patch all the .c files > so that they can build with a C++ compiler and with a C compiler at the same > time. Then we rename all .c -> .cpp, and update the build systems for C++. > 2) During the 1.8 development cycle, we heavily restrict C++ feature usage. > But, where a feature implementation would be arguably easier and less > error-prone with C++, we allow it. This is a period for learning about C++ > and how it can benefit NumPy. > 3) After the 1.8 release, the community will have developed more experience > with C++, and will be in a better position to discuss a way forward. > > If, for some reason, a 1.7 LTS is unacceptable, it might be a good idea to > restrict the 1.8 release to the subset of both C and C++. 
I would much > prefer using the 1.8 development cycle to dip our toes into the C++ world to > get some of the low-hanging benefits without doing anything disruptive. > > A really important point to emphasize is that C++ allows for a strategy > where we gradually evolve the codebase to better incorporate its language > features. This is what I'm advocating. No massive rewrite, no disruptive > changes. Gradual code evolution, with ABI and API compatibility comparable > to what we've delivered in 1.6 and the upcoming 1.7 releases. Do you have any comment on the need for coding standards when using C++? I saw the warning in: http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale about using C++ unwisely. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How a transition to C++ could work
Hi, On Sun, Feb 19, 2012 at 12:49 AM, Mark Wiebe wrote: > On Sun, Feb 19, 2012 at 2:32 AM, Matthew Brett > wrote: >> >> Hi, >> >> Thanks for this - it's very helpful. >> >> On Sat, Feb 18, 2012 at 11:18 PM, Mark Wiebe wrote: >> > The suggestion of transitioning the NumPy core code from C to C++ has >> > sparked a vigorous debate, and I thought I'd start a new thread to give >> > my >> > perspective on some of the issues raised, and describe how such a >> > transition >> > could occur. >> > >> > First, I'd like to reiterate the gcc rationale for their choice to >> > switch: >> > http://gcc.gnu.org/wiki/gcc-in-cxx#Rationale >> > >> > In particular, these points deserve emphasis: >> > >> > The C subset of C++ is just as efficient as C. >> > C++ supports cleaner code in several significant cases. >> > C++ makes it easier to write cleaner interfaces by making it harder to >> > break >> > interface boundaries. >> > C++ never requires uglier code. >> > >> > Some people have pointed out that the Python templating preprocessor >> > used in >> > NumPy is suggestive of C++ templates. A nice advantage of using C++ >> > templates instead of this preprocessor is that third party tools to >> > improve >> > software quality, like static analysis tools, will be able to run >> > directly >> > on the NumPy source code. Additionally, IDEs like XCode and Visual C++ >> > will >> > be able to provide the full suite of tab-completion/intellisense >> > features >> > that programmers working in those environments are accustomed to. >> > >> > There are concerns about ABI/API interoperability and interactions with >> > C++ >> > exceptions. I've dealt with these types of issues on enough platforms to >> > know that while they're important, they're a lot easier to handle than >> > the >> > issues with Fortran, BLAS, and LAPACK in SciPy. My experience has been >> > that >> > providing a C API from a C++ library is no harder than providing a C API >> > from a C library. 
>> > >> > It's worth comparing the possibility of C++ versus the possibility of >> > other >> > languages, and the ones that have been suggested for consideration are >> > D, >> > Cython, Rust, Fortran 2003, Go, RPython, C# and Java. The target >> > language >> > has to interact naturally with the CPython API. It needs to provide >> > direct >> > access to all the various sizes of signed int, unsigned int, and float. >> > It >> > needs to have mature compiler support wherever we want to deploy NumPy. >> > Taken together, these requirements eliminate a majority of these >> > possibilities. From these criteria, the only languages which seem to >> > have a >> > clear possibility for the implementation of Numpy are C, C++, and D. >> >> On which criteria did you eliminate Cython? > > > The "mature compiler support" one. I took you to mean that the code would compile on any platform. As Cython compiles to C, I think Cython passes, if that is what you meant. Maybe you meant you thought that Cython was not mature in some sense, but if so, I'm not sure which sense you mean. > As glue between C/C++ and Python, it > looks great, but Dag's evaluation of Cython's maturity for implementing the > style of functionality in NumPy seems pretty authoritative. So people don't > have to dig through the giant email thread, here's the specific message > content from Dag, and it's context: > > On 02/18/2012 12:35 PM, Charles R Harris wrote: >> >> No one in their right mind would build a large performance library using >> Cython, it just isn't the right tool. For what it was designed for - >> wrapping existing c code or writing small and simple things close to >> Python - it does very well, but it was never designed for making core >> C/C++ libraries and in that role it just gets in the way. > > +1. Even I who have contributed to Cython realize this; last autumn I > implemented a library by writing it in C and wrapping it in Cython. 
As you probably saw, I think the proposal was indeed to use Cython to provide the higher-level parts of the core, while refactoring the rest of the C code underneath it. Obviously one could also refactor the C into C++, so the proposal to use Cython for some of the core is to some extent orthogonal to the choice of C / C++. I don't know the core; perhaps there isn't much of it that would benefit from being in Cython - I'd be interested to know your views. But, superficially, it seems like an attractive solution to making (some of) the core easier to maintain. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Scipy Cython refactor
Hi, On Sun, Feb 19, 2012 at 7:35 AM, Pauli Virtanen wrote: > 19.02.2012 05:38, Travis Oliphant kirjoitti: > [clip] Sure. This list actually deserves a long writeup about that. First, there wasn't a "Cython-refactor" of NumPy. There was a Cython-refactor of SciPy. I'm not sure of its current status. I'm still very supportive of that sort of thing. >>> >>> I think I missed that - is it on git somewhere? >> >> I thought so, but I can't find it either. We should ask Jason >> McCampbell of Enthought where the code is located. Here are the >> distributed eggs: http://www.enthought.com/repo/.iron/ > > They're here: > > https://github.com/dagss/private-scipy-refactor > https://github.com/jasonmccampbell/scipy-refactor > > The main problem with merging this was the experimental status of FWrap, > and the fact that the wrappers it generates are big compared to f2py and > required manual editing of the generated code. So, there were > maintainability concerns with the Fortran pieces. > > These could probably be solved, however, and I wouldn't be opposed to > e.g. cleaning up the generated code and using manually crafted Cython. > Cherry picking the Cython replacements for all the modules wrapped in C > probably should be done in any case. > > The parts of Scipy affected by the refactoring have not changed > significantly, so there are no significant problems in re-raising the > issue of merging the work back. Thanks for making a new thread. Who knows this work best? Who do you think should join the discussion to plan the work? I might have some time for this - maybe a sprint would be in order, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] np.longlong casts to int
Hi, I was gaily using np.longlong for casting to the highest available float type when I noticed this: In [4]: np.array([2.1], dtype=np.longlong) Out[4]: array([2], dtype=int64) whereas: In [5]: np.array([2.1], dtype=np.float128) Out[5]: array([ 2.1], dtype=float128) This is on OSX Snow Leopard, numpy 1.2.1 -> current devel, and on OSX Tiger PPC recent devel. I had the impression that np.float128 and np.longlong would be identical in behavior - but I guess not? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] np.longlong casts to int
2012/2/22 Stéfan van der Walt : > On Wed, Feb 22, 2012 at 2:47 PM, Matthew Brett > wrote: >> In [4]: np.array([2.1], dtype=np.longlong) >> Out[4]: array([2], dtype=int64) > > Maybe just a typo: > > In [3]: np.array([2.1], dtype=np.longfloat) > Out[3]: array([ 2.1], dtype=float128) A thinko maybe. Luckily I was in fact using longdouble in the live code, See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
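For readers hitting the same surprise: np.longlong is an integer type (the platform's C long long), while np.longdouble / np.longfloat is the extended-precision float, so the truncation above is expected behavior. A minimal sketch of the distinction:

```python
import numpy as np

# np.longlong is an integer type: casting a float truncates the fraction.
a = np.array([2.1], dtype=np.longlong)
assert np.issubdtype(a.dtype, np.integer)
assert a[0] == 2

# np.longdouble (alias np.longfloat) is the extended-precision float type,
# which is what was actually wanted here.
b = np.array([2.1], dtype=np.longdouble)
assert np.issubdtype(b.dtype, np.floating)
assert abs(b[0] - 2.1) < 1e-9
```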
Re: [Numpy-discussion] np.longlong casts to int
Hi, On Thu, Feb 23, 2012 at 4:23 AM, Francesc Alted wrote: > On Feb 23, 2012, at 6:06 AM, Francesc Alted wrote: >> On Feb 23, 2012, at 5:43 AM, Nathaniel Smith wrote: >> >>> On Thu, Feb 23, 2012 at 11:40 AM, Francesc Alted >>> wrote: Exactly. I'd update this to read: float96 96 bits. Only available on 32-bit (i386) platforms. float128 128 bits. Only available on 64-bit (AMD64) platforms. >>> >>> Except float96 is actually 80 bits. (Usually?) Plus some padding… >> >> Good point. The thing is that they actually use 96 bits for storage purposes >> (this is due to alignment requirements). >> >> Another quirk related to this is that MSVC automatically maps long double >> to 64-bit doubles: >> >> http://msdn.microsoft.com/en-us/library/9cx8xs15.aspx >> >> Not sure why they did that (portability issues?). > > Hmm, yet another quirk (this time in NumPy itself). On 32-bit platforms: > > In [16]: np.longdouble > Out[16]: numpy.float96 > > In [17]: np.finfo(np.longdouble).eps > Out[17]: 1.084202172485504434e-19 > > while on 64-bit ones: > > In [8]: np.longdouble > Out[8]: numpy.float128 > > In [9]: np.finfo(np.longdouble).eps > Out[9]: 1.084202172485504434e-19 > > i.e. NumPy is saying that the eps (machine epsilon) is the same on both > platforms, despite the fact that one uses 80-bit precision and the other > 128-bit precision. For the 80-bit, the eps should be: > > In [5]: 1 / 2**63. > Out[5]: 1.0842021724855044e-19 > > [http://en.wikipedia.org/wiki/Extended_precision] > > which is correctly stated by NumPy, while for 128-bit (quad precision), eps > should be: > > In [6]: 1 / 2**113. > Out[6]: 9.62964972193618e-35 > > [http://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format] > > If nobody objects, I'll file a bug about this. There was half a proposal for renaming these guys in the interests of clarity: http://mail.scipy.org/pipermail/numpy-discussion/2011-October/058820.html I'd be happy to write this up as a NEP. 
Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
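Francesc's eps check can be stated directly in terms of np.finfo: machine epsilon should equal 2 to the minus nmant, where nmant is the number of mantissa bits beyond the implicit leading bit. A small sketch (assuming finfo reports nmant correctly for the platform's longdouble, which is exactly what is in question for the double-double case):

```python
import numpy as np

# For float64, nmant is 52 and eps is 2**-52; the identity always holds.
info64 = np.finfo(np.float64)
assert info64.eps == 2.0 ** (-info64.nmant)

# For longdouble the same identity should hold: 63 mantissa bits for
# Intel 80-bit extended gives eps = 2**-63 ~ 1.08e-19, as in the email.
ld = np.finfo(np.longdouble)
assert ld.eps == 2.0 ** (-ld.nmant)
```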
Re: [Numpy-discussion] np.longlong casts to int
Hi, On Thu, Feb 23, 2012 at 10:11 AM, Pierre Haessig wrote: > Le 23/02/2012 17:28, Charles R Harris a écrit : >> That's correct. They are both extended precision (80 bits), but >> aligned on 32bit/64bit boundaries respectively. Sun provides a true >> quad precision, also called float128, while on PPC long double is an >> odd combination of two doubles. > This is insane ! ;-) I don't know if it's insane, but it is certainly very confusing, as this thread and the previous one show. The question is, what would be less confusing? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] np.longlong casts to int
Hi, On Thu, Feb 23, 2012 at 10:45 AM, Mark Wiebe wrote: > On Thu, Feb 23, 2012 at 10:42 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Thu, Feb 23, 2012 at 10:11 AM, Pierre Haessig >> wrote: >> > Le 23/02/2012 17:28, Charles R Harris a écrit : >> >> That's correct. They are both extended precision (80 bits), but >> >> aligned on 32bit/64bit boundaries respectively. Sun provides a true >> >> quad precision, also called float128, while on PPC long double is an >> >> odd combination of two doubles. >> > This is insane ! ;-) >> >> I don't know if it's insane, but it is certainly very confusing, as >> this thread and the previous one show. >> >> The question is, what would be less confusing? > > > One approach would be to never alias longdouble as float###. Especially > float128 seems to imply that it's the IEEE standard binary128 float, which > it is on some platforms, but not on most. It's virtually never IEEE binary128. Yarik Halchenko found a real one on an s/390 running Debian. Some docs seem to suggest there are Sun machines out there with binary128, as Chuck said. So the vast majority of numpy users with float128 have Intel 80-bit, and some have PPC twin-float. Do we all agree then that 'float128' is a bad name? In the last thread, I had the feeling there was some consensus on renaming Intel 80s to:

float128 -> float80_128
float96 -> float80_96

For those platforms implementing it, maybe

float128 -> float128_ieee

Maybe for PPC:

float128 -> float_pair_128

and, personally, I still think it would be preferable, and less confusing, to encourage use of 'longdouble' instead of the various platform specific aliases. What do you think? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] np.longlong casts to int
Hi, On Thu, Feb 23, 2012 at 2:56 PM, Pierre Haessig wrote: > Le 23/02/2012 20:08, Mark Wiebe a écrit : >> +1, I think it's good for its name to correspond to the name in C/C++, >> so that when people search for information on it they will find the >> relevant information more easily. With a bunch of NumPy-specific >> aliases, it just creates more hassle for everybody. > I don't fully agree. > > First, this assumes that people were "C-educated", at least a bit. I got > some C education, but I spent most of my scientific programming time > sitting in front of Python, Matlab, and a bit of R (in that order). In > this context, double, floats, long and short are all esoteric incantations. > Second, the C/C++ names are very imprecise with regard to their memory > content, and sometimes platform dependent. On the other hand, "float64" is > very informative. Right - no proposal to change float64 because it's not ambiguous - it is both binary64 IEEE floating point format and 64 bit width. The confusion here is for float128 - which is very occasionally IEEE binary128 and can be at least two other things (PPC twin double, and Intel 80 bit padded to 128 bits). Some of us were also surprised to find float96 is the same precision as float128 (being an 80 bit Intel padded to 96 bits). The renaming is an attempt to make it less confusing. Do you agree the renaming is less confusing? Do you have another proposal? Preferring 'longdouble' is precisely to flag up to people that they may need to do some more research to find out what exactly that is. Which is correct :) Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
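One way to see why the float96/float128 names mislead: the number in the name is the storage width, not the precision. A sketch separating the two (the comment values are what you would expect on Intel; other platforms will differ):

```python
import numpy as np

ld = np.dtype(np.longdouble)
storage_bits = ld.itemsize * 8           # what the float### name reports
precision_bits = np.finfo(ld).nmant + 1  # actual significand precision

# On Intel, storage is 96 or 128 bits but the significand is only 64 bits,
# so "float96" and "float128" both describe the same 80-bit extended format.
print("storage:", storage_bits, "bits; precision:", precision_bits, "bits")
assert precision_bits <= storage_bits
```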
Re: [Numpy-discussion] Possible roadmap addendum: building better text file readers
Hi, On Mon, Feb 27, 2012 at 2:43 PM, Alan G Isaac wrote: > On 2/27/2012 2:28 PM, Pauli Virtanen wrote: >> ISO specifies comma to be used in international standards >> (ISO/IEC Directives, part 2 / 6.6.8.1): >> >> http://isotc.iso.org/livelink/livelink?func=ll&objId=10562502&objAction=download > > > I do not think you are right. > I think that is a presentational requirement: > rules of presentation for documents that > are intended to become international standards. > Note as well the requirement of spacing to > separate digits. Clearly this cannot be a data > storage specification. > > Naturally, the important thing is to agree on a > standard data representation. Which one it is > is less important, especially if conversion tools > will be supplied. > > But it really is past time for the scientific community > to insist on one international standard, and the > decimal point has privilege of place because of > computing language conventions. (Being the standard > in the two largest economies in the world is a > different kind of argument in favor of this choice.) Maybe we can just agree it is an important option to have rather than an unimportant one, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Possible roadmap addendum: building better text file readers
Hi, On Mon, Feb 27, 2012 at 2:58 PM, Pauli Virtanen wrote: > Hi, > > 27.02.2012 20:43, Alan G Isaac kirjoitti: >> On 2/27/2012 2:28 PM, Pauli Virtanen wrote: >>> ISO specifies comma to be used in international standards >>> (ISO/IEC Directives, part 2 / 6.6.8.1): >>> >>> http://isotc.iso.org/livelink/livelink?func=ll&objId=10562502&objAction=download >> >> I do not think you are right. >> I think that is a presentational requirement: >> rules of presentation for documents that >> are intended to become international standards. > > Yes, it's a requirement for the standard texts themselves, but not what > the standard texts specify. Which is why I didn't think it was so > relevant (but the wikipedia link just prompted an immediate [citation > needed]). I agree that using something else than '.' does not make much > sense. I suppose anyone out there who is from a country that uses commas for decimals in CSV files, and does not want to have to convert them before reading them, will be keen to volunteer to help with the coding. I am certainly glad it is not my own case, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
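For what it's worth, a workaround that avoids touching the reader itself is to rewrite the decimal commas before handing the text to numpy. A minimal sketch with hypothetical comma-decimal data (semicolon as the field delimiter, as is common in locales that use the decimal comma):

```python
import io
import numpy as np

raw = u"1,5;2,25\n3,75;4,0"  # hypothetical comma-decimal CSV
# Replace decimal commas with points, then parse as usual.
fixed = io.StringIO(raw.replace(",", "."))
arr = np.genfromtxt(fixed, delimiter=";")
assert arr.shape == (2, 2)
assert arr[0, 0] == 1.5 and arr[1, 1] == 4.0
```

Note this simple replace assumes the comma is never used as a field separator in the input; files that use the comma for both need a real locale-aware reader.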
Re: [Numpy-discussion] [Numpy] quadruple precision
Hi, On Wed, Feb 29, 2012 at 12:13 PM, Jonathan Rocher wrote: > Thanks to your question, I discovered that there is a float128 dtype in > numpy > > In[5]: np.__version__ > Out[5]: '1.6.1' > > In[6]: np.float128? > Type: type > Base Class: > String Form: > Namespace: Interactive > File: > /Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/__init__.py > Docstring: > 128-bit floating-point number. Character code: 'g'. C long float > compatible. Right - but remember that numpy float128 is different on different platforms. In particular, float128 is any C longdouble type that needs 128 bits of memory, regardless of precision or implementation. See [1] for background on C longdouble type. The numpy platforms I know about are: Intel : 80 bit float padded to 128 bits [2] PPC : pair of float64 values [3] Debian IBM s390 : real quadruple precision [4] [5] I see that some Sun machines implement real quadruple precision in software but I haven't run numpy on a Sun machine [6] [1] http://en.wikipedia.org/wiki/Long_double [2] http://en.wikipedia.org/wiki/Extended_precision#x86_Architecture_Extended_Precision_Format [3] http://en.wikipedia.org/wiki/Double-double_%28arithmetic%29#Double-double_arithmetic [4] http://en.wikipedia.org/wiki/Double-double_%28arithmetic%29#IEEE_754_quadruple-precision_binary_floating-point_format:_binary128 [5] https://github.com/nipy/nibabel/issues/76 [6] http://en.wikipedia.org/wiki/Double-double_%28arithmetic%29#Implementations > Based on some reported issues, it seems like there are issues though with > this and its mapping to python long integer... 
> http://mail.scipy.org/pipermail/numpy-discussion/2011-October/058784.html I tried to summarize the problems I knew about here: http://mail.scipy.org/pipermail/numpy-discussion/2011-November/059087.html There are some routines to deal with some of the problems here: https://github.com/nipy/nibabel/blob/master/nibabel/casting.py After spending some time with the various longdoubles in numpy, I have learned to stare at my code for a long time considering how it might run into the various problems above. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
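Given the platform zoo above, a quick way to see which longdouble you actually have on a given machine is to look at np.finfo. The mantissa-bit counts below are assumptions based on the formats listed (and finfo has been known to misreport for the double-double case, so treat the result with some suspicion there):

```python
import numpy as np

ld = np.finfo(np.longdouble)
# Hypothetical mapping from mantissa bits to the underlying format.
flavors = {
    52: "plain IEEE double (e.g. MSVC long double)",
    63: "Intel 80-bit extended precision",
    105: "IBM double-double (PPC) - reported value varies",
    112: "IEEE binary128 (true quad precision)",
}
print("longdouble mantissa bits:", ld.nmant)
print("flavor:", flavors.get(ld.nmant, "unknown - check your platform docs"))
```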
Re: [Numpy-discussion] Proposed Roadmap Overview
Hi, On Wed, Feb 29, 2012 at 1:46 AM, Travis Oliphant wrote: > We already use the NEP process for such decisions. This discussion came > from simply from the *idea* of writing such a NEP. > > Nothing has been decided. Only opinions have been shared that might > influence the NEP. This is all pretty premature, though --- migration to > C++ features on a trial branch is some months away were it to happen. Fernando can correct me if I'm wrong, but I think he was asking a governance question. That is: would you (as BDF$N) consider the following guideline: "As a condition for accepting significant changes to Numpy, for each significant change, there will be a NEP. The NEP shall follow the same model as the Python PEPs - that is - there will be a summary of the changes, the issues arising, the for / against opinions and alternatives offered. There will usually be a draft implementation. The NEP will contain the resolution of the discussion as it relates to the code" For example, the masked array NEP, although very substantial, contains little discussion of the controversy arising, or the intended resolution of the controversy: https://github.com/numpy/numpy/blob/3f685a1a990f7b6e5149c80b52436fb4207e49f5/doc/neps/missing-data.rst I mean, although it is useful, it is not in the form of a PEP, as Fernando has described it. Would you accept extending the guidelines to the NEP format? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi, Sorry that this report is not complete, I don't have full access to this box but, on a Debian squeeze machine running linux 2.6.32-5-sparc64-smp: nosetests ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values test_user_missing_values (test_io.TestFromTxt) ... Bus error This on current master : 1.7.0.dev-b9872b4 Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi, On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris wrote: > > > On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett > wrote: >> >> Hi, >> >> Sorry that this report is not complete, I don't have full access to >> this box but, on a Debian squeeze machine running linux >> 2.6.32-5-sparc64-smp: >> >> nosetests >> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values >> >> test_user_missing_values (test_io.TestFromTxt) ... Bus error >> >> This on current master : 1.7.0.dev-b9872b4 >> > > Hmm, some tests might have been recently enabled. Any chance of doing a > bisection? I'm on it - will get back to you tomorrow. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi,

On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett wrote:
> Hi,
>
> On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris wrote:
>>
>> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett wrote:
>>>
>>> Hi,
>>>
>>> Sorry that this report is not complete, I don't have full access to
>>> this box but, on a Debian squeeze machine running linux
>>> 2.6.32-5-sparc64-smp:
>>>
>>> nosetests
>>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values
>>>
>>> test_user_missing_values (test_io.TestFromTxt) ... Bus error
>>>
>>> This on current master: 1.7.0.dev-b9872b4
>>>
>> Hmm, some tests might have been recently enabled. Any chance of doing a
>> bisection?

Struggling because compilation is very slow and there are lots of untestable commits. df907e6 is the first known bad. Here's the output from a log:

* df907e6 - (HEAD, refs/bisect/bad) BLD: Failure in single file build mode because of a static function in two separate files (6 months ago) [Mark Wiebe]
* 01b200b - (refs/bisect/skip-01b200b10149312f51234448e44b230b1b548046) BUG: nditer: The nditer was reusing the reduce loop inappropriately (#1938) (6 months ago) [Mark Wiebe]
* f45fd67 - (refs/bisect/skip-f45fd67fe8eefc8fd2e4b914ab4e376ab5226887) DOC: Small tweak to release notes (6 months ago) [Mark Wiebe]
* 73be11d - (refs/bisect/skip-73be11db794d115a7d9bd2e822c0d8008bc14a28) BUG: Some bugs in squeeze and concatenate found by testing SciPy (6 months ago) [Mark Wiebe]
* c873295 - (refs/bisect/skip-c8732958c8e07f2306029dfde2178faf9c01d049) TST: missingdata: Finish up NA mask tests for np.std and np.var (6 months ago) [Mark Wiebe]
* e15712c - (refs/bisect/skip-e15712cf5df41806980f040606744040a433b331) BUG: nditer: NA masks in arrays with leading 1 dimensions had an issue (6 months ago) [Mark Wiebe]
* ded81ae - (refs/bisect/skip-ded81ae7d529ac0fba641b7e5e3ecf52e120700f) ENH: missingdata: Implement tests for np.std, add skipna= and keepdims= parameters to more functions (6 months ago) [Mark Wiebe]
* a112fc4 - (refs/bisect/skip-a112fc4a6b28fbb85e1b0c6d423095d13cf7b226) ENH: missingdata: Implement skipna= support for np.std and np.var (6 months ago) [Mark Wiebe]
* 0fa4f22 - (refs/bisect/skip-0fa4f22fec4b19e2a8c1d93e5a1f955167c9addd) ENH: missingdata: Support 'skipna=' parameter in np.mean (6 months ago) [Mark Wiebe]
* bfda229 - (refs/bisect/skip-bfda229ec93d37b1ee2cdd8b9443ec4e34536bbf) ENH: missingdata: Create count_reduce_items function (6 months ago) [Mark Wiebe]
* d9b3f90 - (refs/bisect/skip-d9b3f90de3213ece9a78b77088fdec17910e81d9) ENH: missingdata: Move the Reduce boilerplate into a function PyArray_ReduceWrapper (6 months ago) [Mark Wiebe]
* 67ece6b - (refs/bisect/skip-67ece6bdd2b35d011893e78154dbff6ab51c7d35) ENH: missingdata: Finish count_nonzero as a full-fledged reduction operation (6 months ago) [Mark Wiebe]
* 6bfd819 - (refs/bisect/skip-6bfd819a0897caf6e6db244930c40ed0d17b9e62) ENH: missingdata: Towards making count_nonzero a full-featured reduction operation (6 months ago) [Mark Wiebe]
* a1faa1b - (refs/bisect/skip-a1faa1b6883c47333508a0476c1304b0a8a3f64e) ENH: missingdata: Move some of the refactored reduction code into the API (6 months ago) [Mark Wiebe]
* f597374 - (refs/bisect/skip-f597374edc298810083799e8539c99fc0a93b319) ENH: missingdata: Change default to create NA-mask when NAs are in lists (6 months ago) [Mark Wiebe]
* 965e4cf - (refs/bisect/skip-965e4cff5c4c50e8ff051a3363adc6cf6aa640cd) ENH: missingdata: trying some more functions to see how they treat NAs (6 months ago) [Mark Wiebe]
* b1cb211 - (refs/bisect/skip-b1cb211d159c617ee4ebd16266d6f1042417ef75) ENH: missingdata: Add nastr= parameter to np.set_printoptions() (6 months ago) [Mark Wiebe]
* ba4d116 - (refs/bisect/skip-ba4d1161fe4943cb720f35c0abfd0581628255d6) BUG: missingdata: Fix mask usage in PyArray_TakeFrom, add tests for it (6 months ago) [Mark Wiebe]
* a3a0ee8 - (refs/bisect/skip-a3a0ee8c72fdd55ffacb96bbb1fa9c3569cfb3e9) BUG: missingdata: The ndmin parameter to np.array wasn't respecting NA masks (6 months ago) [Mark Wiebe]
* 9194b3a - (refs/bisect/skip-9194b3af704df71aa9b1ff2f53f169848d0f9dc7) ENH: missingdata: Rewrite PyArray_Concatenate to work with NA masks (6 months ago) [Mark Wiebe]
* 99a21ef - (refs/bisect/good-99a21efff4b1f2292dc370c7c9c7c58f10385f2a) ENH: missingdata: Add NA support to np.diagonal, change np.diagonal to always return a view (6 months ago) [Mark Wiebe]

So - the problem arises somewhere between 99a21ef (good) and df907e6 (bad). There seems to be a compilation error for the skipped commits - here's the one I tested, 9194b3a:

gcc: numpy/core/src/multiarray/multiarraymodule_onefile.c
In file included from numpy/core/src/multiarray/scalartypes.c.src:25, from numpy/core/src/multi
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi, On Sun, Mar 4, 2012 at 11:41 AM, Mark Wiebe wrote: > On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett > wrote: >> >> Hi, >> >> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett >> wrote: >> > Hi, >> > >> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris >> > wrote: >> >> >> >> >> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett >> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> Sorry that this report is not complete, I don't have full access to >> >>> this box but, on a Debian squeeze machine running linux >> >>> 2.6.32-5-sparc64-smp: >> >>> >> >>> nosetests >> >>> >> >>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values >> >>> >> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error >> >>> >> >>> This on current master : 1.7.0.dev-b9872b4 >> >>> >> >> >> >> Hmm, some tests might have been recently enabled. Any chance of doing a >> >> bisection? >> >> Struggling because compilation is very slow and there are lots of >> untestable commits. df907e6 is the first known bad. 
Here's the >> output from a log:
>>
>> [bisect log snipped - quoted in full in the previous message]
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi, On Sun, Mar 4, 2012 at 8:32 PM, Mark Wiebe wrote: > On Sun, Mar 4, 2012 at 10:08 PM, Matthew Brett > wrote: >> >> Hi, >> >> On Sun, Mar 4, 2012 at 11:41 AM, Mark Wiebe wrote: >> > On Sun, Mar 4, 2012 at 11:27 AM, Matthew Brett >> > wrote: >> >> >> >> Hi, >> >> >> >> On Sat, Mar 3, 2012 at 12:07 AM, Matthew Brett >> >> >> >> wrote: >> >> > Hi, >> >> > >> >> > On Fri, Mar 2, 2012 at 9:05 PM, Charles R Harris >> >> > wrote: >> >> >> >> >> >> >> >> >> On Fri, Mar 2, 2012 at 4:36 PM, Matthew Brett >> >> >> >> >> >> wrote: >> >> >>> >> >> >>> Hi, >> >> >>> >> >> >>> Sorry that this report is not complete, I don't have full access to >> >> >>> this box but, on a Debian squeeze machine running linux >> >> >>> 2.6.32-5-sparc64-smp: >> >> >>> >> >> >>> nosetests >> >> >>> >> >> >>> >> >> >>> ~/usr/local/lib/python2.6/site-packages/numpy/lib/tests/test_io.py:TestFromTxt.test_user_missing_values >> >> >>> >> >> >>> test_user_missing_values (test_io.TestFromTxt) ... Bus error >> >> >>> >> >> >>> This on current master : 1.7.0.dev-b9872b4 >> >> >>> >> >> >> >> >> >> Hmm, some tests might have been recently enabled. Any chance of >> >> >> doing a >> >> >> bisection? >> >> >> >> Struggling because compilation is very slow and there are lots of >> >> untestable commits. df907e6 is the first known bad. 
Here's the >> >> output from a log:
>> >>
>> >> [bisect log snipped - quoted in full earlier in the thread]
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi,

On Sun, Mar 4, 2012 at 11:53 PM, Mark Wiebe wrote:
> On Sun, Mar 4, 2012 at 10:34 PM, Matthew Brett wrote:
>>
>> > $ export NPY_SEPARATE_COMPILATION=1
>>
>> Thanks, that did it:
>>
>> 9194b3af704df71aa9b1ff2f53f169848d0f9dc7 is the first bad commit
>>
>> Let me know if I can debug further,
>
> That commit was a rewrite of np.concatenate, and I've traced the test
> function you got the crash in. The only call to concatenate is as follows:
>
> >>> a = np.array([True], dtype=object)
> >>> np.concatenate((a,)*3)
> array([True, True, True], dtype=object)
> >>>
>
> Can you try this and see if it crashes?

No, that doesn't crash. Further investigation revealed the crash to be:

(bare-env)[matthew@vagus ~]$ nosetests ~/dev_trees/numpy/numpy/lib/tests/test_io.py:TestFromTxt.test_with_masked_column_various
nose.config: INFO: Ignoring files matching ['^\\.', '^_', '^setup\\.py$']
Test masked column ... Bus error

Accordingly:

In [1]: import numpy as np

In [2]: from StringIO import StringIO

In [3]: data = StringIO('True 2 3\nFalse 5 6\n')

In [4]: test = np.genfromtxt(data, dtype=None, missing_values='2,5', usemask=True)

In [6]: from numpy import ma

In [7]: control = ma.array([(1, 2, 3), (0, 5, 6)], mask=[(0, 1, 0), (0, 1, 0)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])

In [8]: test == control
Bus error

> Another thing you can do is compile with debug information enabled, then run
> the crashing case in gdb. This will look something like this:
>
> $ export CFLAGS=-g
> $ rm -rf build # make sure it's a fresh build from scratch
> $ python setup.py install --prefix= # or however you do it
> [... build printout]
> $ gdb python
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi,

On Mon, Mar 5, 2012 at 11:11 AM, Matthew Brett wrote:

[previous message quoted in full - snipped]
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
And simplifying:

In [1]: import numpy as np

In [2]: control = np.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])

In [3]: control == control
Out[3]: array([ True,  True], dtype=bool)

In [4]: from numpy import ma

In [5]: control = ma.array([(1, 2, 3), (0, 5, 6)], dtype=[('f0', bool), ('f1', bool), ('f2', int)])

In [6]: control == control
Bus error

Cheers,

Matthew
___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
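A plausible mechanism for the crash (my reading of the example, not stated explicitly in the thread): the packed structured dtype puts the int field at an odd byte offset, and SPARC traps on unaligned loads where x86 does not. A quick inspection of the dtype from the example above:

```python
import numpy as np

# The structured dtype from the failing comparison above.
dt = np.dtype([('f0', bool), ('f1', bool), ('f2', int)])

# dt.fields maps each field name to (dtype, byte offset).
f2_dtype, f2_offset = dt.fields['f2']
print(dt.itemsize)   # two 1-byte bools plus one int, packed with no padding
print(f2_offset)     # the int field starts at byte offset 2

# On strict-alignment hardware such as SPARC, loading an int through a
# pointer that is not a multiple of the int's size raises SIGBUS.
print(f2_offset % f2_dtype.itemsize == 0)  # -> False: the field is misaligned
```

This is consistent with the later comment in the thread that x86 has looser alignment requirements, so such bugs surface only on architectures like SPARC.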
Re: [Numpy-discussion] Bus error for Debian / SPARC on current trunk
Hi,

On Mon, Mar 5, 2012 at 8:04 PM, Mark Wiebe wrote:
> I've pushed a bugfix to github, can you confirm that the crash goes away on
> your test box? Thanks for tracking that down, the stack trace was very
> helpful. Since x86 machines don't have as strict alignment requirements,
> bugs like this one will generally remain undetected until someone tests on
> an architecture like sparc.

Thanks - no Bus error. For your enjoyment, there were some failures and errors from numpy.test("full"):

==
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose([1e-08, 1, 120.99], [0, nan, 100.0])
--
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/nose-1.1.2-py2.6.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_numeric.py", line 1288, in tst_isclose_allclose
    assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/numeric.py", line 2020, in allclose
    return all(less_equal(absolute(x-y), atol + rtol * absolute(y)))
RuntimeWarning: invalid value encountered in absolute

==
ERROR: test_numeric.TestIsclose.test_ip_isclose_allclose(nan, [nan, nan, nan])
--
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/nose-1.1.2-py2.6.egg/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_numeric.py", line 1288, in tst_isclose_allclose
    assert_array_equal(isclose(x, y).all(), allclose(x, y), msg % (x, y))
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/numeric.py", line 2020, in allclose
    return all(less_equal(absolute(x-y), atol + rtol * absolute(y)))
RuntimeWarning: invalid value encountered in absolute

==
ERROR: Test a special case for var
--
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/tests/test_core.py", line 2725, in test_varstd_specialcases
    _ = method(out=nout)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py", line 4778, in std
    dvar = sqrt(dvar)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py", line 849, in __call__
    m |= self.domain(d)
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/ma/core.py", line 801, in __call__
    return umath.less(x, self.critical_value)
RuntimeWarning: invalid value encountered in less

==
FAIL: test_complex_dtype_repr (test_dtype.TestString)
--
Traceback (most recent call last):
  File "/home/matthew/.virtualenvs/np-devel/lib/python2.6/site-packages/numpy/core/tests/test_dtype.py", line 401, in test_complex_dtype_repr
"dtype([('a', 'M8[D]'), ('b', '>m8[us]')])"
DESIRED: "dtype([('a', 'm8[D]'), ('b', '>M8[us]')]"
DESIRED: "[('a', '
[Numpy-discussion] More SPARC pain
Hi, I found this test caused a bus error on current trunk:

import numpy as np
from StringIO import StringIO as BytesIO
from numpy.testing import assert_array_equal

def test_2d_buf():
    dtt = np.complex64
    arr = np.arange(10, dtype=dtt)
    # 2D array
    arr2 = np.reshape(arr, (2, 5))
    # Fortran write followed by (C or F) read caused bus error
    data_str = arr2.tostring('F')
    data_back = np.ndarray(arr2.shape,
                           arr2.dtype,
                           buffer=data_str,
                           order='F')
    assert_array_equal(arr2, data_back)

gdb run gives ...

test_me3.test_2d_buf ...
Program received signal SIGBUS, Bus error.
0xf78f5458 in _aligned_strided_to_contig_size8 (
    dst=0xdc0e08 "\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\373\373\373\373",
    dst_stride=8, src=0xcdfc44 "", src_stride=16, N=5,
    __NPY_UNUSED_TAGGEDsrc_itemsize=8,
    __NPY_UNUSED_TAGGEDdata=0x0)
    at numpy/core/src/multiarray/lowlevel_strided_loops.c.src:137
137    (*((@type@ *)dst)) = @swap@@elsize@(*((@type@ *)src));

Debug log attached. Shall I make an issue?

Best,

Matthew

buf_2d.log.gz
Description: GNU Zip compressed data
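One way to see why alignment matters here: an ndarray built over an arbitrary Python buffer need not start at an aligned address, and numpy records this in the ALIGNED flag, which copy loops such as the one in the traceback are supposed to consult. A sketch of the same round trip with the flag inspection added (tobytes is the modern name for tostring; the flag and address checks are my addition):

```python
import numpy as np

arr2 = np.arange(10, dtype=np.complex64).reshape(2, 5)

# Fortran-order write, then read back through a buffer, as in the test.
data_str = arr2.tobytes('F')
data_back = np.ndarray(arr2.shape, arr2.dtype, buffer=data_str, order='F')

# numpy records whether the data pointer satisfies the dtype's alignment;
# a copy loop that assumes aligned access should be gated on this flag.
print(data_back.flags['ALIGNED'])
addr = data_back.__array_interface__['data'][0]
print(addr % np.dtype(np.complex64).alignment)  # 0 when the start is aligned
```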
Re: [Numpy-discussion] Missing data again
Hi, On Wed, Mar 7, 2012 at 11:37 AM, Charles R Harris wrote: > > > On Wed, Mar 7, 2012 at 12:26 PM, Nathaniel Smith wrote: >> >> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris >> wrote: >> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig >> > >> >> Coming back to Travis proposition "bit-pattern approaches to missing >> >> data (*at least* for float64 and int32) need to be implemented.", I >> >> wonder what is the amount of extra work to go from nafloat64 to >> >> nafloat32/16 ? Is there an hardware support NaN payloads with these >> >> smaller floats ? If not, or if it is too complicated, I feel it is >> >> acceptable to say "it's too complicated" and fall back to mask. One may >> >> have to choose between fancy types and fancy NAs... >> > >> > I'm in agreement here, and that was a major consideration in making a >> > 'masked' implementation first. >> >> When it comes to "missing data", bitpatterns can do everything that >> masks can do, are no more complicated to implement, and have better >> performance characteristics. >> > > Maybe for float, for other things, no. And we have lots of otherthings. The > performance is a strawman, and it *isn't* easier to implement. > >> >> > Also, different folks adopt different values >> > for 'missing' data, and distributing one or several masks along with the >> > data is another common practice. >> >> True, but not really relevant to the current debate, because you have >> to handle such issues as part of your general data import workflow >> anyway, and none of these is any more complicated no matter which >> implementations are available. >> >> > One inconvenience I have run into with the current API is that is should >> > be >> > easier to clear the mask from an "ignored" value without taking a new >> > view >> > or assigning known data. So maybe two types of masks (different >> > payloads), >> > or an additional flag could be helpful. 
The process of assigning masks >> > could >> > also be made a bit easier than using fancy indexing. >> >> So this, uh... this was actually the whole goal of the "alterNEP" >> design for masks -- making all this stuff easy for people (like you, >> apparently?) that want support for ignored values, separately from >> missing data, and want a nice clean API for it. Basically having a >> separate .mask attribute which was an ordinary, assignable array >> broadcastable to the attached array's shape. Nobody seemed interested >> in talking about it much then but maybe there's interest now? >> > > Come off it, Nathaniel, the problem is minor and fixable. The intent of the > initial implementation was to discover such things. These things are less > accessible with the current API *precisely* because of the feedback from R > users. It didn't start that way. > > We now have something to evolve into what we want. That is a heck of a lot > more useful than endless discussion. The endless discussion is for the following reason: - The discussion was never adequately resolved. The discussion was never adequately resolved because there was not enough work done to understand the various arguments. In particular, you've several times said things that indicate to me, as to Nathaniel, that you either have not read or have not understood the points that Nathaniel was making. Travis' recent email - to me - also indicates that there is still a genuine problem here that has not been adequately explored. There is no future in trying to stop discussion, and trying to do so will only prolong it and make it less useful. It will make the discussion - endless. If you want to help - read the alterNEP, respond to it directly, and further the discussion by engaged debate. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
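To make the bit-pattern idea in the quoted discussion concrete: for float64 an NA can be encoded as one particular NaN payload, distinguishable from an ordinary NaN only by comparing raw bits. A minimal sketch - the payload value and helper name are arbitrary choices for illustration, not numpy's design (R, for example, reserves its own NaN payload for NA):

```python
import numpy as np
import struct

# One arbitrary quiet-NaN bit pattern reserved to mean "NA" (illustrative).
NA_BITS = 0x7FF80000000007A2
NA = struct.unpack('<d', struct.pack('<Q', NA_BITS))[0]

def is_na(arr):
    # Compare raw bit patterns, so ordinary NaNs are not mistaken for NA.
    return arr.view(np.uint64) == NA_BITS

a = np.array([1.0, NA, float('nan')])
print(is_na(a))     # only the reserved payload matches
print(np.isnan(a))  # both NaNs look the same to isnan
```

This illustrates the trade-off in the thread: the bit-pattern scheme costs no extra memory, but it sacrifices one representable value and, unlike a mask, cannot "unmask" to recover the original datum.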
[Numpy-discussion] Casting rules changed in trunk?
Hi, I noticed a casting change running the test suite on our image reader, nibabel: https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py

For this script:

import numpy as np

Adata = np.zeros((2,), dtype=np.uint8)
Bdata = np.zeros((2,), dtype=np.int16)
Bzero = np.int16(0)
Bbig = np.int16(256)

print np.__version__
print 'Array add', (Adata + Bdata).dtype
print 'Scalar 0 add', (Adata + Bzero).dtype
print 'Scalar 256 add', (Adata + Bbig).dtype

1.4.1
Array add int16
Scalar 0 add uint8
Scalar 256 add uint8

1.5.1
Array add int16
Scalar 0 add uint8
Scalar 256 add uint8

1.6.1
Array add int16
Scalar 0 add uint8
Scalar 256 add int16

1.7.0.dev-aae5b0a
Array add int16
Scalar 0 add uint8
Scalar 256 add uint16

I can understand the uint8 outputs from numpy < 1.6 - the rule being not to upcast for scalars.

I can understand the int16 output from 1.6.1 on the basis that the value is outside uint8 range and therefore we might prefer a type that can handle values from both uint8 and int16.

Was the current change intended? It has the following odd effect:

In [5]: Adata + np.int16(257)
Out[5]: array([257, 257], dtype=uint16)

In [7]: Adata + np.int16(-257)
Out[7]: array([-257, -257], dtype=int16)

In [8]: Adata - np.int16(257)
Out[8]: array([65279, 65279], dtype=uint16)

but I guess you can argue that there are odd effects for other choices too,

Best,

Matthew
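For reference, the two promotion regimes in play here can be inspected directly (np.promote_types, np.result_type and np.min_scalar_type all exist from numpy 1.6): promote_types gives the array-with-array rule, fixed by the dtypes alone, while result_type applies the value-based scalar logic whose output changed across the releases above - so its result for the scalar case depends on the numpy version:

```python
import numpy as np

Adata = np.zeros((2,), dtype=np.uint8)

# Array-with-array promotion: determined by the dtypes alone.
print(np.promote_types(np.uint8, np.int16))  # int16, in every version

# Scalar operands instead go through value-based logic: the scalar's
# value, not its declared dtype, influences the result.
print(np.min_scalar_type(0))    # 0 fits in uint8
print(np.min_scalar_type(256))  # 256 needs at least uint16

# This is the disputed case; the dtype printed here varies by version
# (uint16 on the trunk discussed above, int16 in 1.6.1).
print(np.result_type(Adata, np.int16(256)))
```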
Re: [Numpy-discussion] Casting rules changed in trunk?
Hi, On Wed, Mar 7, 2012 at 4:08 PM, Matthew Brett wrote: > Hi, > > I noticed a casting change running the test suite on our image reader, > nibabel: > https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py > > For this script: > > > import numpy as np > > Adata = np.zeros((2,), dtype=np.uint8) > Bdata = np.zeros((2,), dtype=np.int16) > Bzero = np.int16(0) > Bbig = np.int16(256) > > print np.__version__ > print 'Array add', (Adata + Bdata).dtype > print 'Scalar 0 add', (Adata + Bzero).dtype > print 'Scalar 256 add', (Adata + Bbig).dtype > > > 1.4.1 > Array add int16 > Scalar 0 add uint8 > Scalar 256 add uint8 > > 1.5.1 > Array add int16 > Scalar 0 add uint8 > Scalar 256 add uint8 > > 1.6.1 > Array add int16 > Scalar 0 add uint8 > Scalar 256 add int16 > > 1.7.0.dev-aae5b0a > Array add int16 > Scalar 0 add uint8 > Scalar 256 add uint16 > > I can understand the uint8 outputs from numpy < 1.6 - the rule being > not to upcast for scalars. > > I can understand the int16 output from 1.6.1 on the basis that the > value is outside uint8 range and therefore we might prefer a type that > can handle values from both uint8 and int16. > > Was the current change intended? It has the following odd effect: > > In [5]: Adata + np.int16(257) > Out[5]: array([257, 257], dtype=uint16) > > In [7]: Adata + np.int16(-257) > Out[7]: array([-257, -257], dtype=int16) > > In [8]: Adata - np.int16(257) > Out[8]: array([65279, 65279], dtype=uint16) > > but I guess you can argue that there are odd effects for other choices too, In case it wasn't clear, this, in numpy 1.6.1: In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype Out[2]: dtype('int16') changed to this in current trunk: In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype Out[2]: dtype('uint16') which is different still in previous versions of numpy (e.g. 
1.4.1): In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype Out[2]: dtype('uint8') My impression had been that the plan was to avoid changes in the casting rules if possible. Was this change in trunk intentional? If not, I am happy to bisect, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] More SPARC pain
Hi, On Tue, Mar 6, 2012 at 8:07 PM, Matthew Brett wrote: > Hi, > > I found this test caused a bus error on current trunk: > > > import numpy as np > > from StringIO import StringIO as BytesIO > > from numpy.testing import assert_array_equal > > > def test_2d_buf(): > dtt = np.complex64 > arr = np.arange(10, dtype=dtt) > # 2D array > arr2 = np.reshape(arr, (2, 5)) > # Fortran write followed by (C or F) read caused bus error > data_str = arr2.tostring('F') > data_back = np.ndarray(arr2.shape, > arr2.dtype, > buffer=data_str, > order='F') > assert_array_equal(arr2, data_back) > > > gdb run gives ... > > test_me3.test_2d_buf ... > Program received signal SIGBUS, Bus error. > 0xf78f5458 in _aligned_strided_to_contig_size8 ( > dst=0xdc0e08 > "\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\313\373\373\373\373", > dst_stride=8, src=0xcdfc44 "", src_stride=16, N=5, > __NPY_UNUSED_TAGGEDsrc_itemsize=8, > __NPY_UNUSED_TAGGEDdata=0x0) at > numpy/core/src/multiarray/lowlevel_strided_loops.c.src:137 > 137 (*((@type@ *)dst)) = @swap@@elsize@(*((@type@ *)src)); > > Debug log attached. Shall I make an issue? http://projects.scipy.org/numpy/ticket/2076 Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Casting rules changed in trunk?
Hi, On Thu, Mar 8, 2012 at 3:14 PM, Matthew Brett wrote: > Hi, > > On Wed, Mar 7, 2012 at 4:08 PM, Matthew Brett wrote: >> Hi, >> >> I noticed a casting change running the test suite on our image reader, >> nibabel: >> https://github.com/nipy/nibabel/blob/master/nibabel/tests/test_casting.py >> >> For this script: >> >> >> import numpy as np >> >> Adata = np.zeros((2,), dtype=np.uint8) >> Bdata = np.zeros((2,), dtype=np.int16) >> Bzero = np.int16(0) >> Bbig = np.int16(256) >> >> print np.__version__ >> print 'Array add', (Adata + Bdata).dtype >> print 'Scalar 0 add', (Adata + Bzero).dtype >> print 'Scalar 256 add', (Adata + Bbig).dtype >> >> >> 1.4.1 >> Array add int16 >> Scalar 0 add uint8 >> Scalar 256 add uint8 >> >> 1.5.1 >> Array add int16 >> Scalar 0 add uint8 >> Scalar 256 add uint8 >> >> 1.6.1 >> Array add int16 >> Scalar 0 add uint8 >> Scalar 256 add int16 >> >> 1.7.0.dev-aae5b0a >> Array add int16 >> Scalar 0 add uint8 >> Scalar 256 add uint16 >> >> I can understand the uint8 outputs from numpy < 1.6 - the rule being >> not to upcast for scalars. >> >> I can understand the int16 output from 1.6.1 on the basis that the >> value is outside uint8 range and therefore we might prefer a type that >> can handle values from both uint8 and int16. >> >> Was the current change intended? 
It has the following odd effect: >> >> In [5]: Adata + np.int16(257) >> Out[5]: array([257, 257], dtype=uint16) >> >> In [7]: Adata + np.int16(-257) >> Out[7]: array([-257, -257], dtype=int16) >> >> In [8]: Adata - np.int16(257) >> Out[8]: array([65279, 65279], dtype=uint16) >> >> but I guess you can argue that there are odd effects for other choices too, > > In case it wasn't clear, this, in numpy 1.6.1: > > In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype > Out[2]: dtype('int16') > > changed to this in current trunk: > > In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype > Out[2]: dtype('uint16') > > which is different still in previous versions of numpy (e.g. 1.4.1): > > In [2]: (np.zeros((2,), dtype=np.uint8) + np.int16(257)).dtype > Out[2]: dtype('uint8') > > My impression had been that the plan was to avoid changes in the > casting rules if possible. > > Was this change in trunk intentional? If not, I am happy to bisect, OK - I will assume it was unintentional and make a bug report, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] float96 on windows32 is float64?
Hi, Am I right in thinking that float96 on windows 32 bit is a float64 padded to 96 bits? If so, is it useful? Has anyone got a windows64 box to check float128?

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.__version__
'1.6.1'
>>> info = np.finfo(np.float96)
>>> print info
Machine parameters for float96
-
precision= 15   resolution= 1e-15
machep= -52     eps= 2.22044604925e-16
negep = -53     epsneg= 1.11022302463e-16
minexp= -16382  tiny= 0.0
maxexp= 16384   max= 1.#INF
nexp = 15       min= -max
-
>>> info.nmant
52

Confirming 52 (+1 implicit) significand bits:

>>> np.float96(2**52)+1
4503599627370497.0
>>> np.float96(2**53)+1
9007199254740992.0

float96 claims 15 exponent bits (nexp above), but in fact it appears to have 11, as does float64:

>>> np.float64(2**1022) * 2
8.9884656743115795e+307
>>> np.float64(2**1022) * 4
__main__:1: RuntimeWarning: overflow encountered in double_scalars
inf
>>> np.float96(2**1022) * 2
8.9884656743115795e+307
>>> np.float96(2**1022) * 4
1.#INF

It does take up 12 bytes (96 bits):

>>> np.dtype(np.float96).itemsize
12

Thanks,

Matthew
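A version-portable way to run the same check is to compare the significand size finfo reports for longdouble against float64: if nmant matches float64's but the itemsize is larger, longdouble is a padded double rather than a genuine extended format. A sketch along the lines of the probing above (the detection logic is my own, not numpy's):

```python
import numpy as np

ld = np.dtype(np.longdouble)
fi = np.finfo(np.longdouble)
d64 = np.finfo(np.float64)

print(ld.itemsize)  # 8, 12 or 16 bytes, depending on platform and compiler
print(fi.nmant)     # 52 would mean float64 precision regardless of the size

# nmant == 52 with itemsize > 8 indicates a padded double, as reported
# above for 32-bit Windows; x86 extended precision reports nmant == 63.
is_padded_double = fi.nmant == d64.nmant and ld.itemsize > 8
print(is_padded_double)

# Empirical cross-check of nmant, as in the 2**52 / 2**53 probing above:
# 2**(nmant+1) + 1 is the first integer the significand cannot hold.
big = np.longdouble(2) ** (fi.nmant + 1)
print(big + 1 == big)  # True: the +1 is rounded away
```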
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 9:17 PM, David Cournapeau wrote: > > > On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett > wrote: >> >> Hi, >> >> Am I right in thinking that float96 on windows 32 bit is a float64 >> padded to 96 bits? > > > Yes > >> >> If so, is it useful? > > > Yes: this is what allows you to use dtype to parse complex binary files > directly in numpy without having to care so much about those details. And > that's how it is defined on windows in any case (C standard only forces you > to have sizeof(long double) >= sizeof(double)). I propose then to rename this one to float64_96 . The nexp value in finfo(np.float96) is incorrect I believe, I'll make a ticket for it. >> Has anyone got a windows64 >> box to check float128 ? > > > Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows > 64. Should you have time to do that, could you confirm it's also a padded float64 and that nexp is still (incorrectly) 15? That would be a great help, Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 9:24 PM, Charles R Harris wrote: > > > On Thu, Mar 15, 2012 at 10:17 PM, David Cournapeau > wrote: >> >> >> >> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett >> wrote: >>> >>> Hi, >>> >>> Am I right in thinking that float96 on windows 32 bit is a float64 >>> padded to 96 bits? >> >> >> Yes >> >>> >>> If so, is it useful? >> >> >> Yes: this is what allows you to use dtype to parse complex binary files >> directly in numpy without having to care so much about those details. And >> that's how it is defined on windows in any case (C standard only forces you >> to have sizeof(long double) >= sizeof(double)). >> >> >>> >>> Has anyone got a windows64 >>> box to check float128 ? >> >> >> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on windows >> 64. >> > > Wait, MSVC doesn't support extended precision, so how do we get doubles > padded to 96 bits? I think MINGW supports extended precision but the MS > libraries won't. Still, if it's doubles it should be 64 bits and float96 > shouldn't exist. Doubles padded to 96 bits are 150% pointless. I think David is arguing that longdouble for MSVC is indeed a 96 bit padded float64 unless I misunderstand him. If we were thinking of trade-offs I suppose one could argue that the confusion and wasted memory of float96 might outweigh the simple ability to read in binary files containing these values, on the basis that one can do it anyway (by using a structured array dtype) and that such files must be very rare in practice. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 9:33 PM, Val Kalatsky wrote: > > I just happened to have an xp64 VM running: > My version of numpy (1.6.1) does not have float128 (see more below what I > get in ipython session). > If you need to test something else please let me know. Thanks a lot - that's helpful. What do you get for: print np.finfo(np.longdouble) ? Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 9:41 PM, Val Kalatsky wrote: > It does look like a joke. > Here is print np.finfo(np.longdouble) > > In [2]: np.__version__ > Out[2]: '1.6.1' > > In [3]: np.flo > np.float np.float32 np.float_ np.floor > np.float16 np.float64 np.floating np.floor_divide > > In [3]: print np.finfo(np.longdouble) > Machine parameters for float64 > - > precision= 15 resolution= 1e-15 > machep= -52 eps= 2.22044604925e-16 > negep = -53 epsneg= 1.11022302463e-16 > minexp= -1022 tiny= 2.22507385851e-308 > maxexp= 1024 max= 1.79769313486e+308 > nexp = 11 min= -max > - Great - much easier on the eye - longdouble is float64 as expected. Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 10:17 PM, Ilan Schnell wrote: > I'm seeing the same thing on both (64 and 32-bit) Windows > EPD test machines. I guess Windows does not support 128 > bit floats. Do you mean there is no float96 on windows 32 bit as I described at the beginning of the thread? > I did some tests a few weeks ago, and discovered that also > on the Mac and Linux long double is not really 128 bits. > If I remember correctly it was 80 bits: 1 (sign) + 15 (exp) + 64 (mantissa) Yes, that's right, on Intel for linux and OSX longdouble is 80 bit precision. Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Thu, Mar 15, 2012 at 10:26 PM, Ilan Schnell wrote: > To be more precise. On both 32-bit and 64-bit Windows > machines I don't see np.float96 or np.float128 Do you have any idea why I am seeing float96 and you are not? I'm on XP with the current sourceforge 1.6.1 exe installer with python.org 2.7 (and same for python.org 2.6 and numpy 1.5.1). Thanks, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] float96 on windows32 is float64?
Hi, On Fri, Mar 16, 2012 at 5:36 AM, wrote: > On Fri, Mar 16, 2012 at 2:10 AM, Ilan Schnell wrote: >> I just did a quick test across all supported EPD platforms: >> win-64: float96 No, float128 No >> win-32: float96 No, float128 No >> osx-64: float96 No, float128 Yes >> osx-32: float96 No, float128 Yes >> rh3-64: float96 No, float128 Yes >> rh3-32: float96 Yes, float128 No >> rh5-64: float96 No, float128 Yes >> rh5-32: float96 Yes, float128 No >> sol-64: float96 No, float128 Yes >> sol-32: float96 Yes, float128 No > > > numpy 1.5.1 MingW, on python 2.6 win32 has float96, float128 no > numpy 1.6.1 Gohlke (MKL I think) on python 3.2 win64 no float96, no float128 > > Josef > >> >> I have no explanation for this, but I'm guessing David C. has. >> I'll look more into this tomorrow. >> >> - Ilan Oh dear - I completely forgot the previous thread that I started on this : http://mail.scipy.org/pipermail/numpy-discussion/2011-November/059233.html You young people, don't laugh, this will happen to you one day. Anyway, summarizing, it appears that windows float96: a) Is stored as an 80 bit extended precision number b) Uses float64 precision for all calculations. c) Is specific to MingW builds of numpy - I think. Perhaps David C you'll correct me if I've got that wrong, Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Build error on OSX from commit 72c6fbd
Hi, As of commit 72c6fbd, I am getting the appended build error on OSX 10.6.8. I couldn't immediately see what might have caused the problem. Cheers, Matthew ... creating build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot compile options: '-DNO_ATLAS_INFO=3 -Inumpy/core/blasdot -Inumpy/core/include -Ibuild/src.macosx-10.3-fat-2.6/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -Ibuild/src.macosx-10.3-fat-2.6/numpy/core/src/multiarray -Ibuild/src.macosx-10.3-fat-2.6/numpy/core/src/umath -c' extra options: '-msse3 -I/System/Library/Frameworks/vecLib.framework/Headers' gcc-4.0: numpy/core/blasdot/_dotblas.c In file included from numpy/core/include/numpy/ndarraytypes.h:1972, from numpy/core/include/numpy/ndarrayobject.h:17, from numpy/core/include/numpy/arrayobject.h:15, from numpy/core/blasdot/_dotblas.c:6: numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" In file included from numpy/core/include/numpy/ndarraytypes.h:1972, from numpy/core/include/numpy/ndarrayobject.h:17, from numpy/core/include/numpy/arrayobject.h:15, from numpy/core/blasdot/_dotblas.c:6: numpy/core/include/numpy/npy_deprecated_api.h:11:2: warning: #warning "Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API" numpy/core/blasdot/_dotblas.c: In function ‘dotblas_matrixproduct’: numpy/core/blasdot/_dotblas.c: In function ‘dotblas_matrixproduct’: numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct pointer types lacks a cast numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of ‘*(PyArray_API + 1120u)’ from incompatible pointer type numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of ‘*(PyArray_API + 1120u)’ from incompatible 
pointer type numpy/core/blasdot/_dotblas.c:239: warning: comparison of distinct pointer types lacks a cast numpy/core/blasdot/_dotblas.c:257: warning: passing argument 3 of ‘*(PyArray_API + 1120u)’ from incompatible pointer type numpy/core/blasdot/_dotblas.c:292: warning: passing argument 3 of ‘*(PyArray_API + 1120u)’ from incompatible pointer type gcc-4.0 -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -g -bundle -undefined dynamic_lookup build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot/_dotblas.o -Lbuild/temp.macosx-10.3-fat-2.6 -o build/lib.macosx-10.3-fat-2.6/numpy/core/_dotblas.so -Wl, -framework -Wl,Accelerate ld: file not found: collect2: ld returned 1 exit status ld: file not found: collect2: ld returned 1 exit status lipo: can't open input file: /var/folders/jg/jgfZ12ZXHwGSFKD85xLpLk+++TI/-Tmp-//ccZil7bP.out (No such file or directory) ld: file not found: collect2: ld returned 1 exit status ld: file not found: collect2: ld returned 1 exit status lipo: can't open input file: /var/folders/jg/jgfZ12ZXHwGSFKD85xLpLk+++TI/-Tmp-//ccZil7bP.out (No such file or directory) error: Command "gcc-4.0 -arch ppc -arch i386 -isysroot /Developer/SDKs/MacOSX10.4u.sdk -g -bundle -undefined dynamic_lookup build/temp.macosx-10.3-fat-2.6/numpy/core/blasdot/_dotblas.o -Lbuild/temp.macosx-10.3-fat-2.6 -o build/lib.macosx-10.3-fat-2.6/numpy/core/_dotblas.so -Wl, -framework -Wl,Accelerate" failed with exit status 1 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Build error on OSX from commit 72c6fbd
Hi, On Sat, Mar 17, 2012 at 2:10 AM, Ralf Gommers wrote: > > > On Sat, Mar 17, 2012 at 9:24 AM, Matthew Brett > wrote: >> >> Hi, >> >> As of commit 72c6fbd, I am getting the appended build error on OSX >> 10.6.8. I couldn't immediately see what might have caused the >> problem. > > > I can't reproduce it, but it should be fixed by > https://github.com/rgommers/numpy/commit/bca298fb3. Can you confirm that? Yes that fixes it - thanks for the quick reply, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] One question about the numpy.linalg.eig() routine
Hi, 2012/4/2 Hongbin Zhang : > Dear Python-users, > > I am currently very confused about the Scipy routine to obtain the > eigenvectors of a complex matrix. > In attached you find two files to diagonalize a 2X2 complex Hermitian > matrix, however, on my computer, > > If I run python, I got: > > [[ 0.80322132+0.j 0.59500941+0.02827207j] > [-0.59500941+0.02827207j 0.80322132+0.j ]] > > If I compile the fortran code, I got: > > ( -0.595009410289, -0.028272068905) ( 0.802316135182, 0.038122316497) > ( -0.803221321796, 0.) ( -0.595680709955, 0.) > > From the scipy webpage, it is said that numpy.linalg.eig() provides nothing > but > an interface to lapack zheevd subroutine, which is used in my fortran code. > > Would somebody be kind to tell me how to get consistent results? I should also point out that matlab and octave give the same answer as your Fortran routine: octave:15> H=[0.6+0.0j, -1.97537668-0.09386068j; -1.97537668+0.09386068j, -0.6+0.0j] H = 0.6 + 0.0i -1.97538 - 0.09386i -1.97538 + 0.09386i -0.6 + 0.0i Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] One question about the numpy.linalg.eig() routine
Hi, On Mon, Apr 2, 2012 at 5:38 PM, Val Kalatsky wrote: > Both results are correct. > There are 2 factors that make the results look different: > 1) The order: the 2nd eigenvector of the numpy solution corresponds to the > 1st eigenvector of your solution, > note that the vectors are written in columns. > 2) The phase: an eigenvector can be multiplied by an arbitrary phase factor > with absolute value = 1. > As you can see this factor is -1 for the 2nd eigenvector > and -0.99887305445887753-0.047461785427773337j for the other one. Thanks for this answer; for my own benefit: Definition: A . v = L . v where A is the input matrix, L is an eigenvalue of A and v is an eigenvector of A. http://en.wikipedia.org/wiki/Eigendecomposition_of_a_matrix In [63]: A = [[0.6+0.0j, -1.97537668-0.09386068j],[-1.97537668+0.09386068j, -0.6+0.0j]] In [64]: L, v = np.linalg.eig(A) In [66]: np.allclose(np.dot(A, v), L * v) Out[66]: True Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
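A small sketch making both of Val's points concrete, using the matrix from earlier in the thread: the columns of v are the eigenvectors, and any unit-modulus phase multiple of a column is an equally valid eigenvector:

```python
import numpy as np

H = np.array([[0.6 + 0.0j, -1.97537668 - 0.09386068j],
              [-1.97537668 + 0.09386068j, -0.6 + 0.0j]])
L, v = np.linalg.eig(H)

# H . v equals v with each column scaled by its eigenvalue:
assert np.allclose(np.dot(H, v), v * L)

# Rotate one eigenvector by an arbitrary phase factor with |phase| = 1;
# it still satisfies the defining equation, which is why numpy, the
# Fortran code and Matlab can all disagree and all be correct:
phase = np.exp(0.3j)
w = v[:, 0] * phase
assert np.allclose(np.dot(H, w), L[0] * w)
```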
Re: [Numpy-discussion] Slice specified axis
Hi, On Fri, Apr 6, 2012 at 1:12 PM, Tony Yu wrote: > > > On Fri, Apr 6, 2012 at 8:54 AM, Benjamin Root wrote: >> >> >> >> On Friday, April 6, 2012, Val Kalatsky wrote: >>> >>> >>> The only slicing short-cut I can think of is the Ellipsis object, but >>> it's not going to help you much here. >>> The alternatives that come to my mind are (1) manipulation of shape >>> directly and (2) building a string and running eval on it. >>> Your solution is better than (1), and (2) is a horrible hack, so your >>> solution wins again. >>> Cheers >>> Val >> >> >> Take a peek at how np.gradient() does it. It creates a list of None with >> a length equal to the number of dimensions, and then inserts a slice object >> in the appropriate spot in the list. >> >> Cheers! >> Ben Root > > > Hmm, it looks like my original implementation wasn't too far off. Thanks for > the tip! Another option: me_first = np.rollaxis(arr, axis) slice = me_first[start:end] slice = np.rollaxis(slice, 0, axis+1) Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
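For reference, a sketch of the np.gradient-style approach Ben describes: build a list of full slices and drop a real slice into the chosen axis (slice_axis is a hypothetical helper name, not a numpy function):

```python
import numpy as np

def slice_axis(arr, start, end, axis):
    # Full slice everywhere, a real slice only along `axis`:
    index = [slice(None)] * arr.ndim
    index[axis] = slice(start, end)
    return arr[tuple(index)]

a = np.arange(24).reshape(2, 3, 4)
assert np.array_equal(slice_axis(a, 1, 3, axis=2), a[:, :, 1:3])
assert np.array_equal(slice_axis(a, 0, 2, axis=1), a[:, 0:2, :])
```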
Re: [Numpy-discussion] speed of append_fields() in numpy.lib.recfunctions vs matplotlib.mlab
Hi, On Fri, Apr 6, 2012 at 3:50 PM, cgraves wrote: > > It seems that the speed of append_fields() in numpy.lib.recfunctions is much > slower than rec_append_fields() in matplotlib.mlab. See the following code: As I remember it (Pierre M can probably correct me) the recfunctions are not ports of the mlab functions, but are considerably extended in order to deal with masking, and do not have exactly the same API. When I noticed this I wondered if there would be some sensible way of making the mlab routines available in a separate namespace, but I did not pursue it. Cheers, matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
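A sketch of the masking point: append_fields returns masked output by default, which is part of the extra machinery (and likely part of the extra cost) relative to the old mlab routine, and usemask=False is needed to get a plain ndarray back:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

a = np.zeros(3, dtype=[('x', float)])
masked = rfn.append_fields(a, 'y', np.ones(3))                 # masked by default
plain = rfn.append_fields(a, 'y', np.ones(3), usemask=False)   # plain ndarray
print(type(masked))
print(plain.dtype.names)  # ('x', 'y')
```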
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant wrote: > I have heard from a few people that they are not excited by the growth of > the NumPy data-structure by the 3 pointers needed to hold the masked-array > storage. This is especially true when there is talk to potentially add > additional attributes to the NumPy array (for labels and other > meta-information). If you are willing to let us know how you feel about > this, please speak up. I guess there are two questions here 1) Will something like the current version of masked arrays have a long term future in numpy, regardless of eventual API? Most likely answer - yes? 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. See y'all, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 6:03 PM, Matthew Brett wrote: > Hi, > > On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant wrote: > >> I have heard from a few people that they are not excited by the growth of >> the NumPy data-structure by the 3 pointers needed to hold the masked-array >> storage. This is especially true when there is talk to potentially add >> additional attributes to the NumPy array (for labels and other >> meta-information). If you are willing to let us know how you feel about >> this, please speak up. > > I guess there are two questions here > > 1) Will something like the current version of masked arrays have a > long term future in numpy, regardless of eventual API? Most likely > answer - yes? > 2) Will likely changes to the masked array API make any difference to > the number of extra pointers? Likely answer no? > > Is that right? > > I have the impression that the masked array API discussion still has > not come out fully into the unforgiving light of discussion day, but > if the answer to 2) is No, then I suppose the API discussion is not > relevant to the 3 pointers change. Sorry, if the answers to 1 and 2 are Yes and No then the API discussion may not be relevant. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 7:46 PM, Travis Oliphant wrote: > > On Apr 16, 2012, at 8:03 PM, Matthew Brett wrote: > >> Hi, >> >> On Mon, Apr 16, 2012 at 3:06 PM, Travis Oliphant wrote: >> >>> I have heard from a few people that they are not excited by the growth of >>> the NumPy data-structure by the 3 pointers needed to hold the masked-array >>> storage. This is especially true when there is talk to potentially add >>> additional attributes to the NumPy array (for labels and other >>> meta-information). If you are willing to let us know how you feel about >>> this, please speak up. >> >> I guess there are two questions here >> >> 1) Will something like the current version of masked arrays have a >> long term future in numpy, regardless of eventual API? Most likely >> answer - yes? > > I think the answer to this is yes, but it could be as a feature-filled > sub-class (like the current numpy.ma, except in C). I'd love to hear that argument fleshed out in more detail - do you have time? >> 2) Will likely changes to the masked array API make any difference to >> the number of extra pointers? Likely answer no? >> >> Is that right? > > The answer to this is very likely no on the Python side. But, on the C-side, > their could be some differences (i.e. are masked arrays a sub-class of the > ndarray or not). > >> >> I have the impression that the masked array API discussion still has >> not come out fully into the unforgiving light of discussion day, but >> if the answer to 2) is No, then I suppose the API discussion is not >> relevant to the 3 pointers change. > > You are correct that the API discussion is separate from this one. > Overall, I was surprised at how fervently people would oppose ABI changes. > As has been pointed out, NumPy and Numeric before it were not really designed > to prevent having to recompile when changes were made. 
I'm still not sure > that a better overall solution is not to promote better availability of > downstream binary packages than excessively worry about ABI changes in NumPy. > But, that is the current climate. The objectors object to any binary ABI change, but not specifically three pointers rather than two or one? Is their point then about ABI breakage? Because that seems like a different point again. Or is it possible that they are in fact worried about the masked array API? > Mark and I will talk about this long and hard. Mark has ideas about where he > wants to see NumPy go, but I don't think we have fully accounted for where > NumPy and its user base *is* and there may be better ways to approach this > evolution. If others are interested in the outcome of the discussion > please speak up (either on the list or privately) and we will make sure your > views get heard and accounted for. I started writing something about this but I guess you'd know what I'd write, so I only humbly ask that you consider whether it might be doing real damage to allow substantial discussion that is not documented or argued out in public. See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant wrote: >>> >>> I think the answer to this is yes, but it could be as a feature-filled >>> sub-class (like the current numpy.ma, except in C). >> >> I'd love to hear that argument fleshed out in more detail - do you have time? > > > My proposal here is to basically take the current github NumPy data-structure > and make this a sub-type (in C) of the NumPy 1.6 data-structure which is > unchanged in NumPy 1.7. > > This would not require removing code but would require another PyTypeObject > and associated structures. I expect Mark could do this work in 2-4 weeks. > We also have other developers who could help in order to get the sub-type in > NumPy 1.7. What kind of details would you like to see? I was dimly thinking of the same questions that Chuck had - about how subclassing would relate to the ufunc changes. > I just think we need more data and uses and this would provide a way to get > that without making a forced decision one way or another. Is the proposal that this would be an alternative API to numpy.ma? Is numpy.ma not itself satisfactory as a test of these uses, because of performance or some other reason? >>>> 2) Will likely changes to the masked array API make any difference to the number of extra pointers? Likely answer no? Is that right? >>> >>> The answer to this is very likely no on the Python side. But, on the C-side, there could be some differences (i.e. are masked arrays a sub-class of the ndarray or not). >>>> I have the impression that the masked array API discussion still has not come out fully into the unforgiving light of discussion day, but if the answer to 2) is No, then I suppose the API discussion is not relevant to the 3 pointers change. >>> >>> You are correct that the API discussion is separate from this one. >>> Overall, I was surprised at how fervently people would oppose ABI changes. 
>>> As has been pointed out, NumPy and Numeric before it were not really >>> designed to prevent having to recompile when changes were made. I'm still >>> not sure that a better overall solution is not to promote better >>> availability of downstream binary packages than excessively worry about ABI >>> changes in NumPy. But, that is the current climate. >> >> The objectors object to any binary ABI change, but not specifically >> three pointers rather than two or one? > > Adding pointers is not really an ABI change (but removing them after they > were there would be...) It's really just the addition of data to the NumPy > array structure that they aren't going to use. Most of the time it would not > be a real problem (the number of use-cases where you have a lot of small > NumPy arrays is small), but when it is a problem it is very annoying. > >> >> Is their point then about ABI breakage? Because that seems like a >> different point again. > > Yes, it's not that. > >> >> Or is it possible that they are in fact worried about the masked array API? > > I don't think most people whose opinion would be helpful are really tuned in > to the discussion at this point. I think they just want us to come up with > an answer and then move forward. But, they will judge us based on the > solution we come up with. > >> >>> Mark and I will talk about this long and hard. Mark has ideas about where >>> he wants to see NumPy go, but I don't think we have fully accounted for >>> where NumPy and its user base *is* and there may be better ways to approach >>> this evolution. If others are interested in the outcome of the >>> discussion please speak up (either on the list or privately) and we will >>> make sure your views get heard and accounted for. 
>> >> I started writing something about this but I guess you'd know what I'd >> write, so I only humbly ask that you consider whether it might be >> doing real damage to allow substantial discussion that is not >> documented or argued out in public. > It will be documented and argued in public. We are just going to have one > off-list conversation to try and speed up the process. You make a valid > point, and I appreciate the perspective. Please speak up again after > hearing the report if something is not clear. I don't want this to even > have the appearance of a "back-room" deal. > > Mark and I will have conversations about NumPy while he is in Austin. There > are many other active stake-holders whose opinions and views are essential > for major changes. Mark and I are working on other things besides just > NumPy and all NumPy changes will be discussed on list and require consensus > or super-majority for NumPy itself to change. I'm not sure if that helps. > Is there more we can do? As you might have heard me say before, my concern is that it has not been easy to have good discussions on this list. I think the problem has been that it has not been clear what the culture was, and how decisions got made, and that had led to some uncomfortable and unhelpful discussions. My plea would be for you as BDF$N to strongly encourage on-list discussions and discourage off-list discussions as far as possible, and to help us make the difficult public effort to bash out the arguments to clarity and consensus. I know that's a big ask. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Tue, Apr 17, 2012 at 7:24 AM, Nathaniel Smith wrote: > On Tue, Apr 17, 2012 at 5:59 AM, Matthew Brett > wrote: >> Hi, >> >> On Mon, Apr 16, 2012 at 8:40 PM, Travis Oliphant wrote: >>> Mark and I will have conversations about NumPy while he is in Austin. >>> There are many other active stake-holders whose opinions and views are >>> essential for major changes. Mark and I are working on other things >>> besides just NumPy and all NumPy changes will be discussed on list and >>> require consensus or super-majority for NumPy itself to change. I'm not >>> sure if that helps. Is there more we can do? >> >> As you might have heard me say before, my concern is that it has not >> been easy to have good discussions on this list. I think the problem >> has been that it has not been clear what the culture was, and how >> decisions got made, and that had led to some uncomfortable and >> unhelpful discussions. My plea would be for you as BDF$N to strongly >> encourage on-list discussions and discourage off-list discussions as >> far as possible, and to help us make the difficult public effort to >> bash out the arguments to clarity and consensus. I know that's a big >> ask. > > Hi Matthew, > > As you know, I agree with everything you just said :-). So in interest > of transparency, I should add: I have been in touch with Travis some > off-list, and the main topic has been how to proceed in a way that > lets us achieve public consensus. I'm glad to hear that discussion is happening, but please do have it on list. If it's off list it is easy for people to feel they are being bypassed, and that the public discussion is not important. So, yes, you might get a better outcome for this specific case, but a worse outcome in the long term, because the list will start to feel that it's for signing off or voting rather than discussion, and that - I feel sure - would lead to worse decisions. 
The other issue is that there's a reason you are having the discussion off-list - which is that it was getting difficult on-list. But - again - a personal view - that really has to be addressed directly by setting out the rules of engagement and modeling the kind of discussion we want to have. Cheers, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
Hi, On Tue, Apr 17, 2012 at 12:04 PM, Fernando Perez wrote: > On Tue, Apr 17, 2012 at 11:40 AM, Matthew Brett > wrote: >> I'm glad to hear that discussion is happening, but please do have it >> on list. If it's off list it easy for people to feel they are being >> bypassed, and that the public discussion is not important. > > I'm afraid I have to disagree: you seem to be proposing an absolute, > 'zero-tolerance'-style policy against any off-list discussion. The > only thing ZT policies achieve is to remove common sense and human > judgement from a process, invariably causing more harm than they do > good, no matter how well intentioned. Right - but that would be an absurd overstatement of what I said. There's no point in addressing something I didn't say and no sensible person would think. Indeed, it makes the discussion harder. It's just exhausting to have to keep stating the obvious. Of course discussions happen off-list. Of course sometimes that has to happen. Of course that can be a better and quicker way of having discussions. > Let's try to trust for one minute that the actual decisions will be > made here with solid debate and project-wide input, and seek change > only if we have evidence that this isn't happening (not evidence of a > meta-problem that isn't a problem here). However, in this case the meta-problem that is a real problem is that we've shown ourselves that we are not currently good at having discussions on list. There are clearly reasons for that, and also clearly, they can be addressed. The particular point I am making is neither silly nor extreme nor vapid. It is simply that, in order to make discussion work better on the list, it is in my view better to make an explicit effort to make the discussions - explicit. Yours in Bay Area opposition, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Removing masked arrays for 1.7? (Was 1.7 blockers)
On Tue, Apr 17, 2012 at 12:32 PM, Fernando Perez wrote: > On Tue, Apr 17, 2012 at 12:10 PM, Matthew Brett > wrote: >> Right - but that would be an absurd overstatement of what I said. >> There's no point in addressing something I didn't say and no sensible >> person would think. Indeed, it makes the discussion harder. > > Well, in that case neither Eric Firing nor I are 'sensible persons', > since that's how we both understood what you said (Eric's email > appeared to me as a more concise/better phrased version of the same > points I was making). You said: > > """ > I'm glad to hear that discussion is happening, but please do have it > on list. If it's off list it easy for people to feel they are being > bypassed, and that the public discussion is not important. > """ > > I don't think it's an 'absurd overstatement' to interpret that as > "don't have discussions off-list", but hey, it may just be me :) The absurd over-statement is the following: " I'm afraid I have to disagree: you seem to be proposing an absolute, 'zero-tolerance'-style policy against any off-list discussion. " >> meta-problem that is a real problem is that we've shown ourselves that >> we are not currently good at having discussions on list. > > Oh, I know that did happen in the past regarding this very topic (the > big NA mess last summer); what I meant was to try and trust that *this > time around* things might be already moving in a better direction, > which it seems to me they are. It seems to me that everyone is > genuinely trying to tackle the discussion/consensus questions head-on > right on the list, and that's why I proposed waiting to see if there > were really any problems before asking Nathaniel not to have any > discussion off-list (esp. since we have no evidence that what they > talked about had any impact on any decisions bypassing the open > forum). 
The question - which seems to me to be sensible, rational, and important - is how to get better at on-list discussion, and whether taking this particular discussion mainly off-list is good or bad in that respect. See you, Matthew
[Numpy-discussion] Casting rules - an awkward case
Hi, I just wanted to point out a situation where the scalar casting rules can be a little confusing: In [113]: a - np.int16(128) Out[113]: array([-256, -1], dtype=int16) In [114]: a + np.int16(-128) Out[114]: array([ 0, -1], dtype=int8) This is predictable from the nice docs here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html but I offer it only as a speedbump I hit. On the other hand I didn't find it easy to predict what numpy 1.5.1 was going to do: In [31]: a - np.int16(1) Out[31]: array([127, 126], dtype=int8) In [32]: a + np.int16(-1) Out[32]: array([-129, 126], dtype=int16) As a matter of interest, what was the rule for 1.5.1? See you, Matthew
Re: [Numpy-discussion] Casting rules - an awkward case
Oops, sorry, Keith Goodman kindly pointed out that I had missed out: On Wed, Apr 18, 2012 at 11:03 AM, Matthew Brett wrote: > Hi, > > I just wanted to point out a situation where the scalar casting rules > can be a little confusing: In [110]: a = np.array([-128, 127], dtype=np.int8) > In [113]: a - np.int16(128) > Out[113]: array([-256, -1], dtype=int16) > > In [114]: a + np.int16(-128) > Out[114]: array([ 0, -1], dtype=int8) > > This is predictable from the nice docs here: > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html > > but I offer it only as a speedbump I hit. > > On the other hand I didn't find it easy to predict what numpy 1.5.1 > was going to do: > > In [31]: a - np.int16(1) > Out[31]: array([127, 126], dtype=int8) > > In [32]: a + np.int16(-1) > Out[32]: array([-129, 126], dtype=int16) > > As a matter of interest, what was the rule for 1.5.1? Matthew
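[Editorial note] The thread above hinges on NumPy's value-based scalar casting rules. The following is a small sketch of the two cases Matthew describes, not part of the original exchange. Note that the int8 result for the second case reflects the 1.6-era value-based rules quoted in the thread; the NEP 50 rules adopted in NumPy 2.0 changed that case, so only the first result is stable across versions.

```python
import numpy as np

# The array from Matthew's follow-up message.
a = np.array([-128, 127], dtype=np.int8)

# 128 is out of range for int8, so the operation is carried out in int16.
# This holds both under 1.6-era value-based casting and under NEP 50
# (NumPy 2.0+), where the scalar's own dtype, int16, takes part in
# promotion regardless of its value.
r1 = a - np.int16(128)
print(r1.dtype, r1.tolist())  # int16 [-256, -1]

# -128 fits in int8, so 1.6-era value-based casting kept the result in
# int8 (with wraparound: -128 + -128 -> 0); under NEP 50 the result is
# int16 instead, so the dtype printed here depends on the NumPy version.
r2 = a + np.int16(-128)
print(r2.dtype)
```

As the thread notes, `np.result_type` documents the promotion that applies in each case, so it is the quickest way to check what a given NumPy version will do.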
Re: [Numpy-discussion] A 1.6.2 release?
Hi, On Fri, Apr 20, 2012 at 11:04 AM, Charles R Harris wrote: > Hi All, > > Given the amount of new stuff coming in 1.7 and the slip in its schedule, I > wonder if it would be worth putting out a 1.6.2 release with fixes for > einsum, ticket 1578, perhaps some others. My reasoning is that the fall > releases of Fedora, Ubuntu are likely to still use 1.6 and they might as > well use a somewhat fixed up version. The downside is that locating and > backporting fixes is likely to be a fair amount of work. A 1.7 release would > be preferable, but I'm not sure when we can make that happen. Also, I believe Debian will very soon freeze "testing" in order to prepare to release the next stable. See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Sun, Apr 22, 2012 at 3:15 PM, Nathaniel Smith wrote: > If you hang around big FOSS projects, you'll see the word "consensus" > come up a lot. For example, the glibc steering committee recently > dissolved itself in favor of governance "directly by the consensus of > the people active in glibc development"[1]. It's the governing rule of > the IETF, which defines many of the most important internet > standards[2]. It is the "primary way decisions are made on > Wikipedia"[3]. It's "one of the fundamental aspects of accomplishing > things within the Apache framework"[4]. > > [1] https://lwn.net/Articles/488778/ > [2] https://www.ietf.org/tao.html#getting.things.done > [3] https://en.wikipedia.org/wiki/Wikipedia:Consensus > [4] https://www.apache.org/foundation/voting.html I think the big problem here is that Chuck (I hope I'm not misrepresenting him) is not interested in discussion of process, and the last time we had a specific thread on governance, Travis strongly implied he was not very interested either, at least at the time. In that situation, there's rather a high threshold to pass before getting involved in the discussion, and I think you're seeing some evidence for that. So, as before, and as we discussed on gchat :) - whether this discussion can go anywhere depends on Travis. Travis - what do you think? See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Mon, Apr 23, 2012 at 12:33 PM, Nathaniel Smith wrote: > On Mon, Apr 23, 2012 at 1:04 AM, Charles R Harris > wrote: >> Linux is Linus' private tree. Everything that goes in is his decision, >> everything that stays out is his decision. Of course, he delegates much of >> the work to people he trusts, but it doesn't even reach the level of a BDFL, >> it's DFL. As for consensus, it basically comes down to convincing the >> gatekeepers one level below Linus that your code might be useful. So bad >> example. Same with TCP/IP, which was basically Kahn and Cerf consulting with >> a few others and working by request of DARPA. GCC was Richard Stallman (I >> got one of the first tapes for a $30 donation), Python was Guido. Some of >> the projects later developed some form of governance but Guido, for >> instance, can veto anything he dislikes even if he is disinclined to do so. >> I'm not saying you're wrong about open source, I'm just saying that that >> each project differs and it is wrong to imply that they follow some common >> form of governance under the rubric FOSS and that they all seek consensus. >> And they certainly don't *start* that way. And there are also plenty of >> projects that fail when the prime mover loses interest or folks get tired of >> the politics. [snip] > Linux: Technically, everything you say is true. In practice, good luck > convincing Linus or a subsystem maintainer to accept your patch when > other people are raising substantive complaints. Here's an email I > googled up in a few moments, in which Linus yells at people for trying > to submit a patch to him without making sure that all interested > parties have agreed: > https://lkml.org/lkml/2009/9/14/481 > Stuff regularly sits outside the kernel tree in limbo for *years* > while people debate different approaches back and forth. 
To which I'd add: "In fact, for [Linus'] decisions to be received as legitimate, they have to be consistent with the consensus of the opinions of participating developers as manifest on Linux mailing lists. It is not unusual for him to back down from a decision under the pressure of criticism from other developers. His position is based on the recognition of his fitness by the community of Linux developers and this type of authority is, therefore, constantly subject to withdrawal. His role is not that of a boss or a manager in the usual sense. In the final analysis, the direction of the project springs from the cumulative synthesis of modifications contributed by individual developers." http://shareable.net/blog/governance-of-open-source-george-dafermos-interview See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Mon, Apr 23, 2012 at 3:08 PM, Travis Oliphant wrote: >> >>> Linux: Technically, everything you say is true. In practice, good luck >>> convincing Linus or a subsystem maintainer to accept your patch when >>> other people are raising substantive complaints. Here's an email I >>> googled up in a few moments, in which Linus yells at people for trying >>> to submit a patch to him without making sure that all interested >>> parties have agreed: >>> https://lkml.org/lkml/2009/9/14/481 >>> Stuff regularly sits outside the kernel tree in limbo for *years* >>> while people debate different approaches back and forth. >> >> To which I'd add: >> >> "In fact, for [Linus'] decisions to be received as legitimate, they >> have to be consistent with the consensus of the opinions of >> participating developers as manifest on Linux mailing lists. It is not >> unusual for him to back down from a decision under the pressure of >> criticism from other developers. His position is based on the >> recognition of his fitness by the community of Linux developers and >> this type of authority is, therefore, constantly subject to >> withdrawal. His role is not that of a boss or a manager in the usual >> sense. In the final analysis, the direction of the project springs >> from the cumulative synthesis of modifications contributed by >> individual developers." >> http://shareable.net/blog/governance-of-open-source-george-dafermos-interview >> > > This is the model that I have for NumPy development. It is my view of how > NumPy has evolved already and how Numarray, and Numeric evolved before it as > well. I also feel like these things are fundamentally determined by the > people involved and by the personalities and styles of those who participate. > There certainly are globally applicable principles (like code review, > building consensus, and mutual respect) that are worth emphasizing over and > over again. 
If it helps let's write those down and say "these are the > principles we live by". I am suspicious that you can go beyond this in > formalizing the process as you ultimately are at the mercy of the people > involved and their judgment, anyway. I think writing it down would help enormously. For example, if you do agree to Nathaniel's view of consensus - *in principle* - and we write that down and agree, we have a document to appeal to when we next run into trouble. Maybe the document could say something like: """ We strive for consensus [some refs here]. Any substantial new feature is subject to consensus. Only if all avenues for consensus have been documented, and exhausted, will we [vote, defer to Travis, or some other tie-breaking thing]. """ Best, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 6:14 AM, Charles R Harris wrote: > > > On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez > wrote: >> >> On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt >> wrote: >> > If you are referring to the traditional concept of a fork, and not to >> > the type we frequently make on GitHub, then I'm surprised that no one >> > has objected already. What would a fork solve? To paraphrase the >> > regexp saying: after forking, we'll simply have two problems. >> >> I concur with you here: github 'forks', yes, as many as possible! >> Hopefully every one of those will produce one or more PRs :) But a >> fork in the sense of a divergent parallel project? I think that would >> only be indicative of a complete failure to find a way to make >> progress here, and I doubt we're anywhere near that state. >> >> That forks are *possible* is indeed a valuable and important option in >> open source software, because it means that a truly dysfunctional >> original project team/direction can't hold a community hostage >> forever. But that doesn't mean that full-blown forks should be >> considered lightly, as they also carry enormous costs. >> >> I see absolutely nothing in the current scenario to even remotely >> consider that a full-blown fork would be a good idea, and I hope I'm >> right. It seems to me we're making progress on problems that led to >> real difficulties last year, but from multiple parties I see signs >> that give me reason to be optimistic that the project is getting >> better, not worse. >> > > We certainly aren't there at the moment, but I can see us heading that way. > But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. Since > then datetime, NA, polynomial work, and various other enhancements have gone > in along with some 280 bug fixes. The major technical problem blocking a 1.7 > release is getting datetime working reliably on windows. So I think that is > where the short term effort needs to be. 
Meanwhile, we are spending effort > to get out a 1.6.2 just so people can work with a stable version with some > of the bug fixes, and potentially we will spend more time and effort to pull > out the NA code. In the future there may be a transition to C++ and > eventually a break with the current ABI. Or not. > > There are at least two motivations that get folks to write code for open > source projects, scratching an itch and money. Money hasn't been a big part > of the Numpy picture so far, so that leaves scratching an itch. One of the > attractions of Numpy is that it is a small project, BSD licensed, and not > overburdened with governance and process. This makes scratching an itch not > as difficult as it would be in a large project. If Numpy remains a small > project but acquires the encumbrances of a big project much of that > attraction will be lost. Momentum and direction also attracts people, but > numpy is stalled at the moment as the whole NA thing circles around once > again. I think your assumptions are incorrect, although I have seen them before. No stated process leads to less encumbrance if and only if the implicit process works. It clearly doesn't work, precisely because the NA thing is circling round and round again. And the governance discussion. And previously the ABI breakage discussion. If you are on other mailing lists, as I'm sure you are, you'll see that this does not happen to - say - Cython, or Sympy. In particular, I have not seen, on those lists, the current numpy way of simply blocking or avoiding discussion. Everything is discussed out to agreement, or at least until all parties accept the way forward. At the moment, the only hope I could imagine for the 'no governance is good governance' method, is that all those who don't agree would just shut up. It would be more peaceful, but for the reasons stated by Nathaniel, I think that would be a very bad outcome.
Best, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris wrote: > > > 2012/4/24 Stéfan van der Walt >> >> On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris >> wrote: >> > The advantage of nans, I suppose, is that they are in the hardware and >> > so >> >> Why are we having a discussion on NAN's in a thread on consensus? >> This is a strong indicator of the problem we're facing. >> > > We seem to have a consensus regarding interest in the topic. This email is mainly to Travis. This thread seems to be dying, condemning us to keep repeating the same conversation with no result. Chuck has made it clear he is not interested in this conversation. Until it is clear you are interested in this conversation, it will keep dying. As you know, I think that will be very bad for numpy, and, as you know, I care a great deal about that. So, please, if you care about this, and agree that something should be done, please, say so, and if you don't agree something should be done, say so. It can't get better without your help. See you, Matthew
Re: [Numpy-discussion] What is consensus anyway
Hi, On Tue, Apr 24, 2012 at 6:12 PM, Charles R Harris wrote: > > > On Tue, Apr 24, 2012 at 6:56 PM, Nathaniel Smith wrote: >> >> On Tue, Apr 24, 2012 at 2:14 PM, Charles R Harris >> wrote: >> > >> > >> > On Mon, Apr 23, 2012 at 11:35 PM, Fernando Perez >> > wrote: >> >> >> >> On Mon, Apr 23, 2012 at 8:49 PM, Stéfan van der Walt >> >> wrote: >> >> > If you are referring to the traditional concept of a fork, and not to >> >> > the type we frequently make on GitHub, then I'm surprised that no one >> >> > has objected already. What would a fork solve? To paraphrase the >> >> > regexp saying: after forking, we'll simply have two problems. >> >> >> >> I concur with you here: github 'forks', yes, as many as possible! >> >> Hopefully every one of those will produce one or more PRs :) But a >> >> fork in the sense of a divergent parallel project? I think that would >> >> only be indicative of a complete failure to find a way to make >> >> progress here, and I doubt we're anywhere near that state. >> >> >> >> That forks are *possible* is indeed a valuable and important option in >> >> open source software, because it means that a truly dysfunctional >> >> original project team/direction can't hold a community hostage >> >> forever. But that doesn't mean that full-blown forks should be >> >> considered lightly, as they also carry enormous costs. >> >> >> >> I see absolutely nothing in the current scenario to even remotely >> >> consider that a full-blown fork would be a good idea, and I hope I'm >> >> right. It seems to me we're making progress on problems that led to >> >> real difficulties last year, but from multiple parties I see signs >> >> that give me reason to be optimistic that the project is getting >> >> better, not worse. >> >> >> > >> > We certainly aren't there at the moment, but I can see us heading that >> > way. >> > But let's back up a bit. Numpy 1.6.0 came out just about 1 year ago. 
>> > Since >> > then datetime, NA, polynomial work, and various other enhancements have >> > gone >> > in along with some 280 bug fixes. The major technical problem blocking a >> > 1.7 >> > release is getting datetime working reliably on windows. So I think that >> > is >> > where the short term effort needs to be. Meanwhile, we are spending >> > effort >> > to get out a 1.6.2 just so people can work with a stable version with >> > some >> > of the bug fixes, and potentially we will spend more time and effort to >> > pull >> > out the NA code. In the future there may be a transition to C++ and >> > eventually a break with the current ABI. Or not. >> > >> > There are at least two motivations that get folks to write code for open >> > source projects, scratching an itch and money. Money hasn't been a big >> > part >> > of the Numpy picture so far, so that leaves scratching an itch. One of >> > the >> > attractions of Numpy is that it is a small project, BSD licensed, and >> > not >> > overburdened with governance and process. This makes scratching an itch >> > not >> > as difficult as it would be in a large project. If Numpy remains a small >> > project but acquires the encumbrances of a big project much of that >> > attraction will be lost. Momentum and direction also attracts people, >> > but >> > numpy is stalled at the moment as the whole NA thing circles around once >> > again. >> >> I don't think we need a fork, or to start maintaining separate stable >> and unstable trees, or any of the other complicated process changes >> that have been suggested. There are tons of projects that routinely >> make much bigger changes than we're talking about, and they do it >> without needing that kind of overhead. 
I know that these suggestions >> are all made in good faith, but they remind me of a line from that >> Apache page I linked earlier: "People tend to avoid conflict and >> thrash around looking for something to substitute - somebody in >> charge, a rule, a process, stagnation. None of these tend to be very >> good substitutes for doing the hard work of resolving the conflict." >> >> I also think if you talk to potential contributors, you'll find that >> clear, simple processes and a history of respecting everyone's input >> are much more attractive than a no-rules free-for-all. Good >> engineering practices are not an "encumbrance". Resolving conflicts >> before merging is a good engineering practice. >> >> What happened with the NA discussion is this: >> - There was substantial disagreement about whether NEP-style masks, >> or indeed, focusing on a mask-based implementation *at all*, was the >> best way forward. >> - There was also a perceived time constraint, that we had to either >> implement something immediately while Mark was there, or have nothing. >> >> So in the end, the latter concern outweighed the former, the >> discussion was cut off, and Mark's best guess at an API was merged >> into master. I totally understand how this decision made sense at the >> time, but the result is what we see now: it's