Re: [Numpy-discussion] The end of numpy as we know it ?

2012-02-18 Thread Benjamin Root
On Saturday, February 18, 2012, Sturla Molden wrote:

>
>
> On 18 Feb 2012, at 17:12, Alan G Isaac wrote:
>
> >
> >
> > How does "stream-lined" code written for maintainability
> > (i.e., with helpful comments and tests) become *less*
> > accessible to amateurs??
>
>
> I think you missed the irony.
>
> Sturla


Took me a couple of reads.  Must be too early in the morning for me.

For those who need a clue, the last few lines seem to suggest that the
only way forward is to relicense numpy so that it can be sold.  This is
obviously ridiculous and a giveaway that everything else in the email was
sarcastic.

Ben Root


Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-18 Thread Benjamin Root
On Sat, Feb 18, 2012 at 2:45 PM, Charles R Harris  wrote:

>
>
> On Sat, Feb 18, 2012 at 1:39 PM, Matthew Brett wrote:
>
>> Hi,
>>
>> On Sat, Feb 18, 2012 at 12:35 PM, Charles R Harris
>>  wrote:
>> >
>> >
>> > On Sat, Feb 18, 2012 at 12:21 PM, Matthew Brett wrote:
>> >>
>> >> Hi.
>> >>
>> >> On Sat, Feb 18, 2012 at 12:18 AM, Christopher Jordan-Squire
>> >>  wrote:
>> >> > On Fri, Feb 17, 2012 at 11:31 PM, Matthew Brett
>> >> >  wrote:
>> >> >> Hi,
>> >> >>
>> >> >> On Fri, Feb 17, 2012 at 10:18 PM, Christopher Jordan-Squire
>> >> >>  wrote:
>> >> >>> On Fri, Feb 17, 2012 at 8:30 PM, Sturla Molden 
>> >> >>> wrote:
>> >> 
>> >> 
>> >>  On 18 Feb 2012, at 05:01, Jason Grout wrote:
>> >> 
>> >> > On 2/17/12 9:54 PM, Sturla Molden wrote:
>> >> >> We would have to write a C++ programming tutorial that is based
>> >> >> on Python knowledge instead of C knowledge.
>> >> >
>> >> > I personally would love such a thing.  It's been a while since I
>> did
>> >> > anything nontrivial on my own in C++.
>> >> >
>> >> 
>> >>  One example: How do we code multiple return values?
>> >> 
>> >>  In Python:
>> >>  - Return a tuple.
>> >> 
>> >>  In C:
>> >>  - Use pointers (evilness)
>> >> 
>> >>  In C++:
>> >>  - Return a std::tuple, as you would in Python.
>> >>  - Use references, as you would in Fortran or Pascal.
>> >>  - Use pointers, as you would in C.
>> >> 
>> >>  C++ textbooks always pick the last...
>> >> 
>> >>  I would show the first and the second method, and perhaps
>> >>  intentionally forget the last.
>> >> 
>> >>  Sturla
>> >> 
>> >> >>
>> >> >>> On the flip side, cython looked pretty...but I didn't get the
>> >> >>> performance gains I wanted, and had to spend a lot of time figuring
>> >> >>> out if it was cython, needing to add types, buggy support for
>> >> >>> numpy, or actually the algorithm.
>> >> >>
>> >> >> At the time, was the numpy support buggy?  I personally haven't had
>> >> >> many problems with Cython and numpy.
>> >> >>
>> >> >
>> >> > It's not that the support WAS buggy, it's that it wasn't clear to me
>> >> > what was going on and where my performance bottleneck was. Even after
>> >> > microbenchmarking with ipython, using timeit and prun, and using the
>> >> > cython code visualization tool. Ultimately I don't think it was
>> >> > cython, so perhaps my comment was a bit unfair. But it was
>> >> > unfortunately difficult to verify that. Of course, as you say,
>> >> > diagnosing and solving such issues would become easier to resolve
>> >> > with more cython experience.
>> >> >
>> >> >>> The C files generated by cython were
>> >> >>> enormous and difficult to read. They really weren't meant for human
>> >> >>> consumption.
>> >> >>
>> >> >> Yes, it takes some practice to get used to what Cython will do, and
>> >> >> how to optimize the output.
>> >> >>
>> >> >>> As Sturla has said, regardless of the quality of the
>> >> >>> current product, it isn't stable.
>> >> >>
>> >> >> I've personally found it more or less rock solid.  Could you say
>> >> >> what you mean by "it isn't stable"?
>> >> >>
>> >> >
>> >> > I just meant what Sturla said, nothing more:
>> >> >
>> >> > "Cython is still 0.16, it is still unfinished. We cannot base NumPy
>> on
>> >> > an unfinished compiler."
>> >>
>> >> Y'all mean, it has a zero at the beginning of the version number and
>> >> it is still adding new features?  Yes, that is correct, but it seems
>> >> more reasonable to me to phrase that as 'active development' rather
>> >> than 'unstable', because they take considerable care to be backwards
>> >> compatible, have a large automated Cython test suite, and a major
>> >> stress-tester in the Sage test suite.
>> >>
>> >
>> > Matthew,
>> >
>> > No one in their right mind would build a large performance library using
>> > Cython, it just isn't the right tool. For what it was designed for -
>> > wrapping existing c code or writing small and simple things close to
>> > Python - it does very well, but it was never designed for making core C/C++
>> > libraries and in that role it just gets in the way.
>>
>> I believe the proposal is to refactor the lowest levels in pure C and
>> move some or most of the library superstructure to Cython.
>>
>
> Go for it.
>
> Chuck
>
>
>
Just a couple of quick questions:

1.) What is the status of the refactoring that was done for IronPython a
couple of years ago?  The last I heard, the branches diverged too much for
merging the work back into numpy.  Are there lessons that can be learned
from that experience that can be applied to whatever happens next?

2.) My personal preference is an incremental refactor over to C++ using the
STL; however, I have to be realistic.  First, the exception issue is
problematic (unsolvable? I don't know). Second, one of Numpy/Scipy's
greatest strengths is the relative ease 

Re: [Numpy-discussion] Proposed Roadmap Overview

2012-02-18 Thread Benjamin Root
On Saturday, February 18, 2012, Matthew Brett wrote:

> Hi,
>
> On Sat, Feb 18, 2012 at 8:38 PM, Travis Oliphant wrote:
>
> > We will need to see examples of what Mark is talking about and clarify
> > some of the compiler issues.   Certainly there is some risk that once code is
> > written that it will be tempting to just use it.   Other approaches are
> > certainly worth exploring in the mean-time, but C++ has some strong
> > arguments for it.
>
> The worry as I understand it is that a C++ rewrite might make the
> numpy core effectively a read-only project for anyone but Mark.  Do
> you have any feeling for whether that is likely?
>
>
Dude, have you seen the .c files in numpy/core? They are already read-only
for pretty much everybody but Mark.

All kidding aside, is your concern that when Mark starts this, no one
will be able to contribute until he is done? I can tell you right now that
won't be the case, as I will be trying to flesh out issues with datetime64
with him.

Ben Root


Re: [Numpy-discussion] Possible roadmap addendum: building better text file readers

2012-02-23 Thread Benjamin Root
On Thu, Feb 23, 2012 at 3:14 PM, Robert Kern  wrote:

> On Thu, Feb 23, 2012 at 21:09, Gael Varoquaux
>  wrote:
> > On Thu, Feb 23, 2012 at 04:07:04PM -0500, Wes McKinney wrote:
> >> In this last case for example, around 500 MB of RAM is taken up for an
> >> array that should only be about 80-90MB. If you're a data scientist
> >> working in Python, this is _not good_.
> >
> > But why, oh why, are people storing big data in CSV?
>
> Because everyone can read it. It's not so much "storage" as "transmission".
>
>
Because their labmate/officemate/advisor is using Excel...

Ben Root


Re: [Numpy-discussion] bincount([], minlength=2) should work right?

2012-02-25 Thread Benjamin Root
On Saturday, February 25, 2012, Alan G Isaac wrote:

> On 2/25/2012 4:44 PM, James Bergstra wrote:
> > bincount([]) makes no sense,
>
> I disagree:
> http://permalink.gmane.org/gmane.comp.python.numeric.general/42041
>
>
> > but if a minlength argument is provided,
> > then the routine should succeed.
>
> Definitely!
>
> Alan Isaac


I thought we already fixed this? Or was that only for histogram?
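
For concreteness, the behavior being asked for (cf. ticket #1387 in the
1.6.2 release notes later in this digest) would look like this:

    import numpy as np

    # An empty input with minlength should yield an all-zero count vector
    # rather than raising -- assuming the fix is in your numpy version.
    counts = np.bincount(np.array([], dtype=np.intp), minlength=2)
    # counts -> array([0, 0])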

Ben Root


Re: [Numpy-discussion] Floating point "close" function?

2012-03-03 Thread Benjamin Root
On Saturday, March 3, 2012, Robert Kern  wrote:
> On Sat, Mar 3, 2012 at 14:31, Ralf Gommers wrote:
>>
>>
>> On Sat, Mar 3, 2012 at 3:05 PM, Robert Kern wrote:
>>>
>>> On Sat, Mar 3, 2012 at 13:59, Ralf Gommers 
>>> wrote:
>>> >
>>> >
>>> > On Thu, Mar 1, 2012 at 11:44 PM, Joe Kington wrote:
>>> >>
>>> >> Is there a numpy function for testing floating point equality that
>>> >> returns
>>> >> a boolean array?
>>> >>
>>> >> I'm aware of np.allclose, but I need a boolean array.  Properly
>>> >> handling
>>> >> NaN's and Inf's (as allclose does) would be a nice bonus.
>>> >>
>>> >> I wrote the function below to do this, but I suspect there's a method
>>> >> in
>>> >> numpy that I missed.
>>> >
>>> >
>>> > I don't think such a function exists, would be nice to have. How about
>>> > just
>>> > adding a keyword "return_array" to allclose to do so?
>>>
>>> As a general design principle, adding a boolean flag that changes the
>>> return type is worse than making a new function.
>>
>>
>> That's certainly true as a general principle. Do you have a concrete
>> suggestion in this case though?
>
> np.close()
>

When I read that, I mentally think of "close" as in closing a file.  I
think we need a synonym.
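
For concreteness, a rough sketch of the kind of elementwise comparison
being discussed, whatever it ends up being called (the name below is made
up; the tolerance logic mirrors allclose):

    import numpy as np

    def isclose_elementwise(a, b, rtol=1e-5, atol=1e-8):
        # Hypothetical elementwise analogue of np.allclose.  The tolerance
        # test alone mishandles infs (inf - inf is nan), so exact equality
        # is OR-ed in; NaNs still compare unequal, as with allclose.
        a, b = np.asanyarray(a), np.asanyarray(b)
        return (np.abs(a - b) <= atol + rtol * np.abs(b)) | (a == b)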

Ben Root


Re: [Numpy-discussion] all elements equal

2012-03-05 Thread Benjamin Root
On Mon, Mar 5, 2012 at 1:44 PM, Keith Goodman  wrote:

> On Mon, Mar 5, 2012 at 11:36 AM,   wrote:
> > How about numpy.ptp, to follow this line? I would expect it's single
> > pass, but wouldn't short-circuit compared to Keith's cython version
>
> I[1] a = np.ones(10)
> I[2] timeit (a == a[0]).all()
> 1000 loops, best of 3: 203 us per loop
> I[3] timeit a.min() == a.max()
> 1 loops, best of 3: 106 us per loop
> I[4] timeit np.ptp(a)
> 1 loops, best of 3: 106 us per loop
>
> I[5] a[1] = 9
> I[6] timeit (a == a[0]).all()
> 1 loops, best of 3: 89.7 us per loop
> I[7] timeit a.min() == a.max()
> 1 loops, best of 3: 102 us per loop
> I[8] timeit np.ptp(a)
> 1 loops, best of 3: 103 us per loop
>

Another issue to watch out for is if the array is empty.  Technically
speaking, that should be True, but some of the solutions offered so far
would fail in this case.
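
A sketch of a test that treats the empty array as vacuously all-equal
(function name made up):

    import numpy as np

    def all_equal(a):
        # An empty array has no unequal elements, so report True;
        # otherwise compare everything against the first element.
        a = np.asanyarray(a).ravel()
        return a.size == 0 or bool((a == a[0]).all())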

Ben Root


Re: [Numpy-discussion] Missing data again

2012-03-07 Thread Benjamin Root
On Wed, Mar 7, 2012 at 1:26 PM, Nathaniel Smith  wrote:

> On Wed, Mar 7, 2012 at 5:17 PM, Charles R Harris
>  wrote:
> > On Wed, Mar 7, 2012 at 9:35 AM, Pierre Haessig  >
> >> Coming back to Travis proposition "bit-pattern approaches to missing
> >> data (*at least* for float64 and int32) need to be implemented.", I
> >> wonder what is the amount of extra work to go from nafloat64 to
> >> nafloat32/16 ? Is there hardware support for NaN payloads with these
> >> smaller floats ? If not, or if it is too complicated, I feel it is
> >> acceptable to say "it's too complicated" and fall back to mask. One may
> >> have to choose between fancy types and fancy NAs...
> >
> > I'm in agreement here, and that was a major consideration in making a
> > 'masked' implementation first.
>
> When it comes to "missing data", bitpatterns can do everything that
> masks can do, are no more complicated to implement, and have better
> performance characteristics.
>
>
Not true.  Bitpatterns inherently destroy the data, while masks do not.
For matplotlib, we cannot use bitpatterns because they could overwrite user
data (or we would have to copy the data).  I would imagine other extension
writers would have similar issues when they need to play around with input
data in a safe manner.

Also, I doubt that the performance characteristics for strings and integers
are the same as they are for masks.

Ben Root


Re: [Numpy-discussion] use for missing (ignored) data?

2012-03-07 Thread Benjamin Root
On Wednesday, March 7, 2012, Neal Becker  wrote:
> Charles R Harris wrote:
>
>> On Wed, Mar 7, 2012 at 1:05 PM, Neal Becker  wrote:
>>
>>> I'm wondering what is the use for the ignored data feature?
>>>
>>> I can use:
>>>
>>> A[valid_A_indexes] = whatever
>>>
>>> to process only the 'non-ignored' portions of A.  So at least some simple
>>> cases of ignored data are already supported without introducing a new type.
>>>
>>> OTOH:
>>>
>>> w = A[valid_A_indexes]
>>>
>>> will copy A's data, and subsequent use of
>>>
>>> w[:] = something
>>>
>>> will not update A.
>>>
>>> Is this the reason for wanting the ignored data feature?
>>>
>>
>> Suppose you are working with plotted data and want to turn points on/off
>> by clicking on them interactively to see how that affects a fit. Why make
>> multiple copies, change sizes, destroy data, and all that nonsense? Just
>> have the click update the mask and redraw.
>>
>> Chuck
>
> But does
>
> some_func (A[valid_data_mask])
>
> actually perform a copy?
>
>

Yes! If it isn't sliced, or accessed by a scalar index, then you are given
a copy.  Fancy indexing and Boolean indexing will not return a view.

Note that assignment to a Boolean-indexed array from a scalar is
special-cased. I.e.,

A[valid_points] = 5

will do what you expect. But,

A[valid_points] += 5

may not, IIRC.
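
A small demonstration of the copy-versus-update distinction (the in-place
surprise is easiest to trigger with repeated integer indices):

    import numpy as np

    A = np.arange(5.0)
    mask = A > 2

    w = A[mask]        # Boolean (fancy) indexing returns a copy...
    w[:] = 0           # ...so this does NOT modify A

    A[mask] = 5        # ...but assignment through the index does update A

    B = np.zeros(3)
    B[[0, 0, 1]] += 1  # read-modify-write: B is [1., 1., 0.], not
                       # [2., 1., 0.]; each position is written only once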

Ben Root


Re: [Numpy-discussion] use for missing (ignored) data?

2012-03-07 Thread Benjamin Root
On Wednesday, March 7, 2012, Nathaniel Smith  wrote:
> On Wed, Mar 7, 2012 at 8:05 PM, Neal Becker  wrote:
>> I'm wondering what is the use for the ignored data feature?
>>
>> I can use:
>>
>> A[valid_A_indexes] = whatever
>>
>> to process only the 'non-ignored' portions of A.  So at least some simple
>> cases of ignored data are already supported without introducing a new type.
>>
>> OTOH:
>>
>> w = A[valid_A_indexes]
>>
>> will copy A's data, and subsequent use of
>>
>> w[:] = something
>>
>> will not update A.
>>
>> Is this the reason for wanting the ignored data feature?
>
> Hi Neal,
>
> There are a few reasons that I know of why people want more support
> from numpy for ignored data/masks, specifically (as opposed to missing
> data or other related concepts):
>
> 1) If you're often working on some subset of your data, then it's
> convenient to set the mask once and have it stay in effect for further
> operations. Anything you can accomplish this way can also be
> accomplished by keeping an explicit mask array and using it for
> indexing "by hand", but in some situations it may be more convenient
> not to.
>
> 2) Operating on subsets of an array without making a copy. Like
> Benjamin pointed out, indexing with a mask makes a copy. This is slow,
> and what's worse, people who work with large data sets (e.g., big fMRI
> volumes) may not have enough memory to afford such a copy. This
> problem can be solved by using the new where= argument to ufuncs
> (which skips the copy). (But then see (1) -- passing where= to a bunch
> of functions takes more typing than just setting it once and leaving
> it.)
>
> 3) Suppose there's a 3rd-party function that takes an array --
> borrowing Charles example, say it's draw_points(arr). Now you want to
> apply it to just a subset of your data, and want to avoid a copy. It
> would be nice if the original author had made it draw_points(arr,
> mask), but they didn't. Well, if you have masking "built in" to your
> array type, then maybe you can call this as draw_points(masked_arr)
> and it will Just Work. I.e., maybe people who aren't thinking about
> masking will sometimes write code that accidentally works with masking
> anyway. I'm not sure how much I'd trust this, but I guess it's nice
> when it happens. And if it does work, then implementing the show/hide
> point functionality will be easier. (And if it doesn't work, and
> masking is built into numpy.ndarray, then maybe you can use this to
> argue with the original author that this is a bug, not just a missing
> feature. Again, I'm not sure if this is a good thing on net: one could
> argue that people shouldn't be forced to think about masking every
> time they write any function, just in case it becomes relevant later.
> But certainly it'd be useful sometimes.)
>
> There may be other motivations that I'm not aware of, of course.
>
> -- Nathaniel
>

I think you got most of the motivations right. I would say on the last
point that extension authors should be able to say "does not support NA!".
 The important thing is that it makes it more up-front.

An additional motivation is with regard to mathematical operations.
 Personally, I hate getting bitten by a function that takes the max(), and I
have a NaN in the array.  In addition, what about adding two arrays
together that may or may not have different masks?  This has been the major
advantage of np.ma: It All Mostly Works.
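
For readers following along, a sketch of the where= mechanism Nathaniel
mentioned above (this assumes a numpy from the 1.7 development line, where
ufuncs grew that argument):

    import numpy as np

    a = np.arange(5.0)
    mask = np.array([True, False, True, False, True])

    out = np.zeros_like(a)
    # Only positions where mask is True are computed and written; the
    # rest of `out` keeps its prior contents -- no copy of `a` is made.
    np.multiply(a, 10.0, out=out, where=mask)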

Cheers!
Ben Root


Re: [Numpy-discussion] draft enum NEP

2012-03-15 Thread Benjamin Root
On Thursday, March 15, 2012, Nathaniel Smith  wrote:
> On Wed, Mar 14, 2012 at 1:44 AM, Mark Wiebe  wrote:
>> On Fri, Mar 9, 2012 at 8:55 AM, Bryan Van de Ven 
>> wrote:
>>>
>>> Hi all,
>>>
>>> I have started working on a NEP for adding an enumerated type to NumPy.
>>> It is on my GitHub:
>>>
>>> https://github.com/bryevdv/numpy/blob/enum/doc/neps/enum.rst
>>>
>>> It is still very rough, and incomplete in places. But I would like to
>>> get feedback sooner rather than later in order to refine it. In
>>> particular there are a few questions inline in the document that I would
>>> like input on. Any comments, suggestions, questions, concerns, etc. are
>>> very welcome.
>>
>>
>> This looks like a great start to me.
>>
>> I think the open/closed enum distinction will need to be explored a little
>> bit more, because it interacts with dtype immutability/hashability. Do you
>> know if there are any examples of Python objects in the wild that
>> dynamically convert from not being hashable (i.e. raising an exception if
>> used as a dict key) to become hashable?
>
> I haven't run into any...
>
> Thinking about it, I'm not sure I have any use case for this type
> being mutable. Maybe someone else can think of one? The first case
> that came to mind was in reading a large text file, where you want to
> (1) auto-create an enum, (2) use a pre-allocated array, and (3) don't
> know ahead of time what the levels are:
>
>  a = np.empty(lines_in_file, dtype=np.dtype(Enum()))
>  for i, line in enumerate(f):
>      field = line.split()[0]
>      a.dtype.add_level(field)
>      a[i] = field
>  a.dtype.seal()
>
> But really this can be done just as easily and efficiently
> without a mutable dtype:
>
>  a = np.empty(lines_in_file, dtype=np.int32)
>  intern_table = {}
>  next_level = 0
>  for i, line in enumerate(f):
>      field = line.split()[0]
>      val = intern_table.setdefault(field, next_level)
>      if val == next_level:
>          next_level += 1
>      a[i] = val
>  a = a.view(dtype=np.dtype(Enum(map=intern_table)))
>
> I notice that the HDF5 C library has a concept of open versus closed
> enums, but I can't tell from the documentation at hand why this is; it
> looks like it might just be a limitation of the implementation. (Like,
> a workaround for C's lack of a standard mapping type, which makes it
> inconvenient to pass all the mappings in to a single API call.)
>
>> It might be worth adding a section which briefly compares and contrasts the
>> proposed functionality with enums in various programming languages. Here are
>> two links I found to try and get an idea:
>>
>> MS on C# enum usage:
>> http://msdn.microsoft.com/en-us/library/cc138362.aspx
>> Wikipedia on C++ enum class:
>> http://en.wikipedia.org/wiki/C%2B%2B11#Strongly_typed_enumerations
>>
>> For example, the C# enum has a way to enable a "flags" mode, which will
>> create successive powers of 2. This may not be a feature NumPy needs, but if
>> people are finding it useful in C#, maybe it would be useful here too.
>
> There's also a long, ongoing debate about how to do enums in Python -- e.g.:
>  http://www.python.org/dev/peps/pep-0354/
>  http://pypi.python.org/pypi/enum/
>  http://pypi.python.org/pypi/enum_meta/
>  http://pypi.python.org/pypi/flufl.enum/
>  http://pypi.python.org/pypi/lazr.enum/
>  http://pypi.python.org/pypi/pyutilib.enum/
>  http://pypi.python.org/pypi/coding/
>
>  http://stackoverflow.com/questions/36932/whats-the-best-way-to-implement-an-enum-in-python
> I guess Guido likes flufl.enum:
>  http://mail.python.org/pipermail/python-ideas/2011-July/010909.html
>
> BUT, I'm not sure any of this is relevant at all. "Enums" are a
> programming language feature that is, first and foremost, about
> injecting names into your code's namespace. What I'm hoping to see is
> a dtype for holding categorical data, similar to an R "factor"
>  http://stat.ethz.ch/R-manual/R-devel/library/base/html/factor.html
>  https://svn.r-project.org/R/trunk/src/library/base/R/factor.R (NB:
> This is GPL code if anyone is paranoid about contamination, but also
> the most complete API description available)
> or an HDF5 "enum"
>  http://www.hdfgroup.org/HDF5/doc/H5.user/Datatypes.html#Datatypes_Enum
> I believe pandas has some functionality along these lines too, though
> I can't find it in the online docs -- hopefully Wes will fill us in.
>
> These are basically objects that act for most purposes like string
> arrays, but in which all strings are required to come from a finite,
> specified list. This list acts like some metadata attached to the
> array; its order may or may not be significant. And they're
> implemented internally as integer arrays.
>
> I'm not sure what it would even mean to treat this kind of data as
> "flags", since you can't take the bitwise-or of two strings...
>
> -- Nathaniel
>

I guess my problem is that this isn't _quite_ like an enum that I am
familiar with (but not quite unlike it either).  Should we call it
"factor", to avoid confusion or ar

Re: [Numpy-discussion] float96 on windows32 is float64?

2012-03-15 Thread Benjamin Root
On Thursday, March 15, 2012, Charles R Harris wrote:
>
>
> On Thu, Mar 15, 2012 at 10:17 PM, David Cournapeau wrote:
>>
>>
>> On Thu, Mar 15, 2012 at 11:10 PM, Matthew Brett wrote:
>>>
>>> Hi,
>>>
>>> Am I right in thinking that float96 on windows 32 bit is a float64
>>> padded to 96 bits?
>>
>> Yes
>>
>>>
>>>  If so, is it useful?
>>
>> Yes: this is what allows you to use dtype to parse complex binary files
>> directly in numpy without having to care so much about those details. And
>> that's how it is defined on windows in any case (C standard only forces you
>> to have sizeof(long double) >= sizeof(double)).
>>
>>>
>>>  Has anyone got a windows64
>>> box to check float128 ?
>>
>> Too lazy to check on my vm, but I am pretty sure it is 16 bytes on
>> windows 64.
>
> Wait, MSVC doesn't support extended precision, so how do we get doubles
> padded to 96 bits? I think MinGW supports extended precision but the MS
> libraries won't. Still, if it's doubles it should be 64 bits and float96
> shouldn't exist. Doubles padded to 96 bits are 150% pointless.
>
> Chuck
>

There is a Microsoft joke in there, somewhere...
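
For anyone who wants to probe their own platform, something like this
shows what long double really is (the 12-byte itemsize is where the
float96 name comes from):

    import numpy as np

    print(np.dtype(np.longdouble).itemsize)  # storage width, padding included
    print(np.finfo(np.longdouble))           # actual working precision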

Ben Root


Re: [Numpy-discussion] numpy installation problem?

2012-03-18 Thread Benjamin Root
On Sunday, March 18, 2012,   wrote:
> Dear list,
> I am having problems installing matplotlib (from source) and fipy.
> I had installed numpy from source and it is running well:
> :~$ python -c "import numpy; print numpy.__version__"
> 1.6.1
> After being trying to solve this problem on matplotlib list, I was
> recommended to ask numpy list.
> I copy-paste the errors below and forward the mails from matplotlib users.
>
> ...
> 
> creating /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
> Extracting FiPy-2.1.2-py2.7.egg to /usr/local/lib/python2.7/dist-packages
> FiPy 2.1.2 is already the active version in easy-install.pth
>
> Installed /usr/local/lib/python2.7/dist-packages/FiPy-2.1.2-py2.7.egg
> Processing dependencies for FiPy==2.1.2
> Finished processing dependencies for FiPy==2.1.2
> Traceback (most recent call last):
>   File "setup.py", line 594, in <module>
>     __import__(pkg)
>   File "/usr/local/lib/python2.7/dist-packages/pysparse-1.2_dev224-py2.7-linux-x86_64.egg/pysparse/__init__.py", line 6, in <module>
>     from numpy._import_tools import PackageLoader
>   File "/usr/local/lib/python2.7/dist-packages/numpy/__init__.py", line 137, in <module>
>     import add_newdocs
>   File "/usr/local/lib/python2.7/dist-packages/numpy/add_newdocs.py", line 9, in <module>
>     from numpy.lib import add_newdoc
>   File "/usr/local/lib/python2.7/dist-packages/numpy/lib/__init__.py", line 13, in <module>
>     from polynomial import *
>   File "/usr/local/lib/python2.7/dist-packages/numpy/lib/polynomial.py", line 11, in <module>
>     import numpy.core.numeric as NX
> AttributeError: 'module' object has no attribute 'core'
> :~/FiPy-2.1.2$
>
>
> :~/matplotlib$ sudo python setup.py install
> basedirlist is: ['/usr/local', '/usr']
>

> BUILDING MATPLOTLIB
>matplotlib: 1.2.x
>python: 2.7.2+ (default, Oct  4 2011, 20:06:09)  [GCC 4.6.1]
>  platform: linux2
>
> REQUIRED DEPENDENCIES
> numpy: no
>* You must install numpy 1.4 or later to build
>* matplotlib.
>
>
> I do not understand why matplotlib cannot see numpy. Please see the
> forwarded message. My notebook description:
> Linux 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64
> x86_64 x86_64 GNU/Linux
> Thank in advance for your help
> Regards,
> Lucia
>

Just making sure, where/when did you get your mpl source?  There was a bug
in mpl master for a while that would not parse numpy's development version
number correctly, but it has been fixed since then.

Ben Root


Re: [Numpy-discussion] Slice specified axis

2012-04-06 Thread Benjamin Root
On Friday, April 6, 2012, Val Kalatsky wrote:

>
> The only slicing short-cut I can think of is the Ellipsis object, but it's
> not going to help you much here.
> The alternatives that come to my mind are (1) manipulation of shape
> directly and (2) building a string and running eval on it.
> Your solution is better than (1), and (2) is a horrible hack, so your
> solution wins again.
> Cheers
> Val
>

Take a peek at how np.gradient() does it.  It creates a list of None with a
length equal to the number of dimensions, and then inserts a slice object
in the appropriate spot in the list.
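
A minimal sketch of that pattern -- with slice(None) rather than None, per
the correction later in this thread (helper name made up):

    import numpy as np

    def slice_axis(arr, axis, sl):
        # Start from a slicer that takes everything along every axis,
        # then narrow only the requested axis.
        index = [slice(None)] * arr.ndim
        index[axis] = sl
        return arr[tuple(index)]

    a = np.arange(24).reshape(2, 3, 4)
    b = slice_axis(a, 1, slice(1, 2))   # same as a[:, 1:2, :]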

Cheers!
Ben Root



Re: [Numpy-discussion] Slice specified axis

2012-04-09 Thread Benjamin Root
On Mon, Apr 9, 2012 at 12:14 PM, Jonathan T. Niehof wrote:

> On 04/06/2012 06:54 AM, Benjamin Root wrote:
>
> > Take a peek at how np.gradient() does it. It creates a list of None with
> > a length equal to the number of dimensions, and then inserts a slice
> > object in the appropriate spot in the list.
>
> List of slice(None), correct? At least that's what I see in the source,
> and:
>
>  >>> a = numpy.array([[1,2],[3,4]])
>  >>> operator.getitem(a, (None, slice(1, 2)))
> array([[[3, 4]]])
>  >>> operator.getitem(a, (slice(None), slice(1, 2)))
> array([[2],
>[4]])
>
>
Correct, sorry, I was working from memory.

Ben Root


Re: [Numpy-discussion] Why is numpy.abs so much slower on complex64 than complex128 under windows 32-bit?

2012-04-10 Thread Benjamin Root
On Tue, Apr 10, 2012 at 12:57 PM, Francesc Alted wrote:

> On 4/10/12 9:55 AM, Henry Gomersall wrote:
> > On 10/04/2012 16:36, Francesc Alted wrote:
> >> In [10]: timeit c = numpy.complex64(numpy.abs(numpy.complex128(b)))
> >> 100 loops, best of 3: 12.3 ms per loop
> >>
> >> In [11]: timeit c = numpy.abs(b)
> >> 100 loops, best of 3: 8.45 ms per loop
> >>
> >> in your windows box and see if they raise similar results?
> >>
> > No, the results are somewhat the same as before - ~40ms for the first
> > (upcast/downcast) case and ~150ms for the direct case (both *much*
> > slower than yours!). This is versus ~28ms for operating directly on
> > double precisions.
>
> Okay, so it seems that something is going wrong with the performance
> of pure complex64 abs() for Windows.
>
> >
> > I'm using numexpr in the end, but this is slower than numpy.abs under
> linux.
>
> Oh, you mean the windows version of abs(complex64) in numexpr is slower
> than a pure numpy.abs(complex64) under linux?  That's weird, because
> numexpr has an independent implementation of the complex operations from
> NumPy machinery.  Here it is how abs() is implemented in numexpr:
>
> static void
> nc_abs(cdouble *x, cdouble *r)
> {
>     r->real = sqrt(x->real*x->real + x->imag*x->imag);
>     r->imag = 0;
> }
>
> [as I said, only the double precision version is implemented, so you
> have to add here the cost of the cast too]
>
> Hmm, considering all of these facts, it might be that sqrtf() on windows
> is under-performing?  Can you try this:
>
> In [68]: a = numpy.linspace(0, 1, 1e6)
>
> In [69]: b = numpy.float32(a)
>
> In [70]: timeit c = numpy.sqrt(a)
> 100 loops, best of 3: 5.64 ms per loop
>
> In [71]: timeit c = numpy.sqrt(b)
> 100 loops, best of 3: 3.77 ms per loop
>
> and tell us the results for windows?
>
> P.S.: if you are using numexpr on windows, you may want to use the MKL
> linked version, which uses the abs of MKL, that should have considerably
> better performance.
>
> --
> Francesc Alted
>
>
Just a quick aside, wouldn't the above have overflow issues?  Isn't this
why hypot() is available?
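
A quick illustration of the overflow concern: float32 tops out near
3.4e38, so squaring values around 1e19 already blows up, while hypot()
rescales internally:

    import numpy as np

    x = np.float32(3e19)
    y = np.float32(4e19)

    print(np.sqrt(x*x + y*y))  # x*x overflows float32 -> inf
    print(np.hypot(x, y))      # 5e19, computed without overflow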

Ben Root


Re: [Numpy-discussion] Slice specified axis

2012-04-10 Thread Benjamin Root
On Tue, Apr 10, 2012 at 12:52 PM, Jonathan T. Niehof wrote:

> On 04/09/2012 09:11 PM, Tony Yu wrote:
>
> > I guess I wasn't reading very carefully and assumed that you meant a
> > list of `slice(None)` instead of a list of `None`.
>
> My apologies to Ben...I wasn't being pedantic to be a jerk, I was being
> pedantic because I read Ben's message and thought "oooh, that works?"
> and ran off to try it, since I'd just been writing some very similar
> code. And sadly, it doesn't.
>
>
No offense taken.  Such mistakes should be pointed out so that future
readers of the mailing list archives will have the correct information
available to them.  Bad mailing list comments are just as bad as outdated
source code comments.

Cheers!
Ben Root


Re: [Numpy-discussion] partial computations

2012-04-12 Thread Benjamin Root
On Wed, Apr 11, 2012 at 11:38 PM, santhu kumar  wrote:

> Hello all,
>
> I am trying to optimise some code and want your suggestions.
> A: an Nx3 matrix (coordinates of N points)
>
> After performing pairwise distance computations (called pdist) between
> these points, and depending upon the range a distance falls in, I would
> perform further computations.
> Most of the computations require Schur products (element by element) of
> NxN matrices with each other, followed by computing either the column sum
> or the row sum.
>
> As N gets large, these computations take some time (0.7 secs), which is
> not much generally, but since this is being called many times it acts as
> a bottleneck.
> I want to leverage the fact that many of the NxN computations are not
> going to be used, or would be set to zero (if the pdist is greater than
> some minimum distance).
>
> How do I achieve it?  Is a masked array the elegant solution?  Would it
> save me time?
>
> Thanks
> Santhosh
>
>
>
You might want to consider using scipy.spatial's KDTree as a way to
efficiently find all points that are within a specified distance of each
other.  Then, using those pairs, load up a sparse array with only the
relevant pairs.  It should save on computation and memory as well.
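
One way to wire that together (the radius and array sizes below are made
up for illustration):

    import numpy as np
    from scipy import sparse
    from scipy.spatial import KDTree

    pts = np.random.rand(1000, 3)            # the Nx3 coordinate array
    pairs = KDTree(pts).query_pairs(r=0.1)   # set of (i, j) pairs within r

    if pairs:
        i, j = np.array(sorted(pairs)).T
        d = np.sqrt(((pts[i] - pts[j]) ** 2).sum(axis=1))
        # Sparse NxN matrix holding distances for nearby pairs only
        dist = sparse.coo_matrix((d, (i, j)), shape=(len(pts), len(pts)))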

Cheers!
Ben Root


Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root
On Tue, Apr 24, 2012 at 2:12 PM, Charles R Harris  wrote:

>
>
> On Tue, Apr 24, 2012 at 9:25 AM,  wrote:
>
>> On Tue, Apr 24, 2012 at 9:43 AM, Pierre Haessig
>>  wrote:
>> > Hi,
>> >
>> >> On 24/04/2012 15:14, Charles R Harris wrote:
>> >>
>> >> a) All arrays should be implicitly masked, even if the mask isn't
>> >> initially allocated. The maskna keyword can then be removed, taking
>> >> with it the sense that there are two kinds of arrays.
>> >>
>> >
>> > From my lazy user perspective, having masked and non-masked arrays share
>> > the same "look and feel" would be a number one advantage over the
>> > existing numpy.ma arrays. I would like masked arrays to be as
>> > transparent as possible.
>>
>> I don't have any opinion about internal implementation.
>>
>> But users need to be aware of whether they have masked arrays or not.
>> Since many functions (most of scipy) wouldn't know how to handle NA
>> and don't do any checks, (and shouldn't in my opinion if the NA check
>> is costly). The result might be silently wrong numbers depending on
>> the implementation.
>>
>
> There should be a flag saying whether or not NA has been allocated and
> allocation happens when NA is assigned to an array item, so that should be
> fast. I don't think scipy currently deals with masked arrays in all areas,
> so I believe that the same problem exists there and would also exist for
> missing data types. I think this sort of compatibility problem is worth a
> whole discussion by itself.
>
>
>>
>> >
>> >> b) There needs to be a distinction between missing and ignore. The
>> >> mechanism for this is already in place in the payload type, although
>> >> it isn't clear to me that that is uniformly used in all the NA code.
>> >> There is also a place for missing *and* ignored. Which leads to
>> >
>> > If the idea of having two payloads is to avoid a maximum of "skipna &
>> > friends" extra keywords, I would like it much. My feeling with my small
>> > experience with R is that I end up calling every function with a
>> > different magical set of keywords (na.rm, na.action, ... and I forgot).
>>
>> There is a reason for requiring the user to decide what to do about NA's.
>> Either we have utility functions/methods to help the user change the
>> arrays and treat NA's before calling a function, or the function needs
>> to ask the user what should be done about possible NAs.
>> Doing it automatically might only be useful for specialised packages.
>>
>>
> That's what the different payloads would do. I think the common use case
> would always have the ignore bit set. What are the other sorts of actions
> you are interested in, and should they be part of the functions in Numpy,
> such as mean and std, or should they rather implemented in stats packages
> that may be more specialized? I see numpy.ma currently used in the
> following spots in scipy:
>
>
Like you said, this whole issue probably should be in a separate
discussion, but I would like to share my thoughts here on the default
payload.  If we don't have some sort of mechanism for flagging which
functions are NA-friendly or not, then it would be wise to have NA default
to NaN behavior, if only to prevent bugs that mess up data from going
undetected.

That being said, the determination of NA payload is tricky.  Some functions
may need to react differently to an NA.  One that comes to mind is
np.gradient().  However, other functions may not need to do anything
because they depend entirely upon other functions that have already been
updated to support NA.

Cheers!
Ben Root


Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root
On Tue, Apr 24, 2012 at 3:23 PM, Stéfan van der Walt wrote:

> On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris
>  wrote:
> > The advantage of nans, I suppose, is that they are in the hardware and so
>
> Why are we having a discussion on NAN's in a thread on consensus?
> This is a strong indicator of the problem we're facing.
>
> Stéfan
>

Good catch!  Looks like we got off-track when the discussion talked about
forks.

Ben Root


Re: [Numpy-discussion] What is consensus anyway

2012-04-24 Thread Benjamin Root
On Tuesday, April 24, 2012, Matthew Brett wrote:

> Hi,
>
> On Tue, Apr 24, 2012 at 2:25 PM, Charles R Harris wrote:
> >
> >
> > 2012/4/24 Stéfan van der Walt:
> >>
> >> On Tue, Apr 24, 2012 at 11:12 AM, Charles R Harris wrote:
> >> > The advantage of nans, I suppose, is that they are in the hardware and
> >> > so
> >>
> >> Why are we having a discussion on NAN's in a thread on consensus?
> >> This is a strong indicator of the problem we're facing.
> >>
> >
> > We seem to have a consensus regarding interest in the topic.
>
> This email is mainly to Travis.
>
> This thread seems to be dying, condemning us to keep repeating the
> same conversation with no result.
>
> Chuck has made it clear he is not interested in this conversation.
> Until it is clear you are interested in this conversation, it will
> keep dying.   As you know, I think that will be very bad for numpy,
> and, as you know, I care a great deal about that.
>
> So, please, if you care about this, and agree that something should be
> done, please, say so, and if you don't agree something should be done,
> say so.  It can't get better without your help,
>
> See you,
>
> Matthew
>

Matthew,

I agree with the general idea of consensus, and I think many of us here
agree with the ideal in principle. Quite frankly, I am not sure what more
you want from us. You are only going to get so much leeway on a
philosophical discussion on governance on a numerical computation mailing
list. The thread keeps "dying" (I say it is getting distracted) because
coders are champing at the bit to get stuff done.

In a sense, I think there is a consensus, if you will, to move on.  All in
favor, say "Aye!"

Cheers!
Ben Root


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Benjamin Root
On Wednesday, April 25, 2012, Matthew Brett wrote:

> Hi,
>
> On Wed, Apr 25, 2012 at 2:35 PM, Travis Oliphant wrote:
> >>
> >> Do you agree that Numpy has not been very successful in recruiting and
> >> maintaining new developers compared to its large user-base?
> >>
> >> Compared to - say - Sympy?
> >>
> >> Why do you think this is?
> >
> > I think it's mostly because it's infrastructure that is a means to an
> end.   I certainly wasn't excited to have to work on NumPy originally, when
> my main interest was SciPy.I've come to love the interesting plateau
> that NumPy lives on.But, I think it mostly does the job it is supposed
> to do. The fact that it is in C is also not very sexy.   It is also
> rather complicated with a lot of inter-related parts.
> >
> > I think NumPy could do much, much more --- but getting there is going to
> be a challenge of execution and education.
> >
> > You can get to know the code base.  It just takes some time and
> patience.   You also have to be comfortable with compilers and building
> software just to tweak the code.
> >
> >
> >>
> >> Would you consider asking that question directly on list and asking
> >> for the most honest possible answers?
> >
> > I'm always interested in honest answers and welcome any sincere
> perspective.
>
> Of course, there are potential explanations:
>
> 1) Numpy is too low-level for most people
> 2) The C code is too complicated
> 3) It's fine already, more or less
>
> are some obvious ones. I would say these are the easy answers. But of
> course, the easy answer may not be the right answer. It may not be
> easy to get the right answer [1].   As you can see from Alan Isaac's reply
> on this thread, even asking the question can be taken as being in bad
> faith.  In that situation, I think you'll find it hard to get sincere
> replies.


As with anything, the phrasing of a question makes a world of difference
with regard to the replies. Ask any pollster.  When phrased correctly, I
would not have any doubt about the sincerity of replies, and I would not
worry about perceived hostility -- when phrased correctly. As the
questioner, the onus is upon you to gauge the community and adjust the
question appropriately.

I think the fact that we engage in these discussions shows that we value
and care about each other's perceptions and opinions with regard to numpy.

Cheers!
Ben Root


Re: [Numpy-discussion] What is consensus anyway

2012-04-25 Thread Benjamin Root
On Wednesday, April 25, 2012, Travis Oliphant wrote:

>
> On Apr 25, 2012, at 7:18 PM, josef.p...@gmail.com  wrote:
>
> >
> > Except for the big changes like NA and datetime, I think the debate is
> > pretty boring.
> > The main problem that I see for discussing technical issues is whether
> > there are many
> > developers really interested in commenting on code and coding.
> > I think it mostly comes down to the discussion on tickets or pull
> requests.
>
> This is a very insightful comment.   Github has been a great thing for
> both NumPy and SciPy.   However, it has changed the community feel for many
> because these pull request discussions don't happen on this list.
>
> You have to comment on a pull request to get notified of future comments
> or changes.  The process is actually pretty nice, but it does mean you
> can't just hang out watching this list.  You have to look at the pull
> requests and get involved there.
>
> It would be nice if every pull request created a message to this list.
>  Is that even possible?
>
> -Travis
>
>
This has been a concern of mine for matplotlib as well.  The closest I can
come is to set up an RSS feed, but all the titles are just a PR # and an
action, so I lose track of which ones I want to view.

All devs get an initial email for each PR, but I can't figure out how to
get that down to the public list, and it is hard to know if another dev
took care of the PR or if it is just waiting.

Cheers!
Ben Root


Re: [Numpy-discussion] A crazy masked-array thought

2012-04-27 Thread Benjamin Root
On Fri, Apr 27, 2012 at 6:32 AM, Richard Hattersley wrote:

> I know I used a somewhat jokey tone in my original posting, but
> fundamentally it was a serious question concerning a live topic. So I'm
> curious about the lack of response. Has this all been covered before?
>
> Sorry if I'm being too impatient!
>
>
>
Richard,

Actually, I am rather surprised by the lack of response as well.  This is
quite unusual, and I hope it doesn't sour you on making more
contributions.  We do need more "crazy ideas" like yours, if only to
help break out of an infinite loop in a discussion.

Your idea is interesting, but doesn't it require C++?  Or maybe you are
thinking of creating a new C type object that would contain all the new
features and hold a pointer and function interface to the original POA.
Essentially, the new type would act as a wrapper around the original
ndarray?

Cheers!
Ben Root


Re: [Numpy-discussion] Issue Tracking

2012-04-30 Thread Benjamin Root
On Monday, April 30, 2012, Travis Oliphant wrote:

> Hey all,
>
> We have been doing some investigation of various approaches to issue
> tracking.  When the conversation last left this list, Ralf's list of
> preferences was:
>
> 1) Redmine
> 2) Trac
> 3) Github
>
> Since that time, Maggie, who has been doing a lot of work setting up
> various issue tracking tools over the past couple of months, has set up a
> redmine instance and played with it.   This is a possibility as a future
> issue tracker.
>
> However, today I took a hard look at what the IPython folks are doing with
> their issue tracker and was very impressed by the level of community
> integration that having issues tracked by Github provides.Right now, we
> have a major community problem in that there are 3 conversations taking
> place (well at least 2 1/2).   One on Github, one on this list, and one on
> the Trac and its accompanying wiki.
>
> I would like to propose just using Github's issue tracker.  This just
> seems like the best move overall for us at this point.  I like how the
> Pull Request mechanism integrates with the issue tracking.  We could
> set up a Redmine instance, but this would just re-create the same separation
> of communities that currently exists with the pull-requests, the mailing
> list, and the Trac pages.   Redmine is nicer than Trac, but it's still a
> separate space.   We need to make Github the NumPy developer hub and not
> have it spread throughout several sites.
>
> The same is true of SciPy.  I think if SciPy also migrates to use Github
> issues, then together with IPython we can really be a voice that helps
> Github.   I will propose to NumFOCUS that the Foundation sponsor migration
> of the Trac to Github for NumPy and SciPy.  If anyone would like to be
> involved in this migration project, please let me know.
>
> Comments, concerns?
>
> -Travis


Would it be possible to use the combined clout of the scipy packages as a
way to put some weight behind feature requests to github?

Ben Root


Re: [Numpy-discussion] record arrays initialization

2012-05-02 Thread Benjamin Root
On Wednesday, May 2, 2012, Stéfan van der Walt wrote:

> On Wed, May 2, 2012 at 4:46 PM, Kevin Jacobs wrote:
> > A FLANN implementation should be even faster--perhaps by as much as
> > another factor of two.
>
> I guess it depends on whether you care about the "Approximate" in
> "Fast Library for Approximate Nearest Neighbors".
>
> Stéfan


This is why I love following these lists!  I don't think I ever would have
come across this method on my own. Nifty!

Ben Root


Re: [Numpy-discussion] Issue Tracking

2012-05-05 Thread Benjamin Root
On Saturday, May 5, 2012, Pauli Virtanen wrote:

> On 05.05.2012 22:53, Ralf Gommers wrote:
> [clip]
> > would be great to get it done by end of June.  To Charles' list
> > and Ralf's suggestions, I would add setting up a server that can
> > relay pull requests to the mailing list.
> >
> > Don't know if you saw this, but it looks like Pauli is pretty far along
> > in fixing this problem:
> >
> http://thread.gmane.org/gmane.comp.python.numeric.general/49551/focus=49744
>
> The only thing missing is really the server configuration ---
> new.scipy.org could in principle do that, but its mail system seems to
> be configured so that it cannot send mail to the MLs. So, someone with
> root on the machine needs to step up.
>
>Pauli
>
>
Just a quick lesson from matplotlib's migration of sourceforge bugs to
github issues. Darren Dale did an excellent job with only a few hitches.
The key one is that *every* issue migrated spawned a new email. This got old
very fast. Second, because Darren did the migration, he became the author of
every single issue. He then got notified of every single status change of
every issue that we triaged over the following few weeks.

We don't hear much from Darren these days... I suspect the men in the
white coats took him away...

Ben Root


Re: [Numpy-discussion] Issue Tracking

2012-05-05 Thread Benjamin Root
On Saturday, May 5, 2012, Charles R Harris wrote:

>
>
> On Sat, May 5, 2012 at 8:50 PM, Benjamin Root wrote:
>
>>
>>
>> On Saturday, May 5, 2012, Pauli Virtanen wrote:
>>
>>> On 05.05.2012 22:53, Ralf Gommers wrote:
>>> [clip]
>>> > would be great to get it done by end of June.To Charles' list
>>> > and Ralf's suggestions, I would add setting up a server that can
>>> > relay pull requests to the mailing list.
>>> >
>>> > Don't know if you saw this, but it looks like Pauli is pretty far along
>>> > in fixing this problem:
>>> >
>>> http://thread.gmane.org/gmane.comp.python.numeric.general/49551/focus=49744
>>>
>>> The only thing missing is really the server configuration ---
>>> new.scipy.org could in principle do that, but its mail system seems to
>>> be configured so that it cannot send mail to the MLs. So, someone with
>>> root on the machine needs to step up.
>>>
>>>Pauli
>>>
>>>
>> Just a quick lesson from matplotlib's migration of sourceforge bugs to
>> github issues. Darren Dale did an excellent job with only a few hitches.
>> The key one is that *every* issue migrated spawned a new email. This got old
>> very fast. Second, because Darren did the migration, he became the author of
>> every single issue. He then got notified of every single status change of
>> every issue that we triaged over the following few weeks.
>>
>> We don't hear much from Darren these days... I suspect the men in the
>> white coats took him away...
>>
>>
> Uh oh. We are short on developers as is... Which brings up a question, do
> people need a github account to open an issue?
>
> Chuck
>

Last time I checked, yes. But this hasn't seemed to slow things down for us.

Ben Root


P.S. - There are probably ways around the issues I described. I just
mention them so that whoever prepares the migration can look out for
those problems.


Re: [Numpy-discussion] Missing data wrap-up and request for comments

2012-05-09 Thread Benjamin Root
On Wednesday, May 9, 2012, Nathaniel Smith wrote:

>
>
> My only objection to this proposal is that committing to this approach
> seems premature. The existing masked array objects act quite
> differently from numpy.ma, so why do you believe that they're a good
> foundation for numpy.ma, and why will users want to switch to their
> semantics over numpy.ma's semantics? These aren't rhetorical
> questions, it seems like they must have concrete answers, but I don't
> know what they are.
>

Based on the design decisions made in the original NEP, a re-made
numpy.ma would have to lose _some_ features, particularly the ability
to share masks. Save for that and some very obscure undocumented behaviors,
it should be possible to remake numpy.ma as a compatibility layer.

That being said, I think there are some fundamental questions that remain.
If I recall correctly, there were unresolved questions about behaviors
surrounding assignments to elements of a view.

I see the project as broken down like this:
1.) internal architecture (largely abi issues)
2.) external architecture (hooks throughout numpy to utilize the new
features where possible such as where= argument)
3.) getter/setter semantics
4.) mathematical semantics

At this moment, I think we have pieces of 2 and they are fairly
non-controversial. It is 1 that I see as being the immediate hold-up here.
3 & 4 are non-trivial, but because they are mostly about interfaces, I
think we can be willing to accept some very basic, fundamental, barebones
components here in order to lay the groundwork for a more complete API
later.

To talk of Travis's proposal, doing nothing is a no-go. Not moving forward
would dishearten the community. Making an ndmasked type is very intriguing.
I see it as a step towards eventually deprecating ndarray? Also, how would
it behave with np.asarray() and np.asanyarray()? My other concern is a
possible violation of DRY. How difficult would it be to maintain two
ndarrays in parallel?

As for the flag approach, this still doesn't solve the problem of legacy
code (or did I misunderstand?)

Cheers!
Ben Root


[Numpy-discussion] spurious space in printing record arrays?

2012-05-10 Thread Benjamin Root
Just noticed this in the output from printing some numpy record arrays:

[[('2008081712', -24, -78.0, 20.10381469727, 45.0, -999.0, 0.0)]
 [ ('2008081718', -18, -79.584741211, 20.70762939453, 45.0, -999.0,
0.0)]
 [ ('2008081800', -12, -80.3305175781, 21.10381469727, 45.0,
-999.0, 0.0)]
 [ ('2008081806', -6, -80.8305175781, 21.89618530273, 45.0, -999.0,
0.0)]
 [ ('2008081812', 0, -81.1694824219, 23.20762939453, 50.0, -999.0,
1002.0)]]


[[ ('2008081812', 0, -81.1694824219, 23.20762939453, 50.0, -999.0,
0.0)]
 [('2008081815', 3, -81.5, 23.60381469727, 50.0, -999.0, 1003.0)]
 [ ('2008081900', 12, -81.8305175781, 24.60381469727, 55.0, -999.0,
0.0)]
 [ ('2008081912', 24, -82.084741211, 26.20762939453, 65.0, -999.0,
0.0)]
 [('2008082000', 36, -82.0, 27.79237060547, 50.0, -999.0, 0.0)]
 [ ('2008082012', 48, -81.8305175781, 29.29237060547, 40.0, -999.0,
0.0)]
 [('2008082112', 72, -81.5, 31.5, 35.0, -999.0, 0.0)]
 [('2008082212', 96, -81.5, 33.58474121094, 25.0, -999.0, 0.0)]
 [('2008082312', 120, -82.5, 35.5, 20.0, -999.0, 0.0)]]

On my 80-character wide terminal window, each line that gets wrapped also
has an extra space after the inner square bracket.  Coincidence? Using
v1.6.1

I don't think it is a big problem... just odd.

Thanks,
Ben Root


Re: [Numpy-discussion] ANN: NumPy 1.6.2 release candidate 1

2012-05-11 Thread Benjamin Root
On Sat, May 5, 2012 at 2:15 PM, Ralf Gommers wrote:

> Hi,
>
> I'm pleased to announce the availability of the first release candidate of
> NumPy 1.6.2.  This is a maintenance release. Due to the delay of NumPy
> 1.7.0, this release contains far more fixes than a regular NumPy bugfix
> release.  It also includes a number of documentation and build improvements.
>
> Sources and binary installers can be found at
> https://sourceforge.net/projects/numpy/files/NumPy/1.6.2rc1/
>
> Please test this release and report any issues on the numpy-discussion
> mailing list.
>
> Cheers,
> Ralf
>
>
>
> ``numpy.core`` issues fixed
> ---------------------------
>
> #2063  make unique() return consistent index
> #1138  allow creating arrays from empty buffers or empty slices
> #1446  correct note about correspondence vstack and concatenate
> #1149  make argmin() work for datetime
> #1672  fix allclose() to work for scalar inf
> #1747  make np.median() work for 0-D arrays
> #1776  make complex division by zero to yield inf properly
> #1675  add scalar support for the format() function
> #1905  explicitly check for NaNs in allclose()
> #1952  allow floating ddof in std() and var()
> #1948  fix regression for indexing chararrays with empty list
> #2017  fix type hashing
> #2046  deleting array attributes causes segfault
> #2033  a**2.0 has incorrect type
> #2045  make attribute/iterator_element deletions not segfault
> #2021  fix segfault in searchsorted()
> #2073  fix float16 __array_interface__ bug
>
>
> ``numpy.lib`` issues fixed
> --------------------------
>
> #2048  break reference cycle in NpzFile
> #1573  savetxt() now handles complex arrays
> #1387  allow bincount() to accept empty arrays
> #1899  fixed histogramdd() bug with empty inputs
> #1793  fix failing npyio test under py3k
> #1936  fix extra nesting for subarray dtypes
> #1848  make tril/triu return the same dtype as the original array
> #1918  use Py_TYPE to access ob_type, so it works also on Py3
>
>
> ``numpy.f2py`` changes
> ----------------------
>
> ENH:   Introduce new options extra_f77_compiler_args and
> extra_f90_compiler_args
> BLD:   Improve reporting of fcompiler value
> BUG:   Fix f2py test_kind.py test
>
>
> ``numpy.poly`` changes
> ----------------------
>
> ENH:   Add some tests for polynomial printing
> ENH:   Add companion matrix functions
> DOC:   Rearrange the polynomial documents
> BUG:   Fix up links to classes
> DOC:   Add version added to some of the polynomial package modules
> DOC:   Document xxxfit functions in the polynomial package modules
> BUG:   The polynomial convenience classes let different types interact
> DOC:   Document the use of the polynomial convenience classes
> DOC:   Improve numpy reference documentation of polynomial classes
> ENH:   Improve the computation of polynomials from roots
> STY:   Code cleanup in polynomial [*]fromroots functions
> DOC:   Remove references to cast and NA, which were added in 1.7
>
>
> ``numpy.distutils`` issues fixed
> --------------------------------
>
> #1261  change compile flag on AIX from -O5 to -O3
> #1377  update HP compiler flags
> #1383  provide better support for C++ code on HPUX
> #1857  fix build for py3k + pip
> BLD:   raise a clearer warning in case of building without cleaning up
> first
> BLD:   follow build_ext coding convention in build_clib
> BLD:   fix up detection of Intel CPU on OS X in system_info.py
> BLD:   add support for the new X11 directory structure on Ubuntu & co.
> BLD:   add ufsparse to the libraries search path.
> BLD:   add 'pgfortran' as a valid compiler in the Portland Group
> BLD:   update version match regexp for IBM AIX Fortran compilers.
>
>
>
I just noticed that my fix for the np.gradient() function isn't listed.
https://github.com/numpy/numpy/pull/167

Not critical, but if a second rc is needed for any reason, it would be nice
to have that in there.

Thanks!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] stable sort for structured dtypes?

2012-05-11 Thread Benjamin Root
Hello all,

I need to sort a structured array in a stable manner.  I am also sorting
only by one of the keys, so I don't think lexsort() is stable in that
respect.  np.sort() allows for choosing 'mergesort', but it appears to not
be implemented for structured arrays.  Am I going to have to create a new
plain array out of the one column I want to sort by, and run np.argsort()
with the mergesort in order to get around this?  Or is there something more
straightforward that I am missing?

Thanks,
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] stable sort for structured dtypes?

2012-05-11 Thread Benjamin Root
On Fri, May 11, 2012 at 12:00 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

>
>
> On Fri, May 11, 2012 at 9:01 AM, Benjamin Root  wrote:
>
>> Hello all,
>>
>> I need to sort a structured array in a stable manner.  I am also sorting
>> only by one of the keys, so I don't think lexsort() is stable in that
>> respect.  np.sort() allows for choosing 'mergesort', but it appears to not
>> be implemented for structured arrays.  Am I going to have to create a new
>> plain array out of the one column I want to sort by, and run np.argsort()
>> with the mergesort in order to get around this?  Or is there something more
>> straightforward that I am missing?
>>
>
> Lexsort is just a sequence of indirect merge sorts, so using it to sort on
> a single column is the same as calling argsort(..., kind='mergesort').
>
> Mergesort (and heapsort) need to be extended to object arrays and arrays
> with specified comparison functions. I think that would be an interesting
> project for someone, I've been intending to do it myself but haven't got
> around to it.
>
> But as to your current problem, you probably need to have the keys in a
> plain old array. They also need to be in a contiguous array, but the sort
> methods take care of that by making contiguous copies when needed. Adding a
> step parameter to the sorts is another small project for someone. There is
> an interesting trade off there involving cache vs copy time vs memory usage.
>
> Chuck
>
>
Ok, that clears it up for me.  I ended up just doing an
argsort(np.array(d['vtime']), kind=...) and using the indices as a guide.  My
purpose didn't require a re-sorted array anyway, so this will do for now.
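For reference, a minimal sketch of that workaround (the field name 'vtime'
follows the snippet above; the data values here are made up for illustration):

import numpy as np

# hypothetical structured array with a 'vtime' key, values for illustration only
d = np.array([('2008081712', 3), ('2008081706', 1), ('2008081712', 2)],
             dtype=[('vtime', 'S10'), ('val', 'i4')])

# extract the key into a plain array and argsort with mergesort,
# which is guaranteed to be stable
order = np.argsort(np.array(d['vtime']), kind='mergesort')
d_sorted = d[order]  # rows with equal 'vtime' keep their original order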

Thanks,
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-12 Thread Benjamin Root
On Saturday, May 12, 2012, Travis Oliphant wrote:

> Another approach would be to introduce a method:
>
> a.diag(copy=False)
>
> and leave a.diagonal() alone.  Then, a.diagonal() could be deprecated over
> 2-3 releases.
>
> -Travis
>


+1

Ben Root

>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Should arr.diagonal() return a copy or a view? (1.7 compatibility issue)

2012-05-16 Thread Benjamin Root
On Wed, May 16, 2012 at 9:55 AM, Nathaniel Smith  wrote:

> On Tue, May 15, 2012 at 2:49 PM, Frédéric Bastien  wrote:
> > Hi,
> >
> > In fact, I would arg to never change the current behavior, but add the
> > flag for people that want to use it.
> >
> > Why?
> >
> > 1) There is probably >10k script that use it that will need to be
> > checked for correctness. There won't be easy to see crash or error
> > that allow user to see it.
>
> My suggestion is that we follow the scheme, which I think gives ample
> opportunity for people to notice problems:
>
> 1.7: works like 1.6, except that a DeprecationWarning is produced if
> (and only if) someone writes to an array returned by np.diagonal (or
> friends). This gives a pleasant heads-up for those who pay attention
> to DeprecationWarnings.
>
> 1.8: return a view, but mark this view read-only. This causes crashes
> for anyone who ignored the DeprecationWarnings, guaranteeing that
> they'll notice the issue.
>
> 1.9: return a writeable view, transition complete.
>
> I've written a pull request implementing the first part of this; I
> hope everyone interested will take a look:
>  https://github.com/numpy/numpy/pull/280
>
> > 2) This is a globally not significant speed up by this change. Due to
> > 1), i think it is not work it. Why this is not a significant speed up?
> > First, the user already create and use the original tensor. Suppose a
> > matrix of size n x n. If it don't fit in the cache, creating it will
> > cost n * n. But coping it will cost cst * n. The cst is the price of
> > loading a full cache line. But if you return a view, you will pay this
> > cst price later when you do the computation. But it all case, this is
> > cheap compared to the cost of creating the matrix. Also, you will do
> > work on the matrix and this work will be much more costly then the
> > price of the copy.
> >
> > In the case the matrix fix in the cache, the price of the copy is even
> lower.
> >
> > So in conclusion, optimizing the diagonal won't give speed up in the
> > global user script, but will break many of them.
>
> I agree that the speed difference is small. I'm more worried about the
> cost to users of having to remember odd inconsistencies like this, and
> to think about whether there actually is a speed difference or not,
> etc. (If we do add a copy=False option, then I guarantee many people
> will use it religiously "just in case" the speed difference is enough
> to matter! And that would suck for them.)
>
> Returning a view makes the API slightly nicer, cleaner, more
> consistent, more useful. (I believe the reason this was implemented in
> the first place was that providing a convenient way to *write* to the
> diagonal of an arbitrary array made it easier to implement numpy.eye
> for masked arrays.) And the whole point of numpy is to trade off a
> little speed in favor of having a simple, easy-to-work with high-level
> API :-).
>
> -- Nathaniel
>

Just as a sanity check, do the scipy tests run without producing any such
messages?

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] python import question

2012-05-18 Thread Benjamin Root
On Friday, May 18, 2012, Chao YUE wrote:

> Dear all,
>
> This is only a small python import question. I think I'm right but just
> want some confirmation.
>
> Previously I have installed numpy 1.5.1. and then I used pip install
> --upgrade numpy
> to install numpy 1.6.1
>
> But when I try to import numpy as np within ipython shell, I still get the
> version 1.5.1
>
> then I checked my sys.path:
>
> In [21]: sys.path
> Out[21]:
> ['',
>  '/usr/local/bin',
>  '/usr/local/lib/python2.7/dist-packages/pupynere-1.0.15-py2.7.egg',
>  '/usr/lib/pymodules/python2.7',
>
>  '/usr/local/lib/python2.7/dist-packages/scikits.statsmodels-0.3.1-py2.7.egg',
>
>  '/usr/local/lib/python2.7/dist-packages/Shapely-1.2.13-py2.7-linux-i686.egg',
>
>  '/usr/local/lib/python2.7/dist-packages/pandas-0.7.3-py2.7-linux-i686.egg',
>  '/home/chaoyue/python/python_lib',
>  '/usr/lib/python2.7',
>  '/usr/lib/python2.7/plat-linux2',
>  '/usr/lib/python2.7/lib-tk',
>  '/usr/lib/python2.7/lib-old',
>  '/usr/lib/python2.7/lib-dynload',
>  '/usr/local/lib/python2.7/dist-packages',
>  '/usr/lib/python2.7/dist-packages',
>  '/usr/lib/python2.7/dist-packages/PIL',
>  '/usr/lib/pymodules/python2.7/gtk-2.0',
>  '/usr/lib/python2.7/dist-packages/gst-0.10',
>  '/usr/lib/python2.7/dist-packages/gtk-2.0',
>  '/usr/lib/pymodules/python2.7/ubuntuone-client',
>  '/usr/lib/pymodules/python2.7/ubuntuone-control-panel',
>  '/usr/lib/pymodules/python2.7/ubuntuone-storage-protocol',
>  '/usr/lib/python2.7/dist-packages/wx-2.8-gtk2-unicode',
>  '/usr/local/lib/python2.7/dist-packages/IPython/extensions']
>
> Actually I found I have numpy 1.5.1 in /usr/lib/pymodules/python2.7
>
> and numpy 1.6.1 in /usr/local/lib/python2.7/dist-packages/numpy/
>
> but because the first path is before the second one in sys.path, so
> ipython imports only the first one and ignore the second one.
> Then I delete the directory of /usr/lib/pymodules/python2.7/numpy and redo
> the import, I get the version 1.6.1
>
> This means that import will try to find the first occurrence of the module
> and will ignore the ones with same name in later occurrences?
>
> cheers,
>
> Chao
>
>
Yes.  This is actually very common.  The $PATH environment variable works
the same way for finding executables.
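A quick way to check which installation actually wins, as a minimal sketch:

import sys
import numpy

# the first matching entry on sys.path determines which copy gets imported
print(numpy.__version__)  # reports the shadowing copy, e.g. 1.5.1
print(numpy.__file__)     # shows the directory it was actually loaded from

for p in sys.path:        # directories are searched in this order
    print(p)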

Ben Root

>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Checking for views (was: Should arr.diagonal() return a copy or aview?)

2012-05-24 Thread Benjamin Root
On Thu, May 24, 2012 at 10:56 AM, Jonathan T. Niehof wrote:

> On 05/23/2012 05:31 PM, T J wrote:
>
> > It seems that there are a number of ways to check if an array is a view.
> > Do we have a preferred way in the API that is guaranteed to stay
> > available? Or are all of the various methods "here to stay"?
>
> We've settled on checking array.base, which I think was the outcome of a
> stackoverflow thread that I can't dig up. (I'll check with the guy who
> wrote the code.)
>
>
Just as a quick word to the wise.  I think I can recall a situation where
this could be misleading.  In particular, I think it had to do with
boolean/fancy indexing of an array.  In some cases, what you get is a view
of a copy of the original data.  So, if you simply check to see if it is
a view, and then assume that because it is a view, it must be a view of the
original data, then that assumption can come back and bite you in strange
ways.
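
A minimal sketch of that caveat; note that np.shares_memory is the robust
test, though it was only added in later numpy releases (1.11+):

import numpy as np

a = np.arange(10)
v = a[2:5]    # basic slicing returns a true view
f = a[a > 5]  # boolean (fancy) indexing returns a copy

print(v.base is a)  # True: v really does view a's data
print(f.base)       # may or may not be None depending on the numpy
                    # version; either way f does not view a's data

print(np.shares_memory(v, a))  # True  (numpy >= 1.11)
print(np.shares_memory(f, a))  # False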

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexes in an array where value is greater than 1?

2012-05-25 Thread Benjamin Root
On Fri, May 25, 2012 at 11:17 AM, Chris Withers wrote:

> Hi All,
>
> I have an array:
>
> arrrgh = numpy.zeros(10000000)
>
> A sparse collection of elements will have values greater than zero:
>
> arrrgh[100] = 2
> arrrgh[3453453] = 42
>
> The *wrong* way to do this is:
>
> for i in xrange(len(arrrgh)):
> if arrrgh[i] > 1:
> print i
>
> What's the right way?
>
> Chris
>
>
np.nonzero(arrrgh > 1)

Note that it returns a tuple of index arrays, one for each dimension of the
input array.
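
For example, a minimal sketch with a 2-D array:

import numpy as np

arrrgh = np.zeros((3, 4))
arrrgh[1, 2] = 2
arrrgh[2, 0] = 42

rows, cols = np.nonzero(arrrgh > 1)  # one index array per dimension
print(rows)  # [1 2]
print(cols)  # [2 0]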

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] indexes in an array where value is greater than 1?

2012-05-27 Thread Benjamin Root
On Sunday, May 27, 2012, Chao YUE wrote:

> for me, np.nonzero() and np.where() both work. It seems they have same
> function.
>
> chao


They are not identical.  nonzero() is strictly for indices.  The where()
function is really meant for a different purpose (elementwise selection from
two arrays), but it special-cases this single-argument call signature to
behave like nonzero().
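
A minimal sketch of the difference:

import numpy as np

x = np.array([0, 2, 0, 3])

# one-argument form: special-cased to behave like nonzero()
print(np.where(x > 1))    # (array([1, 3]),)
print(np.nonzero(x > 1))  # (array([1, 3]),)

# three-argument form: elementwise selection, where()'s real purpose
print(np.where(x > 1, x, -1))  # [-1  2 -1  3]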

Ben Root

>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] better error message possible?

2012-06-01 Thread Benjamin Root
On Fri, Jun 1, 2012 at 9:14 AM, Nathaniel Smith  wrote:

> On Fri, Jun 1, 2012 at 10:46 AM, Chris Withers 
> wrote:
> > Hi All,
> >
> > Any reason why this:
> >
> >  >>> import numpy
> >  >>> numpy.zeros(10)[-123]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > IndexError: index out of bounds
> >
> > ...could say this:
> >
> >  >>> numpy.zeros(10)[-123]
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> > IndexError: -123 is out of bounds
>
> Only that no-one has implemented it, I guess. If you want to then
> that'd be cool :-).
>
> To be generally useful for debugging, it would probably be good for
> the error message to also mention which dimension is involved, and/or
> the actual size of the array in that dimension. You can also get such
> error messages from expressions like 'arr[i, j, k]', after all, where
> it's even less obvious what went wrong.
>
> -- Nathaniel
>

+1, please!

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] some typestrings not recognized anymore

2012-06-03 Thread Benjamin Root
On Sunday, June 3, 2012, Ralf Gommers wrote:

>
>
> On Sun, Jun 3, 2012 at 4:49 PM, Nathaniel Smith wrote:
>
>> On Sun, Jun 3, 2012 at 3:28 PM, Ralf Gommers wrote:
>> > Hi,
>> >
>> > Just ran into this:
>> >
>>  np.__version__
>> > '1.5.1'
>>  np.empty((1,), dtype='>h2')  # works in 1.6.2 too
>> > array([0], dtype=int16)
>> >
>>  np.__version__
>> > '1.7.0.dev-fd78546'
>>  np.empty((1,), dtype='>h2')
>> > Traceback (most recent call last):
>> >   File "", line 1, in 
>> > TypeError: data type ">h2" not understood
>>
>> For reference the problem seems to be that in 1.6 and earlier, "h"
>> plus a number was allowed, and the number was ignored:
>>
>>  >>> np.__version__
>>  '1.5.1'
>>  >>> np.dtype("h2")
>>  dtype('int16')
>>  >>> np.dtype("h4")
>>  dtype('int16')
>>  >>> np.dtype("h100")
>>  dtype('int16')
>>
>> In current master, the number is disallowed -- all of those give
>> TypeErrors. Presumably because "h" already means the same as "i2", so
>> adding a second number on their is weird.
>>
>> Other typecodes with an "intrinsic size" seem to have the same problem
>> -- "q", "l", etc.
>>
>> Obviously "h2" should be allowed in 1.7, seeing as disallowing it
>> breaks scipy. And the behavior for "h100" is clearly broken and should
>> be disallowed in the long run. So I guess we need to do two things:
>>
>> 1) Re-enable the use of typecode + size specifier even in cases where
>> the typcode has an intrinsic size
>> 2) Issue a deprecation warning for cases where the intrinsic size and
>> the specified size don't match (like "h100"), and then turn that into
>> an error in 1.8.
>>
>> Does that sound correct?
>
>
> Seems correct as far as I can tell. Your approach to fixing the issue
> sounds good.
>
>
>> I guess the other option would be to
>> deprecate *all* use of size specifiers with these typecodes (i.e.,
>> deprecate "h2" as well, where the size specifier is merely redundant),
>> but I'm not sure removing that feature is really worth it.
>>
>
> Either way would be OK I think. Using "h2" is redundant, but I can see how
> someone could prefer writing it like that for clarity. It's not like 'h'
> --> np.int16 is obvious.
>
> Ralf
>


Also, we still need the number for some type codes such as 'a' to indicate
the length of the string.  I like the first solution much better.
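
A few examples of that distinction, as a minimal sketch (whether the
redundant suffix on intrinsic-size codes is accepted depends on the numpy
version, per the discussion above):

import numpy as np

print(np.dtype('h'))    # int16: 'h' carries an intrinsic size
print(np.dtype('>h2'))  # big-endian int16: the '2' is redundant here
print(np.dtype('a10'))  # 10-byte string: the number is required, since
                        # 'a' has no intrinsic size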

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 1D array sorting ascending and descending by fields

2012-06-04 Thread Benjamin Root
On Monday, June 4, 2012, Chris Barker wrote:

> On Mon, Jun 4, 2012 at 11:10 AM, Patrick Redmond 
> >
> wrote:
> > Here's how I sorted primarily by field 'a' descending and secondarily by
> > field 'b' ascending:
>
> could you multiply the numeric field by -1, sort, then put it back --
> somethign like:
>
> data *= -1
> data_sorted = np.sort(data, order=['a','b'])
> data_sorted *= -1
>
> (reverse if necessary -- I lost track...)
>
> -Chris



While that may work for this user's case, it would not work for all
dtypes.  Some, such as timedelta, datetime, and strings, cannot be
multiplied by a number.

Would be an interesting feature to add, but I am not certain if the
negative sign notation would be best. Is it possible for a named field to
start with a negative sign?
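
For numeric keys there is also a workaround that avoids mutating the data:
lexsort with a negated primary key.  A minimal sketch, reusing the field
names 'a' and 'b' from this thread:

import numpy as np

data = np.array([(1, 9), (3, 2), (1, 4), (3, 7)],
                dtype=[('a', 'i4'), ('b', 'i4')])

# lexsort sorts by its LAST key first, so the primary key goes last;
# negating 'a' turns its ascending sort into a descending one
order = np.lexsort((data['b'], -data['a']))
print(data[order])  # 'a' descending, ties broken by 'b' ascending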

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] 1D array sorting ascending and descending by fields

2012-06-05 Thread Benjamin Root
On Tue, Jun 5, 2012 at 10:49 AM, Nathaniel Smith  wrote:

> On Tue, Jun 5, 2012 at 1:17 AM, Benjamin Root  wrote:
> >
> >
> > On Monday, June 4, 2012, Chris Barker wrote:
> >>
> >> On Mon, Jun 4, 2012 at 11:10 AM, Patrick Redmond 
> >> wrote:
> >> > Here's how I sorted primarily by field 'a' descending and secondarily
> by
> >> > field 'b' ascending:
> >>
> >> could you multiply the numeric field by -1, sort, then put it back --
> >> somethign like:
> >>
> >> data *= -1
> >> data_sorted = np.sort(data, order=['a','b'])
> >> data_sorted *= -1
> >>
> >> (reverse if necessary -- I lost track...)
> >>
> >> -Chris
> >
> >
> >
> > While that may work for this users case, that would not work for all
> dtypes.
> > Some, such as timedelta, datetime and strings would not be able to be
> > multiplied by a number.
> >
> > Would be an interesting feature to add, but I am not certain if the
> negative
> > sign notation would be best. Is it possible for a named field to start
> with
> > a negative sign?
>
> Maybe add a reverse= argument (named after the corresponding argument
> to list.sort and __builtins__.sorted).
>
> # sorts in descending order, no fields required
> np.sort([10, 20, 0], reverse=True)
> # sorts in descending order
> np.sort(rec_array, order=("a", "b"), reverse=True)
> # ascending by "a" then descending by "b"
> np.sort(rec_array, order=("a", "b"), reverse=(False, True))
>
> ?
>
> -n
>

Clear, unambiguous, and works with the existing framework.

+1

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] varargs for logical_or, etc

2012-06-05 Thread Benjamin Root
On Tue, Jun 5, 2012 at 10:37 AM, Robert Kern  wrote:

> On Tue, Jun 5, 2012 at 2:54 PM, Neal Becker  wrote:
> > I think it's unfortunate that functions like logical_or are limited to
> binary.
> >
> > As a workaround, I've been using this:
> >
> > def apply_binary (func, *args):
> >if len (args) == 1:
> >return args[0]
> >elif len (args) == 2:
> >return func (*args)
> >else:
> >return func (
> >apply_binary (func, *args[:len(args)/2]),
> >apply_binary (func, *args[(len(args))/2:]))
> >
> > Then for example:
> >
> > punc2 = np.logical_and (u % 5 == 4,
> >   apply_binary (np.logical_or, u/5 == 3, u/5 == 8,
> u/5 ==
> > 13))
>
>
> reduce(np.logical_and, args)
>
>
I would love it if we could add something like that to the doc-string of
those functions because I don't think it is immediately obvious.  How do we
do that for ufuncs?
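
For reference, a minimal sketch of both spellings (the variable u follows
the quoted code):

from functools import reduce
import numpy as np

u = np.arange(100)

# fold the binary ufunc over a list of conditions
punc = reduce(np.logical_or, [u // 5 == 3, u // 5 == 8, u // 5 == 13])

# equivalently, every ufunc exposes a .reduce method
punc2 = np.logical_or.reduce([u // 5 == 3, u // 5 == 8, u // 5 == 13])
print(np.array_equal(punc, punc2))  # True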

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] boolean indexing of structured arrays

2012-06-06 Thread Benjamin Root
Not sure if this is a bug or not.  I am using a fairly recent master branch.

>>> # Setting up...
>>> import numpy as np
>>> a = np.zeros((10, 1), dtype=[('foo', 'f4'), ('bar', 'f4'), ('spam',
'f4')])
>>> a['foo'] = np.random.random((10, 1))
>>> a['bar'] = np.random.random((10, 1))
>>> a['spam'] = np.random.random((10, 1))
>>> a
array([[(0.8748096823692322, 0.08278043568134308, 0.2463584989309311)],
   [(0.27129432559013367, 0.9645473957061768, 0.41787904500961304)],
   [(0.4902191460132599, 0.6772263646125793, 0.07460898905992508)],
   [(0.13542482256889343, 0.8646988868713379, 0.98673015832901)],
   [(0.6527929902076721, 0.7392181754112244, 0.5919206738471985)],
   [(0.11248272657394409, 0.5818713903427124, 0.9287213087081909)],
   [(0.47561103105545044, 0.48848700523376465, 0.7108170390129089)],
   [(0.47087424993515015, 0.6080209016799927, 0.6583810448646545)],
   [(0.08447299897670746, 0.39479559659957886, 0.13520188629627228)],
   [(0.7074970006942749, 0.8426893353462219, 0.19329732656478882)]],
  dtype=[('foo', '<f4'), ('bar', '<f4'), ('spam', '<f4')])
>>> b = (a['bar'] > 0.4)
>>> b
array([[False],
   [ True],
   [ True],
   [ True],
   [ True],
   [ True],
   [ True],
   [ True],
   [False],
   [ True]], dtype=bool)
>>> #  Boolean indexing of structured array with a (10,1) boolean array

>>> a[b]['foo']
array([ 0.27129433,  0.49021915,  0.13542482,  0.65279299,  0.11248273,
0.47561103,  0.47087425,  0.707497  ], dtype=float32)
>>> #  Boolean indexing of structured array with a (10,) boolean array

>>> a[b[:,0]]['foo']
array([[(0.27129432559013367, 0.9645473957061768, 0.41787904500961304)],
   [(0.4902191460132599, 0.6772263646125793, 0.07460898905992508)],
   [(0.13542482256889343, 0.8646988868713379, 0.98673015832901)],
   [(0.6527929902076721, 0.7392181754112244, 0.5919206738471985)],
   [(0.11248272657394409, 0.5818713903427124, 0.9287213087081909)],
   [(0.47561103105545044, 0.48848700523376465, 0.7108170390129089)],
   [(0.47087424993515015, 0.6080209016799927, 0.6583810448646545)],
   [(0.7074970006942749, 0.8426893353462219, 0.19329732656478882)]],
  dtype=[('foo', '<f4'), ('bar', '<f4'), ('spam', '<f4')])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Good way to develop numpy as popular choice!

2012-06-21 Thread Benjamin Root
On Thursday, June 21, 2012, Robert Kern wrote:

> On Thu, Jun 21, 2012 at 7:33 PM, eat wrote:
> > Heh,
> >
> > On Thu, Jun 21, 2012 at 6:03 PM, Robert Kern wrote:
> >>
> >> On Thu, Jun 21, 2012 at 3:59 PM, bob tnur wrote:
> >> > Hi all numpy fun;)
> >> > This question is already posted in stackoverflow by some people, I am
> >> > just
> >> > thinking that numpy python will do this with trick;) I guess numpy
> will
> >> > be
> >> > every ones choice as its popularity increases. The question is herein:
> >> >
> >> >
> http://stackoverflow.com/questions/10074270/how-can-i-find-the-minimum-number-of-lines-needed-to-cover-all-the-zeros-in-a-2
> >>
> >> My "numpy solution" for this is just
> >>
> >>  $ pip install munkres
> >
> > munkres seems to be a pure python implementation ;-).
>
> Oops! I could have sworn that I once tried one named munkres that used
> numpy. But that was several years ago.
>
>
There is a development branch of sk-learn with an implementation of the
hungarian assignment solver using numpy. It will even do non-square
matrices and matrices with an empty dimension.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Good way to develop numpy as popular choice!

2012-06-22 Thread Benjamin Root
On Fri, Jun 22, 2012 at 9:42 AM, eat  wrote:

> Hi,
>
> On Fri, Jun 22, 2012 at 7:51 AM, Gael Varoquaux <
> gael.varoqu...@normalesup.org> wrote:
>
>> On Thu, Jun 21, 2012 at 08:59:09PM -0400, Benjamin Root wrote:
>> >  > munkres seems to be a pure python implementation ;-).
>>
>> >  Oops! I could have sworn that I once tried one named munkres that
>> used
>> >  numpy. But that was several years ago.
>>
>> >There is a development branch of sk-learn with an implementation of
>> the
>> >hungarian assignment solver using numpy. It will even do non-square
>> >matrices and matrices with an empty dimension.
>>
>> Yes, absolutely, thanks to Ben:
>>
>> https://github.com/GaelVaroquaux/scikit-learn/blob/hungarian/sklearn/utils/hungarian.py
>> I never merged this in the main scikit-learn tree, because munkres is not
>> used so far. Maybe I should merge it in the main tree, or maybe it should
>> be added to scipy or numpy.
>>
> I made some simple timing comparisons (see attached picture) between the
> numpy-based hungarian and the pure-python shortest-path-based hungarian_sp.
> It seems that the pure-python implementation outperforms the numpy-based
> implementation. Timings are averaged over five runs.
>
> The difference cannot totally be explained by the different algorithms
> (although the shortest-path-based one seems to scale better).  Rather, the
> heavy access to rows and columns seems to favor lists of lists. So this
> type of algorithm may indeed be a real challenge for numpy.
>
>
eat,

Thanks for that analysis.  Personally, I never needed high-performance so I
never bothered to optimize it.  However, it does appear that there is an
order-of-magnitude difference between the two, and so it might be worth it
to see what can be done to fix that.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Good way to develop numpy as popular choice!

2012-06-22 Thread Benjamin Root
On Fri, Jun 22, 2012 at 10:25 AM, Travis Oliphant wrote:

> Accessing individual elements of NumPy arrays is slower than accessing
> individual elements of lists --- around 2.5x-3x slower.NumPy has to do
> more work to figure out what kind of indexing you are trying to do because
> of its flexibility.   It also has to create the Python object to return.
>  In contrast, the list approach already has the Python objects created and
> you are just returning pointers to them and there is much less flexibility
> in the kinds of indexing you can do.
>
> Simple timings show that a.item(i,j) is about 2x slower than list element
> access (but faster than a[i,j] which is about 2.5x to 3x slower).   The
> slowness of a.item is due to the need to create the Python object to return
> (there are just raw bytes there) so it gives some idea of the relative cost
> of each part of the slowness of a[i,j].
>
> Also, math on the array scalars returned from NumPy will be slower than
> math on integers and floats --- because NumPy re-uses the ufunc machinery
> which is not optimized at all for scalars.
>
> The take-away is that NumPy is built for doing vectorized operations on
> bytes of data.   It is not optimized for doing element-by-element
> individual access.The right approach there is to just use lists (or use
> a version specialized for the kind of data in the lists that removes the
> boxing and unboxing).
>
> Here are my timings using IPython for NumPy indexing:
>
> 1-D:
>
> In [2]: a = arange(100)
>
> In [3]: %timeit [a.item(i) for i in xrange(100)]
> 1 loops, best of 3: 25.6 us per loop
>
> In [4]: %timeit [a[i] for i in xrange(100)]
> 1 loops, best of 3: 31.8 us per loop
>
> In [5]: al = a.tolist()
>
> In [6]: %timeit [al[i] for i in xrange(100)]
> 10 loops, best of 3: 10.6 us per loop
>
>
>
> 2-D:
>
> In [7]: a = arange(100).reshape(10,10)
>
> In [8]: al = a.tolist()
>
> In [9]: %timeit [al[i][j] for i in xrange(10) for j in xrange(10)]
> 1 loops, best of 3: 18.6 us per loop
>
> In [10]: %timeit [a[i,j] for i in xrange(10) for j in xrange(10)]
> 1 loops, best of 3: 44.4 us per loop
>
> In [11]: %timeit [a.item(i,j) for i in xrange(10) for j in xrange(10)]
> 1 loops, best of 3: 34.2 us per loop
>
>
>
> -Travis
>
>
However, what is the timing/memory cost of converting a large numpy array
that already exists into python list of lists?  If all my processing before
the munkres step is using NumPy, converting it into python lists has a
cost.  Also, your timings indicate only ~2x slowdown, while the timing
tests done by eat show an order-of-magnitude difference.  I suspect there
is great room for improvement before even starting to worry about the array
access issues.
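
One way to measure that conversion cost directly, as a minimal sketch (the
array size is arbitrary):

import timeit
import numpy as np

a = np.arange(500 * 500).reshape(500, 500)

# average cost of converting an existing 2-D array to a list of lists
t = timeit.timeit(a.tolist, number=10) / 10
print("tolist: %.4f seconds per call" % t)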

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Created NumPy 1.7.x branch

2012-06-25 Thread Benjamin Root
On Mon, Jun 25, 2012 at 1:41 PM, Travis Oliphant wrote:

>
>> C was famous for bugs due to the lack of function prototypes. This was
>> fixed with C99 and the stricter typing was a great help.
>>
>>
>> Bugs are not "due to lack of function prototypes".  Bugs are due to
>> mistakes that programmers make (and I know all about mistakes programmers
>> make).  Function prototypes can help detect some kinds of mistakes which is
>> helpful.   But, this doesn't help the question of how to transition a
>> weakly-typed program or whether or not that is even a useful exercise.
>>
>
> Oh, come on. Writing correct C code used to be a guru exercise. A friend
> of mine, a Putnam fellow, was the Weitek guru for drivers. To say bugs are
> programmer mistakes is information free, the question is how to minimize
> programmer mistakes.
>
>
> Bugs *are* programmer mistakes.   Let's put responsibility where it lies.
>   Of course, writing languages that help programmers make fewer mistakes
> (or catch them earlier when they do) are a good thing.I'm certainly not
> arguing against that.
>
> But, I reiterate that just because a better way to write new code under
> some metric is discovered or understood does not mean that all current code
> should be re-written to use that style.   That's the only comment I'm
> making.
>
> Also, you mention the lessons from Python 2 and Python 3, but I'm not sure
> we would agree on what those lessons actually were, so I wouldn't rely on
> that as a way of getting your point across if it matters.
>
> Best,
>
> -Travis
>
>
At the risk of starting a language flame war, my take on Charles' comment
about the lessons of Python 3 is its success in getting packages
transitioned smoothly (still an on-going process), versus what happened
with Perl 6.  Perl 6 was a major change that happened all at once and
no-one adopted it for the longest time.  Meanwhile, Python incremented
itself from the 2.x series to the 3.x series in a very nice manner, with a
well-thought-out plan that was visible to all.

At least, that is my understanding and perception. Take it with as much
salt as you (or your doctor) desires.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Created NumPy 1.7.x branch

2012-06-26 Thread Benjamin Root
On Tue, Jun 26, 2012 at 12:48 PM, David Cournapeau wrote:

> On Tue, Jun 26, 2012 at 5:24 PM, Travis Oliphant 
> wrote:
> >
> >> Let us note that that problem was due to Travis convincing David to
> >> include the Datetime work in the release against David's own best
> judgement.
> >> The result was a delay of several months until Ralf could get up to
> speed
> >> and get 1.4.1 out. Let us also note that poly1d is actually not the
> same as
> >> Matlab poly1d.
> >>
> >>
> >> This is not accurate, Charles.  Please stop trying to dredge up old
> >> history you don't know the full story about and are trying to create an
> >> alternate reality about.   It doesn't help anything and is quite
> poisonous
> >> to this mailing list.
> >
> >
> > I didn't start the discussion of 1.4, nor did I raise the issue at the
> time
> > as I didn't think it would be productive. We moved forward. But in any
> case,
> > I asked David at the time why the datetime stuff got included. I'd
> welcome
> > your version if you care to offer it. That would be more useful than
> > accusing me of creating an alternative reality and would clear the air.
> >
> >
> > The datetime stuff got included because it is a very useful and important
> > feature for multiple users.   It still needed work, but it was in a state
> > where it could be tried.   It did require breaking ABI compatibility in
> the
> > state it was in.   My approach was to break ABI compatibility and move
> > forward (there were other things we could do at the time that are still
> > needed in the code base that will break ABI compatibility in the future).
> >  David didn't want to break ABI compatibility and so tried to satisfy two
> > competing desires in a way that did not ultimately work. These things
> > happen.We all get to share responsibility for the outcome.
>
> I think Chuck alludes to the fact that I was rather reserved about
> merging datetime before *anyone* knew about breaking the ABI. I don't
> feel responsible for this issue (except I maybe should have pushed
> more strongly about datetime being included), but I am also not
> interested in making a big deal out of it, certainly not two years
> after the fact. I am merely point this out so that you realize that
> you may both have a different view that could be seen as valid
> depending on what you are willing to highlight.
>
> I suggest that Chuck and you take this off-list,
>
> David
>

Or, we could raise funds for NumFOCUS by selling tickets for a brawl
between the two at SciPy2012...

I kid, I kid!

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Matrix rank default tolerance - is it too low?

2012-06-26 Thread Benjamin Root
On Tuesday, June 26, 2012, Charles R Harris wrote:

>
>
> On Tue, Jun 26, 2012 at 3:42 PM, Matthew Brett wrote:
>
> Hi,
>
> On Mon, Jun 18, 2012 at 3:50 PM, Matthew Brett 
> wrote:
> > Hi,
> >
> > On Sun, Jun 17, 2012 at 7:22 PM, Charles R Harris
> >  wrote:
> >>
> >>
> >> On Sat, Jun 16, 2012 at 2:33 PM, Matthew Brett  >
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On Sat, Jun 16, 2012 at 8:03 PM, Matthew Brett <
> matthew.br...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > On Sat, Jun 16, 2012 at 10:40 AM, Nathaniel Smith 
> wrote:
> >>> >> On Fri, Jun 15, 2012 at 4:10 AM, Charles R Harris
> >>> >>  wrote:
> >>> >>>
> >>> >>>
> >>> >>> On Thu, Jun 14, 2012 at 8:06 PM, Matthew Brett
> >>> >>> 
> >>> >>> wrote:
> >>> 
> >>>  Hi,
> >>> 
> >>>  I noticed that numpy.linalg.matrix_rank sometimes gives full rank
> for
> >>>  matrices that are numerically rank deficient:
> >>> 
> >>>  If I repeatedly make random matrices, then set the first column
> to be
> >>>  equal to the sum of the second and third columns:
> >>> 
> >>>  def make_deficient():
> >>> X = np.random.normal(size=(40, 10))
> >>> deficient_X = X.copy()
> >>> deficient_X[:, 0] = deficient_X[:, 1] + deficient_X[:, 2]
> >>> return deficient_X
> >>> 
> >>>  then the current numpy.linalg.matrix_rank algorithm returns full
> rank
> >>>  (10) in about 8 percent of cases (see appended script).
> >>> 
> >>>  I think this is a tolerance problem.  The ``matrix_rank``
> algorithm
> >>>  does this by default:
> >>> 
> >>>  S = spl.svd(M, compute_uv=False)
> >>>  tol = S.max() * np.finfo(S.dtype).eps
> >>>  return np.sum(S > tol)
> >>> 
> >>>  I guess we'd we want the lowest tolerance that nearly always or
> >>>  always
> >>>  identifies numerically rank deficient matrices.  I suppose one
> way of
> >>>  looking at whether the tolerance is in the right range is to
> compare
> >>>  the calculated tolerance (``tol``) to the minimum singular value
> >>>  (``S.min()``) because S.min() in our case should be very small and
> >>>  indicate the rank deficiency. The mean value of tol / S.min() for
> the
> >>>  current algorithm, across many iterations, is about 2.8.  We might
> >>>  hope this value would be higher than 1, but not much higher,
> >>>  otherwise
> >>>  we might be rejecting too many columns.
> >>> 
> >>>  Our current algorithm for tolerance is the same as the 2-norm of
> M *
> >>>  eps.  We're citing Golub and Van Loan for this, but now I look at
> our
> >>>  copy (p 261, last para) - they seem to be suggesting using u * |M|
> >>>  where u = (p 61, section 2.4.2) eps /  2. (see [1]). I think the
> >>>  Golub
>
>
> I'm fine with that, and agree that it is likely to lead to fewer folks
> wondering why Matlab and numpy are different. A good explanation in the
> function documentation would be useful.
>
> Chuck
>
>
One potential problem is that it implies that numpy's tolerance will always
match whatever Matlab's current tolerance is.  What if they change it in a
future release?  How likely are we to even notice?
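
One way to insulate a script from any such drift is to pass the tolerance
explicitly; matrix_rank accepts a tol argument.  A minimal sketch, using
the rank-deficient construction quoted above:

import numpy as np

X = np.random.normal(size=(40, 10))
X[:, 0] = X[:, 1] + X[:, 2]  # numerically rank-deficient by construction

S = np.linalg.svd(X, compute_uv=False)
tol = S.max() * max(X.shape) * np.finfo(S.dtype).eps  # MATLAB-style tolerance

print(np.linalg.matrix_rank(X, tol=tol))  # expected: 9, pinned regardless
                                          # of numpy's default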

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Created NumPy 1.7.x branch

2012-06-26 Thread Benjamin Root
On Tuesday, June 26, 2012, Thouis (Ray) Jones wrote:

> On Tue, Jun 26, 2012 at 10:11 PM, Jason Grout
> > wrote:
> > On 6/26/12 3:06 PM, Dag Sverre Seljebotn wrote:
> >> Something the Sage project does very well is meeting often in person
> >
> > Another thing we have that has improved the mailing list climate is a
> > "sage-flame" list [1]
>
> +1 !
>
> Speaking as someone trying to get started in contributing to numpy, I
> find this discussion extremely off-putting.  It's childish,
> meaningless, and spiteful, and I think it's doing more harm than any
> possible good that could come out of continuing it.


And if you still feel dissuaded from contributing here, you are always
welcome over at the matplotlib lists.



Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "import numpy" performance

2012-07-02 Thread Benjamin Root
On Mon, Jul 2, 2012 at 4:34 PM, Nathaniel Smith  wrote:

> On Mon, Jul 2, 2012 at 8:17 PM, Andrew Dalke 
> wrote:
> > In this email I propose a few changes which I think are minor
> > and which don't really affect the external NumPy API but which
> > I think could improve the "import numpy" performance by at
> > least 40%. This affects me because I and my clients use a
> > chemistry toolkit which uses only NumPy arrays, and where
> > we run short programs often on the command-line.
> >
> >
> > In July of 2008 I started a thread about how "import numpy"
> > was noticeably slow for one of my customers. They had
> > chemical analysis software, often even run on a single
> > molecular structure using command-line tools, and the
> > several invocations with 0.1 seconds overhead was one of
> > the dominant costs even when numpy wasn't needed.
> >
> > I fixed most of their problems by deferring numpy imports
> > until needed. I remember well the Steve Jobs anecdote at
> >
> http://folklore.org/StoryView.py?project=Macintosh&story=Saving_Lives.txt
> > and spent another day of my time in 2008 to identify the
> > parts of the numpy import sequence which seemed excessive.
> > I managed to get the import time down from 0.21 seconds to
> > 0.08 seconds.
> >
> > Very little of that made it into NumPy.
> >
> >
> > The three biggest changes I would like are:
> >
> > 1) remove "add_newdocs" and put the docstrings in the C code
> >  'add_newdocs' still needs to be there,
> >
> > The code says:
> >
> > # This is only meant to add docs to objects defined in C-extension
> modules.
> > # The purpose is to allow easier editing of the docstrings without
> > # requiring a re-compile.
> >
> > However, the change log shows that there are relatively few commits
> > to this module
> >
> >   Year    Number of commits
> >   =========================
> >   2012   8
> >   2011  62
> >   2010   9
> >   2009  18
> >   2008  17
> >
> > so I propose moving the docstrings to the C code, and perhaps
> > leaving 'add_newdocs' there, but only used when testing new
> > docstrings.
>
> I don't have any opinion on how acceptable this would be, but I also
> don't see a benchmark showing how much this would help?
>
> > 2) Don't optimistically assume that all submodules are
> > needed. For example, some current code uses
> >
>  import numpy
>  numpy.fft.ifft
> > 
> >
> > (See a real-world example at
> >
> http://stackoverflow.com/questions/10222812/python-numpy-fft-and-inverse-fft
> > )
> >
> > IMO, this optimizes the needs of the interactive shell
> > NumPy author over the needs of the many-fold more people
> > who don't spend their time in the REPL and/or don't need
> > those extra features added to every NumPy startup. Please
> > bear in mind that NumPy users of the first category will
> > be active on the mailing list, go to SciPy conferences,
> > etc. while members of the second category are less visible.
> >
> > I recognize that this is backwards incompatible, and will
> > not change. However, I understand that "NumPy 2.0" is a
> > glimmer in the future, which might be a natural place for
> > a transition to the more standard Python style of
> >
> >   from numpy import fft
> >
> > Personally, I think the documentation now (if it doesn't
> > already) should transition to use this form.
>
> I think this ship has sailed, but it'd be worth looking into lazy
> importing, where 'numpy.fft' isn't actually imported until someone
> starts using it. There are a bunch of libraries that do this, and one
> would have to fiddle to get compatibility with all the different
> python versions and make sure you're not killing performance (might
> have to be in C) but something along the lines of
>
> class _FFTModule(object):
>   def __getattribute__(self, name):
>     mod = importlib.import_module("numpy.fft")
>     _FFTModule.__getattribute__ = mod.__getattribute__
>     return getattr(mod, name)
> fft = _FFTModule()
>
>
Not sure how this would impact projects like IPython that do
tab-completion support, but I know that it would drive me nuts in the
basic tab-completion setup I have for my regular Python terminal.  Of
course, in the grand scheme of things, that really isn't all that
important, I don't think.
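
For what it's worth, a runnable variant of that idea, as a minimal sketch
(the class and names are hypothetical).  The tab-completion drawback shows
up here too: dir(fft) reveals nothing useful until the first real attribute
access triggers the import.

import importlib

class _LazyModule(object):
    # placeholder that imports the real module on first attribute access
    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # __getattr__ only fires when normal lookup fails,
        # i.e. for the wrapped module's contents
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)

fft = _LazyModule("numpy.fft")
print(fft.ifft)  # numpy.fft is imported here, not at startup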

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Type specific sorts: objects, structured arrays, and all that.

2012-07-10 Thread Benjamin Root
On Tue, Jul 10, 2012 at 3:37 AM, Robert Kern  wrote:

> On Tue, Jul 10, 2012 at 4:32 AM, Charles R Harris
>  wrote:
> > Hi All,
> >
> > I've been adding type specific sorts for object and structured arrays. It
> > seems that datetime64 and timedelta64 are also not supported. Is there
> any
> > reason why those types should not be sorted as int64?
>
> You need special handling for NaTs to be consistent with how we deal
> with NaNs in floats.
>
>
Not sure if this is an issue or not, but different datetime64 objects can
be set for different units:
http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html#datetime-units.
A straight-out comparison of the values as int64 would likely drop the
units, correct?  On second thought, though, all the datetime64's in a given
numpy array would have the same units, so it shouldn't matter, right?

Just thinking aloud.
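
A minimal sketch of that observation (leaving the NaT question aside):

import numpy as np

a = np.array(['2008-08-18', '2008-08-17', '2008-08-23'],
             dtype='datetime64[D]')

# a single array carries exactly one unit, so the raw int64 values
# sort in the same order as the dates themselves
order = np.argsort(a.view('int64'))
print(a[order])  # ['2008-08-17' '2008-08-18' '2008-08-23']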

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Looking for the most important bugs, documentation needs, etc.

2012-07-10 Thread Benjamin Root
On Tue, Jul 10, 2012 at 6:07 AM, Ralf Gommers
wrote:

>
>
> On Tue, Jul 10, 2012 at 11:36 AM, Ralf Gommers <
> ralf.gomm...@googlemail.com> wrote:
>
>>
>>
>> On Tue, Jul 10, 2012 at 4:20 AM, Six Silberman 
>> wrote:
>>
>>> Hi all,
>>>
>>> Some colleagues and I are interested in contributing to numpy. We have
>>> a range of backgrounds -- I for example am new to contributing to open
>>> source software but have a (small) bit of background in scientific
>>> computation, while others have extensive experience contributing to
>>> open source projects. We've looked at the issue tracker and submitted
>>> a couple patches today but we would be interested to hear what active
>>> contributors to the project consider the most pressing, important,
>>> and/or interesting needs at the moment. I personally am quite
>>> interested in hearing about the most pressing documentation needs
>>> (including example code).
>>>
>>
>> As for important issues, I think many of them are related to the core of
>> numpy. But there's some more isolated ones, which is probably better to get
>> started. Here are some that are high on my list of things to fix/improve:
>>
>> - Numpy doesn't work well (or at all) on OS X 10.7 when built with
>> llvm-gcc, which is the default compiler on that platform. With Clang it
>> seems to work fine. Same for Scipy.
>> http://projects.scipy.org/numpy/ticket/1951
>>
>> - We don't have binary installers for Python 3.x on OS X yet. This
>> requires adapting the installer build scripts that work for 2.x. See
>> pavement.py in the base dir of the repo.
>>
>> - Something that's more straightforward: improving test coverage. It's
>> lacking in a number of places; one of the things that comes to mind is that
>> all functions should be tested for correct behavior with empty input.
>> Normally the expected behavior is empty in --> empty out. When that's not
>> tested, we get things like http://projects.scipy.org/numpy/ticket/2078.
>> Ticket for "empty" test coverage:
>> http://projects.scipy.org/numpy/ticket/2007
>>
>> - There's a large amount of "normal" bugs, working on any of those would
>> be very helpful too. Hard to say here which ones out of the several hundred
>> are important. It is safe to say though I think that the ones requiring
>> touching the C code are more in need of attention than the pure Python ones.
>>
>>
>> I see a patch for f2py already, and a second ticket opened. This is of
>> course useful, but not too many devs are working on it. Unless Pearu has
>> time to respond this week, it may be hard to get feedback on that topic
>> quickly.
>>
>
> Here are some relatively straightforward issues which only require
> touching Python code:
>
> http://projects.scipy.org/numpy/ticket/808
> http://projects.scipy.org/numpy/ticket/1968
> http://projects.scipy.org/numpy/ticket/1976
> http://projects.scipy.org/numpy/ticket/1989
>
> And a Cython one (numpy.random):
> http://projects.scipy.org/numpy/ticket/1492
>
> I ran into one more patch that I assume one of you just attached:
> http://projects.scipy.org/numpy/ticket/2074. It's important to understand
> a little of how our infrastructure works. We changed to git + github last
> year; submitting patches as pull requests on Github has the lowest overhead
> for us, and we get notifications. For patches on Trac, we have to manually
> download and apply them. Plus we don't get notifications, which is quite
> unhelpful unfortunately. Therefore I suggest using git, and if you can't or
> you feel that the overhead / learning curve is too large, please ping this
> mailing list about patches you submit on Trac.
>
> Cheers,
> Ralf
>
>
By the way, for those who are looking to learn how to use git and github:

https://github.com/blog/1183-try-git-in-your-browser

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] build numpy 1.6.2

2012-07-10 Thread Benjamin Root
On Tue, Jul 10, 2012 at 2:45 PM, Prakash Joshi  wrote:

>  Hi All,
>
>  I built numpy 1.6.2 on 64-bit Linux and installed it in
> site-packages.  It passes all the numpy test cases, but I am not sure if
> this is a good build, as I did not specify any Fortran compiler during
> setup; I also do not have a Fortran compiler on my machine.
>
>  Thanks
> Prakash
>
>
NumPy does not need Fortran for its build.  SciPy, however, does.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] build numpy 1.6.2

2012-07-10 Thread Benjamin Root
Prakash,

On Tue, Jul 10, 2012 at 3:26 PM, Prakash Joshi  wrote:

>  Thanks Ben.
>
>  Also I did not specified any of BLAS, LAPACK, ATLAS libraries, do we
> need these libraries for numpy?
>

"Need", no, you do not "need" them in the sense that NumPy does not require
them to work.  NumPy will work just fine without those libraries.  However,
if you "want" them, then that is where the choice of Fortran compiler comes
in.  Look at the INSTALL.txt file for more detailed instructions.
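
To check after the fact which (if any) of those libraries a given build
actually picked up, numpy records its build configuration:

import numpy as np

# prints the BLAS/LAPACK/ATLAS information detected at build time;
# empty sections mean numpy fell back to its own unoptimized routines
np.show_config()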


>  I simply used following command to build:
>   python setup.py build
>   python setup.py install --prefix=/usr/local
>
>  If the above commands are sufficient, then I hope the same build steps
> will work on Mac OS X?
>
>
That entirely depends on your development setup on your Mac.  I will leave
that discussion up to others on the list to answer.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Remove current 1.7 branch?

2012-07-12 Thread Benjamin Root
On Thursday, July 12, 2012, Thouis (Ray) Jones wrote:

> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris
> > wrote:
> > Hi All,
> >
> > Travis and I agree that it would be appropriate to remove the current
> 1.7.x
> > branch and branch again after a code freeze. That way we can avoid the
> pain
> > and potential errors of backports. It is considered bad form to mess with
> > public repositories that way, so another option would be to rename the
> > branch, although I'm not sure how well that would work. Suggestions?
>
> I might be mistaken, but if the branch is merged into master (even if
> that merge makes no changes), I think it's safe to delete it at that
> point (and recreate it at a later date with the same name) with
> regards to remote repositories.  It should be fairly easy to test.
>
> Ray Jones


No, that is not the case.  We had a situation occur a while back where one
of the public branches of mpl got completely messed up.  You can't even
rename it, since the rename doesn't propagate through pulls and merges.

What we ended up doing was creating a brand new branch "v1.0.x-maint" and
making sure all the devs knew to switch over to that.  You might even go a
step further and make a final commit to the bad branch that makes the build
fail with a big note explaining what to do.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Remove current 1.7 branch?

2012-07-12 Thread Benjamin Root
On Thursday, July 12, 2012, Nathaniel Smith wrote:

> On Thu, Jul 12, 2012 at 12:48 PM, Benjamin Root 
> >
> wrote:
> >
> >
> > On Thursday, July 12, 2012, Thouis (Ray) Jones wrote:
> >>
> >> On Thu, Jul 12, 2012 at 1:28 AM, Charles R Harris
> >> > wrote:
> >> > Hi All,
> >> >
> >> > Travis and I agree that it would be appropriate to remove the current
> >> > 1.7.x
> >> > branch and branch again after a code freeze. That way we can avoid the
> >> > pain
> >> > and potential errors of backports. It is considered bad form to mess
> >> > with
> >> > public repositories that way, so another option would be to rename the
> >> > branch, although I'm not sure how well that would work. Suggestions?
> >>
> >> I might be mistaken, but if the branch is merged into master (even if
> >> that merge makes no changes), I think it's safe to delete it at that
> >> point (and recreate it at a later date with the same name) with
> >> regards to remote repositories.  It should be fairly easy to test.
> >>
> >> Ray Jones
> >
> >
> > No, that is not the case.  We had a situation occur awhile back where
> one of
> > the public branches of mpl got completely messed up.  You can't even
> rename
> > it since the rename doesn't occur in the pulls and merges.
> >
> > What we ended up doing was creating a brand new branch "v1.0.x-maint" and
> > making sure all the devs knew to switch over to that.  You might even go
> a
> > step further and make a final commit to the bad branch that makes the
> build
> > fail with a big note explaining what to do.
>
> The branch isn't bad, it's just out of date. So long as the new
> version of the branch has the current version of the branch in its
> ancestry, then everything will be fine.
>
> Option 1:
>   git checkout master
>   git merge maint1.7.x
>   git checkout maint1.7.x
>   git merge master # will be a fast-forward
>
> Option 2:
>   git checkout master
>   git merge maint1.7.x
>   git branch -d maint1.7.x  # delete the branch
>   git checkout -b maint1.7.x  # recreate it
>
> In git terms these two options are literally identical; they result in
> the exact same repo state...
>
> -N


Ah, I misunderstood.  Then yes, I think this is correct.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] use slicing as argument values?

2012-07-12 Thread Benjamin Root
On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE  wrote:

> Dear all,
>
> I want to create a function, and I would like one of the arguments of the
> function to determine what slicing of a numpy array I want to use.
> a simple example:
>
> a=np.arange(100).reshape(10,10)
>
> suppose I want to have an imaging function to show an image of part of
> this data:
>
> def show_part_of_data(m,n):
> plt.imshow(a[m,n])
>
> like I could give m=3:5, n=2:7, so that when I call the function as
> show_part_of_data(3:5,2:7), it means I try to do plt.imshow(a[3:5,2:7]).
> the above example doesn't work in reality, but it illustrates something
> similar to what I desire, that is, to specify what slicing of a
> numpy array I want by giving values to function arguments.
>
> thanks a lot,
>
> Chao
>
>

What you want to do is create slice objects.

a[3:5]

is equivalent to

sl = slice(3, 5)
a[sl]


and

a[3:5, 5:14]

is equivalent to

sl = (slice(3, 5), slice(5, 14))
a[sl]

Furthermore, notation such as "::-1" is equivalent to slice(None, None, -1)
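
As an aside, numpy also ships a small helper, np.s_ (and the equivalent
np.index_exp), that builds these slice tuples using the familiar colon
notation:

import numpy as np

sl = np.s_[3:5, 5:14]  # (slice(3, 5, None), slice(5, 14, None))
a = np.arange(100).reshape(10, 10)
print(np.array_equal(a[sl], a[3:5, 5:14]))  # True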

I hope this helps!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] use slicing as argument values?

2012-07-12 Thread Benjamin Root
On Thu, Jul 12, 2012 at 4:46 PM, Chao YUE  wrote:

> Hi Ben,
>
> it helps a lot. I am nearly finishing a function in a way I think
> pythonic.
> Just one more question, I have:
>
> In [24]: b=np.arange(1,11)
>
> In [25]: b
> Out[25]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
>
> In [26]: b[slice(1)]
> Out[26]: array([1])
>
> In [27]: b[slice(4)]
> Out[27]: array([1, 2, 3, 4])
>
> In [28]: b[slice(None,4)]
> Out[28]: array([1, 2, 3, 4])
>
> so slice(4) is actually slice(None, 4); how can I retrieve exactly
> a[4] using a slice object?
>
> thanks again!
>
> Chao
>
>
Tricky question.  Note the difference between

a[4]

and

a[4:5]

The first returns a scalar, while the second returns an array.  The first,
though, is not a slice, just an integer.

Also, note that the arguments for slice() behaves very similar to the
arguments for range() (with some exceptions/differences).
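
A minimal sketch of the distinction, including the mixed tuple form that a
string-parsing approach like the one later in this thread ultimately needs:

import numpy as np

b = np.arange(1, 11)
print(b[4])            # 5   -- scalar, from an integer index
print(b[slice(4, 5)])  # [5] -- length-1 array, from a slice

# slices and plain integers can be mixed in a single indexing tuple
a = np.arange(3 * 4 * 5).reshape(3, 4, 5)
idx = (slice(1, 3), slice(None), 4)
print(np.array_equal(a[idx], a[1:3, :, 4]))  # True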

Cheers!
Ben Root



> 2012/7/12 Benjamin Root 
>
>>
>>
>> On Thu, Jul 12, 2012 at 3:38 PM, Chao YUE  wrote:
>>
>>> Dear all,
>>>
>>> I want to create a function, and I would like one of the arguments of the
>>> function to determine what slicing of a numpy array I want to use.
>>> a simple example:
>>>
>>> a=np.arange(100).reshape(10,10)
>>>
>>> suppose I want to have an imaging function to show an image of part of
>>> this data:
>>>
>>> def show_part_of_data(m,n):
>>> plt.imshow(a[m,n])
>>>
>>> like I could give m=3:5, n=2:7, so that when I call the function as
>>> show_part_of_data(3:5,2:7), it means I try to do plt.imshow(a[3:5,2:7]).
>>> the above example doesn't work in reality, but it illustrates something
>>> similar to what I desire, that is, to specify what slicing of a
>>> numpy array I want by giving values to function arguments.
>>>
>>> thanks a lot,
>>>
>>> Chao
>>>
>>>
>>
>> What you want to do is create slice objects.
>>
>> a[3:5]
>>
>> is equivalent to
>>
>> sl = slice(3, 5)
>> a[sl]
>>
>>
>> and
>>
>> a[3:5, 5:14]
>>
>> is equivalent to
>>
>> sl = (slice(3, 5), slice(5, 14))
>> a[sl]
>>
>> Furthermore, notation such as "::-1" is equivalent to slice(None, None,
>> -1)
>>
>> I hope this helps!
>> Ben Root
>>
>>
>> ___
>> NumPy-Discussion mailing list
>> NumPy-Discussion@scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
>
> --
>
> ***
> Chao YUE
> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
> UMR 1572 CEA-CNRS-UVSQ
> Batiment 712 - Pe 119
> 91191 GIF Sur YVETTE Cedex
> Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16
>
> 
>
>
> ___
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] use slicing as argument values?

2012-07-12 Thread Benjamin Root
On Thursday, July 12, 2012, Chao YUE wrote:

> Thanks all for the discussion. Actually I am trying to use something like
> numpy ndarray indexing in the function. Like when I call:
>
> func(a,'1:3,:,2:4'), it knows I want to retrieve a[1:3,:,2:4], and
> func(a,'1:3,:,4') for a[1:3,:,4] etc.
> I am very close now.
>
> #so this function changes the string to a list of slice objects.
> def convert_string_to_slice(slice_string):
> """
> provide slice_string as '2:3,:', it will return [slice(2, 3, None),
> slice(None, None, None)]
> """
> slice_list=[]
> split_slice_string_list=slice_string.split(',')
> for sub_slice_string in split_slice_string_list:
> split_sub=sub_slice_string.split(':')
> if len(split_sub)==1:
> sub_slice=slice(int(split_sub[0]))
> else:
> if split_sub[0]=='':
> sub1=None
> else:
> sub1=int(split_sub[0])
> if split_sub[1]=='':
> sub2=None
> else:
> sub2=int(split_sub[1])
> sub_slice=slice(sub1,sub2)
> slice_list.append(sub_slice)
> return slice_list
>
> In [119]: a=np.arange(3*4*5).reshape(3,4,5)
>
> for this it works fine.
> In [120]: convert_string_to_slice('1:3,:,2:4')
> Out[120]: [slice(1, 3, None), slice(None, None, None), slice(2, 4, None)]
>
> In [121]: a[slice(1, 3, None), slice(None, None, None), slice(2, 4,
> None)]==a[1:3,:,2:4]
> Out[121]:
> array([[[ True,  True],
> [ True,  True],
> [ True,  True],
> [ True,  True]],
>
>[[ True,  True],
> [ True,  True],
> [ True,  True],
> [ True,  True]]], dtype=bool)
>
> And a problem happens when I want to retrieve a single number along a given
> dimension:
> because it treats 1:3,:,4 as 1:3,:,:4, as shown below:
>
> In [122]: convert_string_to_slice('1:3,:,4')
> Out[122]: [slice(1, 3, None), slice(None, None, None), slice(None, 4,
> None)]
>
> In [123]: a[1:3,:,4]
> Out[123]:
> array([[24, 29, 34, 39],
>[44, 49, 54, 59]])
>
> In [124]: a[slice(1, 3, None), slice(None, None, None), slice(None, 4,
> None)]
> Out[124]:
> array([[[20, 21, 22, 23],
> [25, 26, 27, 28],
> [30, 31, 32, 33],
> [35, 36, 37, 38]],
>
>[[40, 41, 42, 43],
> [45, 46, 47, 48],
> [50, 51, 52, 53],
> [55, 56, 57, 58]]])
>
>
> Then I have a function:
>
> #this function retrieves data from ndarray a by specifying slice_string:
> def retrieve_data(a,slice_string):
> slice_list=convert_string_to_slice(slice_string)
> return a[*slice_list]
>
> In the last line of the function "retrieve_data" I have a problem: I get an
> invalid syntax error.
>
> return a[*slice_list]
>  ^
> SyntaxError: invalid syntax
>
> I hope it's not too long, please comment as you like. Thanks a lot
>
> Chao


I won't comment on the wisdom of your approach, but for your very last part,
don't try unpacking the slice list.  Also, I think it has to be a tuple,
but I could be wrong on that.
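
Something along these lines might work (a rough sketch: it returns a tuple,
and treats a bare number like '4' as an integer index instead of slice(4),
which also fixes the '1:3,:,4' case above):

def convert_string_to_slice(slice_string):
    # returns a tuple of slice objects and plain integers
    # (no step support, same as the original)
    items = []
    for sub in slice_string.split(','):
        parts = sub.split(':')
        if len(parts) == 1:
            items.append(int(parts[0]))    # bare index: '4' -> 4
        else:
            start = int(parts[0]) if parts[0] else None
            stop = int(parts[1]) if parts[1] else None
            items.append(slice(start, stop))
    return tuple(items)

def retrieve_data(a, slice_string):
    # index with the tuple directly -- no unpacking needed
    return a[convert_string_to_slice(slice_string)]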

Ben Root

>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.complex

2012-07-23 Thread Benjamin Root
On Monday, July 23, 2012, OC wrote:

>  > It's unPythonic just in the sense that it is unlike every other type
>  > constructor in Python. int(x) returns an int, list(x) returns a list,
>  > but np.complex64(x) sometimes returns a np.complex64, and sometimes it
>  > returns a np.ndarray, depending on what 'x' is.
>
> This "object factory" design pattern adds useful and natural functionality.
>
>  > I can see an argument for deprecating this behaviour altogether and
>  > referring people to the np.asarray(x, dtype=complex) form; that would
>  > be cleaner and reduce confusion. Don't know if it's worth it, but
>  > that's the only cleanup that I can see even being considered for these
>  > constructors.
>
>  From my experience in teaching, I can tell that even beginners have no
> problem with the fact that "complex128(1)" returns a scalar and that
> "complex128(r_[1])" returns an array. It seems to be pretty natural.
>
> Also, from the duck-typing point of view, both returned values are
> complex, i.e. provide 'real' and 'imag' attributes and 'conjugate()'
> method.
>
> On the contrary, a real confusion is with "numpy.complex" acting
> differently than the other "numpy.complex*".
>
>  > People do write "from numpy import *"
>
> Yeah, that's what I do very often in interactive "ipython" sessions.
> Other than this, people are warned often enough that this shouldn't be
> used in real programs.



Don't be so sure of that.  The "pylab" mode from matplotlib has been both a
blessing and a curse.  This mode is very popular and for many, "it is all
they need/want to know".  While it has made the transition from other
languages easier, the polluted namespace comes at a small cost.

And it is only going to get worse when moving over to py3k where just about
everything is a generator.  __builtin__.any can handle generators, but
np.any does not.  Same goes for several other functions.
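
A quick illustration (the generator ends up wrapped in a 0-d object array,
which is truthy regardless of its contents):

>>> vals = [1, 2, 3]
>>> any(x > 5 for x in vals)     # the builtin consumes the generator
False
>>> np.any(x > 5 for x in vals)
True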

Note, I do agree with you that the discrepancy needs to be fixed, I just am
not sure which way.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Synonym standards

2012-07-26 Thread Benjamin Root
On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams  wrote:

>  It seems that these standards have been adopted, which is good:
>
> The following import conventions are used throughout the NumPy source and
> documentation:
>
> import numpy as np
> import matplotlib as mpl
> import matplotlib.pyplot as plt
>
> Source: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
>
>  Is there some similar standard for PyLab?
>
> Thanks,
>
> Colin W.
>
>

Colin,

Typically, with pylab mode of matplotlib, you do:

from pylab import *

This is essentially equivalent to:

from numpy import *
from matplotlib.pyplot import *

Note that the pylab "module" is actually a part of matplotlib and is a
shortcut to provide an environment that is very familiar to Matlab users.
Converts are then encouraged to use the imports you mentioned in order to
properly utilize python namespaces.

I hope that helps!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Synonym standards

2012-07-27 Thread Benjamin Root
On Thu, Jul 26, 2012 at 7:12 PM, Robert Kern  wrote:

> On Fri, Jul 27, 2012 at 12:05 AM, Colin J. Williams
>  wrote:
> > On 26/07/2012 4:57 PM, Benjamin Root wrote:
> >
> >
> > On Thu, Jul 26, 2012 at 4:45 PM, Colin J. Williams  wrote:
> >>
> >> It seems that these standards have been adopted, which is good:
> >>
> >> The following import conventions are used throughout the NumPy source
> and
> >> documentation:
> >>
> >> import numpy as np
> >> import matplotlib as mpl
> >> import matplotlib.pyplot as plt
> >>
> >> Source:
> >> https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
> >>
> >> Is there some similar standard for PyLab?
> >>
> >> Thanks,
> >>
> >> Colin W.
> >>
> >
> >
> > Colin,
> >
> > Typically, with pylab mode of matplotlib, you do:
> >
> > from pylab import *
> >
> > This is essentially equivalent to:
> >
> > from numpy import *
> > from matplotlib.pyplot import *
> >
> > Note that the pylab "module" is actually a part of matplotlib and is a
> > shortcut to provide an environment that is very familiar to Matlab users.
> > Converts are then encouraged to use the imports you mentioned in order to
> > properly utilize python namespaces.
> >
> > I hope that helps!
> > Ben Root
> >
> >
> >
> > ___
> > NumPy-Discussion mailing list
> > NumPy-Discussion@scipy.org
> > http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >
> > Thanks Ben,
> >
> > I would prefer not to use:  from xxx import *,
> >
> > because of the name pollution.
> >
> > The name  convention that I copied above facilitates avoiding the
> pollution.
> >
> > In the same spirit, I've used:
> > import pylab as plb
>
> But in that same spirit, using np and plt separately is preferred.
>
>
"Namespaces are one honking great idea -- let's do more of those!"
from http://www.python.org/dev/peps/pep-0020/

Absolutely correct.  The namespace pollution is exactly why we encourage
converts to move over from the pylab mode to separating out the numpy and
pyplot namespaces.  There are very subtle issues that arise when doing
"from pylab import *" such as overriding the built-in "any" and "all".  The
only real advantage of the pylab mode over separating out numpy and pyplot
is conciseness, which many matlab users expect at first.
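
For instance (sketch):

>>> any(x > 5 for x in [1, 2, 3])   # the builtin handles generators
False
>>> from pylab import *             # silently rebinds any/all to numpy's
>>> any(x > 5 for x in [1, 2, 3])
True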

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Benjamin Root
On Thu, Jul 26, 2012 at 2:33 PM, Phil Hodge  wrote:

> On a Linux machine:
>
>  > uname -srvop
> Linux 2.6.18-308.8.2.el5 #1 SMP Tue May 29 11:54:17 EDT 2012 x86_64
> GNU/Linux
>
> this example shows an apparent problem with the where function:
>
> Python 2.7.1 (r271:86832, Dec 21 2010, 11:19:43)
> [GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import numpy as np
>  >>> print np.__version__
> 1.5.1
>  >>> net = np.zeros(3, dtype='>f4')
>  >>> net[1] = 0.00458849
>  >>> net[2] = 0.605202
>  >>> max_net = net.max()
>  >>> test = np.where(net <= 0., max_net, net)
>  >>> print test
> [ -2.23910537e-35   4.58848989e-03   6.05202019e-01]
>
> When I specified the dtype for net as '>f8', test[0] was
> 3.46244974e+68.  It worked as expected (i.e. test[0] should be 0.605202)
> when I specified float(max_net) as the second argument to np.where.
>
> Phil
>

Confirmed with version 1.7.0.dev-470c857 on a CentOS6 64-bit machine.
Strange indeed.

Breaking it down further:

>>> res = (net <= 0.)
>>> print res
[ True False False]
>>> np.where(res, max_net, net)
array([ -2.23910537e-35,   4.58848989e-03,   6.05202019e-01], dtype=float32)

Very Strange...
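
As Phil noted, forcing the scalar to a native Python float sidesteps the bug:

test = np.where(net <= 0., float(max_net), net)  # test[0] is now 0.605202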

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] bug in numpy.where?

2012-07-27 Thread Benjamin Root
On Fri, Jul 27, 2012 at 3:58 PM, Andreas Mueller
wrote:

> Hi Everybody.
> The bug is that no error is raised, right?
> The docs say
>
> where(condition, [x, y])
>
> x, y : array_like, optional
>  Values from which to choose. `x` and `y` need to have the same
>  shape as `condition`
>
> In the example you gave, x was a scalar.
>
> Cheers,
> Andy
>

Hmm, that is incorrect, I believe.  I have used a scalar before.  Maybe it
works because a scalar is broadcastable to the same shape as any other
N-dim array?
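
A quick sanity check suggests scalars do broadcast fine:

>>> np.where(np.array([True, False]), 5, np.array([10, 20]))
array([ 5, 20])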

If so, then the wording of that docstring needs to be fixed.

No, I think Christopher hit it on the head.  For whatever reason, the
endian-ness somewhere is not being respected and causes a byte-swapped
version to show up.  How that happens, though, is beyond me.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] ANN: NumPy 1.7.0b1 release

2012-08-23 Thread Benjamin Root
On Tue, Aug 21, 2012 at 12:24 PM, Ondřej Čertík wrote:

> Hi,
>
> I'm pleased to announce the availability of the first beta release of
> NumPy 1.7.0b1.
>
> Sources and binary installers can be found at
> https://sourceforge.net/projects/numpy/files/NumPy/1.7.0b1/
>
> Please test this release and report any issues on the numpy-discussion
> mailing list. The following problems are known and
> we'll work on fixing them before the final release:
>
> http://projects.scipy.org/numpy/ticket/2187
> http://projects.scipy.org/numpy/ticket/2185
> http://projects.scipy.org/numpy/ticket/2066
> http://projects.scipy.org/numpy/ticket/1588
> http://projects.scipy.org/numpy/ticket/2076
> http://projects.scipy.org/numpy/ticket/2101
> http://projects.scipy.org/numpy/ticket/2108
> http://projects.scipy.org/numpy/ticket/2150
> http://projects.scipy.org/numpy/ticket/2189
>
> I would like to thank Ralf for a lot of help with creating binaries
> and other help for this release.
>
> Cheers,
> Ondrej
>
>
>
At http://docs.scipy.org/doc/numpy/contents.html, it looks like the TOC
tree is a bit messed up.  For example, I see that masked arrays are listed
multiple times, and I think some of the sub-entries for masked arrays show
up multiple times within an entry for masked arrays.  Some of the bullets
appear as ">" instead of dots.

Don't know what version that page is generated from, but we might want to
double-check that 1.7.0's docs don't have the same problem.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] broadcasting question

2012-08-30 Thread Benjamin Root
On Thursday, August 30, 2012, Neal Becker wrote:

> I think this should be simple, but I'm drawing a blank
>
> I have 2 2d matrixes
>
> Matrix A has indexes (i, symbol)
> Matrix B has indexes (state, symbol)
>
> I combined them into a 3d matrix:
>
> C = A[:,newaxis,:] + B[newaxis,:,:]
> where C has indexes (i, state, symbol)
>
> That works fine.
>
> Now suppose I want to omit B (for debug), like:
>
> C = A[:,newaxis,:]
>
> In other words, all I want is to add a dimension into A and force it to
> broadcast along that axis.  How do I do that?
>
>
np.tile would help you there, I think.
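
For example (a sketch, with nstates standing in for B's first dimension):

C = np.tile(A[:, np.newaxis, :], (1, nstates, 1))  # shape (i, nstates, symbol)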


Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?

2012-09-07 Thread Benjamin Root
An issue just reported on the matplotlib-users list involved a user who ran
out of memory while attempting to do an imshow() on a large array.  While
this wouldn't be totally unexpected, the user's traceback shows that they
ran out of memory before any actual building of the image occurred.  Memory
usage sky-rocketed when imshow() attempted to determine the min and max of
the image.  The input data was a masked array, and it appears that the
implementation of min() for masked arrays goes something like this
(paraphrasing here):

obj.filled(inf).min()

The idea is that any masked element is set to the largest possible value
for their dtype in a copied array of itself, and then a min() is performed
on that copied array.  I am assuming that max() does the same thing.

Can this be done differently/more efficiently?  If the "filled" approach
has to be done, maybe it would be a good idea to make the copy in chunks
instead of all at once?  Ideally, it would be nice to avoid the copying
altogether and utilize some of the special iterators that Mark Wiebe
created last year.
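
For illustration, a rough sketch of the chunked idea (not what numpy.ma
actually does; names and chunk size are arbitrary, and it assumes a float
dtype with at least one unmasked element):

import numpy as np

def chunked_masked_min(m, chunk=1 << 20):
    flat = m.ravel()
    best = np.inf
    for i in xrange(0, flat.size, chunk):
        # fill only a small temporary copy at a time
        best = min(best, flat[i:i + chunk].filled(np.inf).min())
    return best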

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-17 Thread Benjamin Root
Consider the following code:

import numpy as np
a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
a *= float(255) / 15

In v1.6.x, this yields:
array([17, 34, 51, 68, 85], dtype=int16)

But in master, this throws an exception about failing to cast via same_kind.

Note that numpy was smart about this operation before, consider:
a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
a *= float(128) / 256

yields:
array([0, 1, 1, 2, 2], dtype=int16)

Of course, this is different than if one does it in a non-in-place manner:
np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5

which yields an array with floating point dtype in both versions.  I can
appreciate the arguments for preventing this kind of implicit casting
between non-same_kind dtypes, but I argue that because the operation is
in-place, then I (as the programmer) am explicitly stating that I desire to
utilize the current array to store the results of the operation, dtype and
all.  Obviously, we can't completely turn off this rule (for example, an
in-place addition between integer array and a datetime64 makes no sense),
but surely there is some sort of happy medium that would allow these sort
of operations to take place?

Lastly, if it is determined that it is desirable to allow in-place
operations to continue working like they have before, I would like to see
such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
as matplotlib, where this issue was first found) would have to change their
code anyway just to be compatible with numpy.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Benjamin Root
On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris  wrote:

>
>
> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote:
>
>>
>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>
>> > Consider the following code:
>> >
>> > import numpy as np
>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>> > a *= float(255) / 15
>> >
>> > In v1.6.x, this yields:
>> > array([17, 34, 51, 68, 85], dtype=int16)
>> >
>> > But in master, this throws an exception about failing to cast via
>> same_kind.
>> >
>> > Note that numpy was smart about this operation before, consider:
>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>> > a *= float(128) / 256
>>
>> > yields:
>> > array([0, 1, 1, 2, 2], dtype=int16)
>> >
>> > Of course, this is different than if one does it in a non-in-place
>> manner:
>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>> >
>> > which yields an array with floating point dtype in both versions.  I
>> can appreciate the arguments for preventing this kind of implicit casting
>> between non-same_kind dtypes, but I argue that because the operation is
>> in-place, then I (as the programmer) am explicitly stating that I desire to
>> utilize the current array to store the results of the operation, dtype and
>> all.  Obviously, we can't completely turn off this rule (for example, an
>> in-place addition between integer array and a datetime64 makes no sense),
>> but surely there is some sort of happy medium that would allow these sort
>> of operations to take place?
>> >
>> > Lastly, if it is determined that it is desirable to allow in-place
>> operations to continue working like they have before, I would like to see
>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
>> as matplotlib, where this issue was first found) would have to change their
>> code anyway just to be compatible with numpy.
>>
>> I agree that in-place operations should allow different casting rules.
>>  There are different opinions on this, of course, but generally this is how
>> NumPy has worked in the past.
>>
>> We did decide to change the default casting rule to "same_kind" but
>> making an exception for in-place seems reasonable.
>>
>
> I think that in these cases same_kind will flag what are most likely
> programming errors and sloppy code. It is easy to be explicit and doing so
> will make the code more readable because it will be immediately obvious
> what the multiplicand is without the need to recall what the numpy casting
> rules are in this exceptional case. IISTR several mentions of this before
> (Gael?), and in some of those cases it turned out that bugs were being
> turned up. Catching bugs with minimal effort is a good thing.
>
> Chuck
>
>
True, it is quite likely to be a programming error, but then again, there
are many cases where it isn't.  Is the problem strictly that we are trying
to downcast the float to an int, or is it that we are trying to downcast to
a lower precision?  Is there a way for one to explicitly relax the
same_kind restriction?

Thanks,
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy.ma.MaskedArray.min() makes a copy?

2012-09-18 Thread Benjamin Root
On Fri, Sep 7, 2012 at 12:05 PM, Nathaniel Smith  wrote:

> On 7 Sep 2012 14:38, "Benjamin Root"  wrote:
> >
> > An issue just reported on the matplotlib-users list involved a user who
> ran out of memory while attempting to do an imshow() on a large array.
> While this wouldn't be totally unexpected, the user's traceback shows that
> they ran out of memory before any actual building of the image occurred.
> Memory usage sky-rocketed when imshow() attempted to determine the min and
> max of the image.  The input data was a masked array, and it appears that
> the implementation of min() for masked arrays goes something like this
> (paraphrasing here):
> >
> > obj.filled(inf).min()
> >
> > The idea is that any masked element is set to the largest possible value
> for their dtype in a copied array of itself, and then a min() is performed
> on that copied array.  I am assuming that max() does the same thing.
> >
> > Can this be done differently/more efficiently?  If the "filled" approach
> has to be done, maybe it would be a good idea to make the copy in chunks
> instead of all at once?  Ideally, it would be nice to avoid the copying
> altogether and utilize some of the special iterators that Mark Wiebe
> created last year.
>
> I think what you're looking for is where= support for ufunc.reduce. This
> isn't implemented yet but at least it's straightforward in principle...
> otherwise I don't know anything better than reimplementing .min() by hand.
>
> -n
>
>
Yes, it was the where= support that I was thinking of.  I take it that it
was pulled out of the 1.7 branch with the rest of the NA stuff?
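
(For the record, once where= support for reductions lands, I'd expect the
call to look roughly like this speculative sketch -- not a working 1.7 API:
np.minimum.reduce(m.data, where=~m.mask, initial=np.inf).)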

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Benjamin Root
On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris  wrote:

>
>
> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root  wrote:
>
>>
>>
>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>>
>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant wrote:
>>>
>>>>
>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>>>
>>>> > Consider the following code:
>>>> >
>>>> > import numpy as np
>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>> > a *= float(255) / 15
>>>> >
>>>> > In v1.6.x, this yields:
>>>> > array([17, 34, 51, 68, 85], dtype=int16)
>>>> >
>>>> > But in master, this throws an exception about failing to cast via
>>>> same_kind.
>>>> >
>>>> > Note that numpy was smart about this operation before, consider:
>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>> > a *= float(128) / 256
>>>>
>>>> > yields:
>>>> > array([0, 1, 1, 2, 2], dtype=int16)
>>>> >
>>>> > Of course, this is different than if one does it in a non-in-place
>>>> manner:
>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>>>> >
>>>> > which yields an array with floating point dtype in both versions.  I
>>>> can appreciate the arguments for preventing this kind of implicit casting
>>>> between non-same_kind dtypes, but I argue that because the operation is
>>>> in-place, then I (as the programmer) am explicitly stating that I desire to
>>>> utilize the current array to store the results of the operation, dtype and
>>>> all.  Obviously, we can't completely turn off this rule (for example, an
>>>> in-place addition between integer array and a datetime64 makes no sense),
>>>> but surely there is some sort of happy medium that would allow these sort
>>>> of operations to take place?
>>>> >
>>>> > Lastly, if it is determined that it is desirable to allow in-place
>>>> operations to continue working like they have before, I would like to see
>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
>>>> as matplotlib, where this issue was first found) would have to change their
>>>> code anyway just to be compatible with numpy.
>>>>
>>>> I agree that in-place operations should allow different casting rules.
>>>>  There are different opinions on this, of course, but generally this is how
>>>> NumPy has worked in the past.
>>>>
>>>> We did decide to change the default casting rule to "same_kind" but
>>>> making an exception for in-place seems reasonable.
>>>>
>>>
>>> I think that in these cases same_kind will flag what are most likely
>>> programming errors and sloppy code. It is easy to be explicit and doing so
>>> will make the code more readable because it will be immediately obvious
>>> what the multiplicand is without the need to recall what the numpy casting
>>> rules are in this exceptional case. IISTR several mentions of this before
>>> (Gael?), and in some of those cases it turned out that bugs were being
>>> turned up. Catching bugs with minimal effort is a good thing.
>>>
>>> Chuck
>>>
>>>
>> True, it is quite likely to be a programming error, but then again, there
>> are many cases where it isn't.  Is the problem strictly that we are trying
>> to downcast the float to an int, or is it that we are trying to downcast to
>> a lower precision?  Is there a way for one to explicitly relax the
>> same_kind restriction?
>>
>
> I think the problem is down casting across kinds, with the result that
> floats are truncated and the imaginary parts of imaginaries might be
> discarded. That is, the value, not just the precision, of the rhs changes.
> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an
> integer.
>
> It is true that this forces downstream to code up to a higher standard,
> but I don't see that as a bad thing, especially if it exposes bugs. And it
> isn't difficult to fix.
>
> Chuck
>
>
Mind you, in my case, casting the rhs as an integer before doing the
multiplication would be a bug, since our value for the rhs is usually
between zero and one.  Multiplying first by the integer numerator before
dividing by the integer denominator would likely cause issues with
overflowing the 16-bit integer.
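
To spell that out with a toy example:

import numpy as np

a = np.array([300], dtype=np.int16)
a * 255                 # wraps around: 300 * 255 = 76500 -> 10964 in int16
(a * 255) // 15         # -> 730, not the correct 5100
a * (float(255) / 15)   # -> 5100.0, correct (as float64)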

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Benjamin Root
On Tue, Sep 18, 2012 at 3:19 PM, Ralf Gommers wrote:

>
>
> On Tue, Sep 18, 2012 at 9:13 PM, Benjamin Root  wrote:
>
>>
>>
>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root  wrote:
>>>
>>>>
>>>>
>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <
>>>> charlesr.har...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>>>>>
>>>>>> > Consider the following code:
>>>>>> >
>>>>>> > import numpy as np
>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>> > a *= float(255) / 15
>>>>>> >
>>>>>> > In v1.6.x, this yields:
>>>>>> > array([17, 34, 51, 68, 85], dtype=int16)
>>>>>> >
>>>>>> > But in master, this throws an exception about failing to cast via
>>>>>> same_kind.
>>>>>> >
>>>>>> > Note that numpy was smart about this operation before, consider:
>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>> > a *= float(128) / 256
>>>>>>
>>>>>> > yields:
>>>>>> > array([0, 1, 1, 2, 2], dtype=int16)
>>>>>> >
>>>>>> > Of course, this is different than if one does it in a non-in-place
>>>>>> manner:
>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>>>>>> >
>>>>>> > which yields an array with floating point dtype in both versions.
>>>>>>  I can appreciate the arguments for preventing this kind of implicit
>>>>>> casting between non-same_kind dtypes, but I argue that because the
>>>>>> operation is in-place, then I (as the programmer) am explicitly stating
>>>>>> that I desire to utilize the current array to store the results of the
>>>>>> operation, dtype and all.  Obviously, we can't completely turn off this
>>>>>> rule (for example, an in-place addition between integer array and a
>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium
>>>>>> that would allow these sort of operations to take place?
>>>>>> >
>>>>>> > Lastly, if it is determined that it is desirable to allow in-place
>>>>>> operations to continue working like they have before, I would like to see
>>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
>>>>>> as matplotlib, where this issue was first found) would have to change 
>>>>>> their
>>>>>> code anyway just to be compatible with numpy.
>>>>>>
>>>>>> I agree that in-place operations should allow different casting
>>>>>> rules.  There are different opinions on this, of course, but generally 
>>>>>> this
>>>>>> is how NumPy has worked in the past.
>>>>>>
>>>>>> We did decide to change the default casting rule to "same_kind" but
>>>>>> making an exception for in-place seems reasonable.
>>>>>>
>>>>>
>>>>> I think that in these cases same_kind will flag what are most likely
>>>>> programming errors and sloppy code. It is easy to be explicit and doing so
>>>>> will make the code more readable because it will be immediately obvious
>>>>> what the multiplicand is without the need to recall what the numpy casting
>>>>> rules are in this exceptional case. IISTR several mentions of this before
>>>>> (Gael?), and in some of those cases it turned out that bugs were being
>>>>> turned up. Catching bugs with minimal effort is a good thing.
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>> True, it is quite likely to be a programming error, but then again,
>>>> there are many cases where it isn't.  Is the problem strictly that we are
>>>> trying to downcast the float to an int, or is it that we are trying to
>>>> downcast to a lower precision?  Is there a way for one to explicitly relax
>>>> the same_kind restriction?
>>>>
>>>
>>> I think the problem is down casting across kinds, with the result that
>>> floats are truncated and the imaginary parts of imaginaries might be
>>> discarded. That is, the value, not just the precision, of the rhs changes.
>>> So I'd favor an explicit cast in code like this, i.e., cast the rhs to an
>>> integer.
>>>
>>> It is true that this forces downstream to code up to a higher standard,
>>> but I don't see that as a bad thing, especially if it exposes bugs. And it
>>> isn't difficult to fix.
>>>
>>> Chuck
>>>
>>>
>> Mind you, in my case, casting the rhs as an integer before doing the
>> multiplication would be a bug, since our value for the rhs is usually
>> between zero and one.  Multiplying first by the integer numerator before
>> dividing by the integer denominator would likely cause issues with
>> overflowing the 16 bit integer.
>>
>
> Then you'd have to do
>
>
> >>> a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
> >>> np.multiply(a, 0.5, out=a, casting="unsafe")
>
> array([0, 1, 1, 2, 2], dtype=int16)
>
> Ralf
>
>
That is exactly what I am looking for!  When did the "casting" kwarg come
about?  I am unfamiliar with it.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Benjamin Root
On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris  wrote:

>
>
> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root  wrote:
>
>>
>>
>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris <
>> charlesr.har...@gmail.com> wrote:
>>
>>>
>>>
>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root  wrote:
>>>
>>>>
>>>>
>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <
>>>> charlesr.har...@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant 
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>>>>>
>>>>>> > Consider the following code:
>>>>>> >
>>>>>> > import numpy as np
>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>> > a *= float(255) / 15
>>>>>> >
>>>>>> > In v1.6.x, this yields:
>>>>>> > array([17, 34, 51, 68, 85], dtype=int16)
>>>>>> >
>>>>>> > But in master, this throws an exception about failing to cast via
>>>>>> same_kind.
>>>>>> >
>>>>>> > Note that numpy was smart about this operation before, consider:
>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>> > a *= float(128) / 256
>>>>>>
>>>>>> > yields:
>>>>>> > array([0, 1, 1, 2, 2], dtype=int16)
>>>>>> >
>>>>>> > Of course, this is different than if one does it in a non-in-place
>>>>>> manner:
>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>>>>>> >
>>>>>> > which yields an array with floating point dtype in both versions.
>>>>>>  I can appreciate the arguments for preventing this kind of implicit
>>>>>> casting between non-same_kind dtypes, but I argue that because the
>>>>>> operation is in-place, then I (as the programmer) am explicitly stating
>>>>>> that I desire to utilize the current array to store the results of the
>>>>>> operation, dtype and all.  Obviously, we can't completely turn off this
>>>>>> rule (for example, an in-place addition between integer array and a
>>>>>> datetime64 makes no sense), but surely there is some sort of happy medium
>>>>>> that would allow these sort of operations to take place?
>>>>>> >
>>>>>> > Lastly, if it is determined that it is desirable to allow in-place
>>>>>> operations to continue working like they have before, I would like to see
>>>>>> such a fix in v1.7 because if it isn't in 1.7, then other libraries (such
>>>>>> as matplotlib, where this issue was first found) would have to change 
>>>>>> their
>>>>>> code anyway just to be compatible with numpy.
>>>>>>
>>>>>> I agree that in-place operations should allow different casting
>>>>>> rules.  There are different opinions on this, of course, but generally 
>>>>>> this
>>>>>> is how NumPy has worked in the past.
>>>>>>
>>>>>> We did decide to change the default casting rule to "same_kind" but
>>>>>> making an exception for in-place seems reasonable.
>>>>>>
>>>>>
>>>>> I think that in these cases same_kind will flag what are most likely
>>>>> programming errors and sloppy code. It is easy to be explicit and doing so
>>>>> will make the code more readable because it will be immediately obvious
>>>>> what the multiplicand is without the need to recall what the numpy casting
>>>>> rules are in this exceptional case. IISTR several mentions of this before
>>>>> (Gael?), and in some of those cases it turned out that bugs were being
>>>>> turned up. Catching bugs with minimal effort is a good thing.
>>>>>
>>>>> Chuck
>>>>>
>>>>>
>>>> True, it is quite likely to be a programming error, but then again,
>>>> there are many cases where it isn't.  Is the problem strictly that we are
>>>> trying to downcast the float to an int, or is it that we 

Re: [Numpy-discussion] Regression: in-place operations (possibly intentional)

2012-09-18 Thread Benjamin Root
On Tue, Sep 18, 2012 at 4:42 PM, Charles R Harris  wrote:

>
>
> On Tue, Sep 18, 2012 at 2:33 PM, Travis Oliphant wrote:
>
>>
>> On Sep 18, 2012, at 2:44 PM, Charles R Harris wrote:
>>
>>
>>
>> On Tue, Sep 18, 2012 at 1:35 PM, Benjamin Root  wrote:
>>
>>>
>>>
>>> On Tue, Sep 18, 2012 at 3:25 PM, Charles R Harris <
>>> charlesr.har...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Sep 18, 2012 at 1:13 PM, Benjamin Root  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Tue, Sep 18, 2012 at 2:47 PM, Charles R Harris <
>>>>> charlesr.har...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 18, 2012 at 11:39 AM, Benjamin Root wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 17, 2012 at 9:33 PM, Charles R Harris <
>>>>>>> charlesr.har...@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Sep 17, 2012 at 3:40 PM, Travis Oliphant <
>>>>>>>> tra...@continuum.io> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sep 17, 2012, at 8:42 AM, Benjamin Root wrote:
>>>>>>>>>
>>>>>>>>> > Consider the following code:
>>>>>>>>> >
>>>>>>>>> > import numpy as np
>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>>>>> > a *= float(255) / 15
>>>>>>>>> >
>>>>>>>>> > In v1.6.x, this yields:
>>>>>>>>> > array([17, 34, 51, 68, 85], dtype=int16)
>>>>>>>>> >
>>>>>>>>> > But in master, this throws an exception about failing to cast
>>>>>>>>> via same_kind.
>>>>>>>>> >
>>>>>>>>> > Note that numpy was smart about this operation before, consider:
>>>>>>>>> > a = np.array([1, 2, 3, 4, 5], dtype=np.int16)
>>>>>>>>> > a *= float(128) / 256
>>>>>>>>>
>>>>>>>>> > yields:
>>>>>>>>> > array([0, 1, 1, 2, 2], dtype=int16)
>>>>>>>>> >
>>>>>>>>> > Of course, this is different than if one does it in a
>>>>>>>>> non-in-place manner:
>>>>>>>>> > np.array([1, 2, 3, 4, 5], dtype=np.int16) * 0.5
>>>>>>>>> >
>>>>>>>>> > which yields an array with floating point dtype in both
>>>>>>>>> versions.  I can appreciate the arguments for preventing this kind of
>>>>>>>>> implicit casting between non-same_kind dtypes, but I argue that 
>>>>>>>>> because the
>>>>>>>>> operation is in-place, then I (as the programmer) am explicitly 
>>>>>>>>> stating
>>>>>>>>> that I desire to utilize the current array to store the results of the
>>>>>>>>> operation, dtype and all.  Obviously, we can't completely turn off 
>>>>>>>>> this
>>>>>>>>> rule (for example, an in-place addition between integer array and a
>>>>>>>>> datetime64 makes no sense), but surely there is some sort of happy 
>>>>>>>>> medium
>>>>>>>>> that would allow these sort of operations to take place?
>>>>>>>>> >
>>>>>>>>> > Lastly, if it is determined that it is desirable to allow
>>>>>>>>> in-place operations to continue working like they have before, I 
>>>>>>>>> would like
>>>>>>>>> to see such a fix in v1.7 because if it isn't in 1.7, then other 
>>>>>>>>> libraries
>>>>>>>>> (such as matplotlib, where this issue was first found) would have to 
>>>>>>>>> change
>>>>>>>>> their code anyway just to be compatible with numpy.
>>>>>>>>>
>>>>>>>>> I agree that in-place operatio

Re: [Numpy-discussion] specifying numpy as dependency in your project, install_requires

2012-09-21 Thread Benjamin Root
On Fri, Sep 21, 2012 at 4:19 PM, Travis Oliphant wrote:

>
> On Sep 21, 2012, at 3:13 PM, Ralf Gommers wrote:
>
> Hi,
>
> An issue I keep running into is that packages use:
> install_requires = ["numpy"]
> or
> install_requires = ['numpy >= 1.6']
>
> in their setup.py. This simply doesn't work a lot of the time. I actually
> filed a bug against patsy for that (
> https://github.com/pydata/patsy/issues/5), but Nathaniel is right that it
> would be better to bring it up on this list.
>
> The problem is that if you use pip, it doesn't detect numpy (may work
> better if you had installed numpy with setuptools) and tries to
> automatically install or upgrade numpy. That won't work if users don't have
> the right compiler. Just as bad would be that it does work, and the user
> didn't want to upgrade for whatever reason.
>
> This isn't just my problem; at Wes' pandas tutorial at EuroScipy I saw
> other people have the exact same problem. My recommendation would be to not
> use install_requires for numpy, but simply do something like this in
> setup.py:
>
> try:
> import numpy
> except ImportError:
> raise ImportError("my_package requires numpy")
>
> or
>
> try:
> from numpy.version import short_version as npversion
> except ImportError:
> raise ImportError("my_package requires numpy")
> if npversion < '1.6':
>raise ImportError("Numpy version is %s; required is version >= 1.6"
> % npversion)
>
> Any objections, better ideas? Is there a good place to put it in the numpy
> docs somewhere?
>
>
> I agree.   I would recommend against using install requires.
>
> -Travis
>
>
>
Why?  I have personally never had an issue with this.  The only way I could
imagine that this wouldn't work is if numpy was installed via some other
means and there wasn't an entry in the easy-install.pth (or whatever
equivalent pip uses).  If pip is having a problem detecting numpy, then
that is a bug that needs fixing somewhere.

As for packages getting updated unintentionally, easy_install and pip both
require an argument to upgrade any existing packages (I think -U), so I am
not sure how you are running into such a situation.

I have found install_requires to be a powerful feature in my setup.py
scripts, and I have seen no reason to discourage it.  Perhaps I am the only
one?

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] how to pipe into numpy arrays?

2012-10-24 Thread Benjamin Root
On Wed, Oct 24, 2012 at 3:00 PM, Michael Aye  wrote:

> As numpy.fromfile seems to require full file object functionalities
> like seek, I can not use it with the sys.stdin pipe.
> So how could I stream a binary pipe directly into numpy?
> I can imagine storing the data in a string and use StringIO but the
> files are 3.6 GB large, just the binary, and that will most likely be
> much more as a string object.
> Reading binary files on disk is NOT the problem, I would like to avoid
> the temporary file if possible.
>
>
I haven't tried this myself, but there is a numpy.frombuffer() function as
well.  Maybe that could be used here?
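
Something like this might work (untested sketch; it assumes the stream is a
whole number of float64 items):

import sys
import numpy as np

chunks = []
while True:
    buf = sys.stdin.read(8 * 65536)   # read the pipe in fixed-size pieces
    if not buf:
        break
    chunks.append(np.frombuffer(buf, dtype=np.float64))
arr = np.concatenate(chunks)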

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array

2012-10-29 Thread Benjamin Root
This error started showing up in the test suite for mpl when using numpy
master.

AttributeError: incompatible shape for a non-contiguous array

The tracebacks all point back to various code points where we are trying to
set the shape of an array, e.g.,

offsets.shape = (-1, 2)

Those lines haven't changed in a couple of years, and were intended to be
done this way to raise an error when reshaping would result in a copy
(since we needed to use the original in those places).  I don't know how
these arrays have become non-contiguous, so I am wondering if there was
some sort of attribute that got screwed up somewhere (maybe with views?)
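
For reference, the error itself is easy to trigger when a copy would be
required (sketch):

import numpy as np

a = np.arange(8).reshape(2, 4).T   # a transposed view, not C-contiguous
try:
    a.shape = (8,)                 # in-place reshape refuses to copy
except AttributeError as e:
    print e                        # incompatible shape for a non-contiguous array
b = a.reshape(8)                   # reshape() silently returns a copy instead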

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array

2012-10-29 Thread Benjamin Root
On Mon, Oct 29, 2012 at 10:33 AM, Sebastian Berg  wrote:

> Hey,
>
> On Mon, 2012-10-29 at 09:54 -0400, Benjamin Root wrote:
> > This error started showing up in the test suite for mpl when using
> > numpy master.
> >
> > AttributeError: incompatible shape for a non-contiguous array
> >
> > The tracebacks all point back to various code points where we are
> > trying to set the shape of an array, e.g.,
> >
> > offsets.shape = (-1, 2)
> >
> Could you give a hint about these arrays' history (how they were created)
> and maybe their .shape/.strides?  Sounds like the array is not contiguous when
> it is expected to be, or the attribute setting itself fails in some
> corner cases on master?
>
> Regards,
>
> Sebastian
>
>
The original reporter of the bug dug into the commit list and suspects it
was this one:

https://github.com/numpy/numpy/commit/02ebf8b3e7674a6b8a06636feaa6c761fcdf4e2d

However, it might be earlier than that (he is currently doing a clean
rebuild to make sure).

As for the history:

offsets = np.asanyarray(offsets)
offsets.shape = (-1, 2) # Make it Nx2

Where "offsets" comes in from (possibly) user-supplied data.  Nothing
really all that special.  I will see if I can get stride information.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Regression in mpl: AttributeError: incompatible shape for a non-contiguous array

2012-10-29 Thread Benjamin Root
On Mon, Oct 29, 2012 at 11:04 AM, Patrick Marsh wrote:

> Turns out it isn't the commit I thought it was. I'm currently going
> through a git bisect to track down the actual commit that introduced this
> bug. I'll post back when I've found it.
>
>
>  PTM
> ---
> Patrick Marsh
> Ph.D. Candidate / Liaison to the HWT
> School of Meteorology / University of Oklahoma
> Cooperative Institute for Mesoscale Meteorological Studies
> National Severe Storms Laboratory
> http://www.patricktmarsh.com
>
>
>
> On Mon, Oct 29, 2012 at 9:43 AM, Benjamin Root  wrote:
>
>>
>>
>> On Mon, Oct 29, 2012 at 10:33 AM, Sebastian Berg <
>> sebast...@sipsolutions.net> wrote:
>>
>>> Hey,
>>>
>>> On Mon, 2012-10-29 at 09:54 -0400, Benjamin Root wrote:
>>> > This error started showing up in the test suite for mpl when using
>>> > numpy master.
>>> >
>>> > AttributeError: incompatible shape for a non-contiguous array
>>> >
>>> > The tracebacks all point back to various code points where we are
>>> > trying to set the shape of an array, e.g.,
>>> >
>>> > offsets.shape = (-1, 2)
>>> >
>>> Could you give a hint about these arrays' history (how they were created)
>>> and maybe their .shape/.strides?  Sounds like the array is not contiguous when
>>> it is expected to be, or the attribute setting itself fails in some
>>> corner cases on master?
>>>
>>> Regards,
>>>
>>> Sebastian
>>>
>>>
>> The original reporter of the bug dug into the commit list and suspects it
>> was this one:
>>
>>
>> https://github.com/numpy/numpy/commit/02ebf8b3e7674a6b8a06636feaa6c761fcdf4e2d
>>
>> However, it might be earlier than that (he is currently doing a clean
>> rebuild to make sure).
>>
>> As for the history:
>>
>> offsets = np.asanyarray(offsets)
>> offsets.shape = (-1, 2) # Make it Nx2
>>
>> Where "offsets" comes in from (possibly) user-supplied data.  Nothing
>> really all that special.  I will see if I can get stride information.
>>
>> Ben Root
>>
>>
Further digging reveals that the code fails when the array is originally
1-D.  I had an array with shape (2,) and stride (8,).  The reshaping should
result in a shape of (1, 2).
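
That is, boiled down (sketch):

offsets = np.asanyarray([1.0, 2.0])   # shape (2,), strides (8,), contiguous
offsets.shape = (-1, 2)               # should become (1, 2) without a copy,
                                      # but fails on current master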

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Simple question about scatter plot graph

2012-10-31 Thread Benjamin Root
On Wednesday, October 31, 2012, wrote:

> On Wed, Oct 31, 2012 at 8:59 PM, klo uo wrote:
> > Thanks for your reply
> >
> > I suppose, variable length signals are split on equal parts and dominant
> > harmonic is extracted. Then scatter plot shows this pattern, which has
> some
> > low correlation, but I can't abstract what could be concluded from grid
> > pattern, as I lack statistical knowledge.
> > Maybe it's saying that data is quantized, which can't be easily seen from
> > single sample bar chart, but perhaps scatter plot suggests that? That's
> only
> > my wild guess
>
> http://pandasplotting.blogspot.ca/2012/06/lag-plot.html
> In general you would see a lag autocorrelation structure in the plot.
>
> My guess is that even if there is a pattern in your data we might not
> see it because we don't see plots that are plotted on top of each
> other. We only see the support of the y_t, y_{t+1} transition (points
> that are at least once in the sample), but not the frequencies (or
> conditional distribution).
>
> If that's the case, then
> reduce alpha level so many points on top of each other are darker, or
> colorcode the histogram for each y_t: bincount for each y_t and
> normalize, or use np.histogram directly for each y_t, then assign to
> each point a colorscale depending on its frequency.
>
> Did you calculate the correlation? (But maybe linear correlation won't
> show much.)
>
> Josef


The answer is hexbin() in matplotlib when you have many points lying on or
near each other.
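
For example (sketch):

import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(10000)
y = x + 0.5 * np.random.randn(10000)
plt.hexbin(x, y, gridsize=40)   # bins overlapping points into hexagonal cells
plt.colorbar()                  # color encodes the count per cell
plt.show()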

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2012-11-12 Thread Benjamin Root
On Monday, November 12, 2012, Olivier Delalleau wrote:

> 2012/11/12 Nathaniel Smith 
>
>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett 
>> wrote:
>> > Hi,
>> >
>> > I wanted to check that everyone knows about and is happy with the
>> > scalar casting changes from 1.6.0.
>> >
>> > Specifically, the rules for (array, scalar) casting have changed such
>> > that the resulting dtype depends on the _value_ of the scalar.
>> >
>> > Mark W has documented these changes here:
>> >
>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>> >
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
>> >
>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
>> >
>> > Specifically, as of 1.6.0:
>> >
>> > In [19]: arr = np.array([1.], dtype=np.float32)
>> >
>> > In [20]: (arr + (2**16-1)).dtype
>> > Out[20]: dtype('float32')
>> >
>> > In [21]: (arr + (2**16)).dtype
>> > Out[21]: dtype('float64')
>> >
>> > In [25]: arr = np.array([1.], dtype=np.int8)
>> >
>> > In [26]: (arr + 127).dtype
>> > Out[26]: dtype('int8')
>> >
>> > In [27]: (arr + 128).dtype
>> > Out[27]: dtype('int16')
>> >
>> > There's discussion about the changes here:
>> >
>> >
>> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
>> > http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
>> >
>> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
>> >
>> > It seems to me that this change is hard to explain, and does what you
>> > want only some of the time, making it a false friend.
>>
>> The old behaviour was that in these cases, the scalar was always cast
>> to the type of the array, right? So
>>   np.array([1], dtype=np.int8) + 256
>> returned 1? Is that the behaviour you prefer?
>>
>> I agree that the 1.6 behaviour is surprising and somewhat
>> inconsistent. There are many places where you can get an overflow in
>> numpy, and in all the other cases we just let the overflow happen. And
>> in fact you can still get an overflow with arr + scalar operations, so
>> this doesn't really fix anything.
>>
>> I find the specific handling of unsigned -> signed and float32 ->
>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
>> representable as a float32, but it doesn't *overflow*, it just gives
>> you 2.0**16... if I'm using float32 then I presumably don't care that
>> much about exact representability, so it's surprising that numpy is
>> working to enforce it, and definitely a separate decision from what to
>> do about overflow.)
>>
>> None of those threads seem to really get into the question of what the
>> best behaviour here *is*, though.
>>
>> Possibly the most defensible choice is to treat ufunc(arr, scalar)
>> operations as performing an implicit cast of the scalar to arr's
>> dtype, and using the standard implicit casting rules -- which I think
>> means, raising an error if !can_cast(scalar, arr.dtype,
>> casting="safe")
>
>
> I like this suggestion. It may break some existing code, but I think it'd
> be for the best. The current behavior can be very confusing.
>
> -=- Olivier
>


"break some existing code"

I really should set up an email filter for this phrase and have it send
back an email automatically: "Are you nuts?!"

We just resolved an issue where the "safe" casting rule unexpectedly broke
existing code with regards to unplaced operations.  The solution was to
warn about the change in the upcoming release and to throw errors in a
later release.  Playing around with fundamental things like this needs to be
done methodically and carefully.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2012-11-12 Thread Benjamin Root
On Monday, November 12, 2012, Benjamin Root wrote:

>
>
> On Monday, November 12, 2012, Olivier Delalleau wrote:
>
>> 2012/11/12 Nathaniel Smith 
>>
>>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett 
>>> wrote:
>>> > Hi,
>>> >
>>> > I wanted to check that everyone knows about and is happy with the
>>> > scalar casting changes from 1.6.0.
>>> >
>>> > Specifically, the rules for (array, scalar) casting have changed such
>>> > that the resulting dtype depends on the _value_ of the scalar.
>>> >
>>> > Mark W has documented these changes here:
>>> >
>>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
>>> >
>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
>>> >
>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
>>> >
>>> > Specifically, as of 1.6.0:
>>> >
>>> > In [19]: arr = np.array([1.], dtype=np.float32)
>>> >
>>> > In [20]: (arr + (2**16-1)).dtype
>>> > Out[20]: dtype('float32')
>>> >
>>> > In [21]: (arr + (2**16)).dtype
>>> > Out[21]: dtype('float64')
>>> >
>>> > In [25]: arr = np.array([1.], dtype=np.int8)
>>> >
>>> > In [26]: (arr + 127).dtype
>>> > Out[26]: dtype('int8')
>>> >
>>> > In [27]: (arr + 128).dtype
>>> > Out[27]: dtype('int16')
>>> >
>>> > There's discussion about the changes here:
>>> >
>>> >
>>> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
>>> >
>>> http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
>>> >
>>> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
>>> >
>>> > It seems to me that this change is hard to explain, and does what you
>>> > want only some of the time, making it a false friend.
>>>
>>> The old behaviour was that in these cases, the scalar was always cast
>>> to the type of the array, right? So
>>>   np.array([1], dtype=np.int8) + 256
>>> returned 1? Is that the behaviour you prefer?
>>>
>>> I agree that the 1.6 behaviour is surprising and somewhat
>>> inconsistent. There are many places where you can get an overflow in
>>> numpy, and in all the other cases we just let the overflow happen. And
>>> in fact you can still get an overflow with arr + scalar operations, so
>>> this doesn't really fix anything.
>>>
>>> I find the specific handling of unsigned -> signed and float32 ->
>>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
>>> representable as a float32, but it doesn't *overflow*, it just gives
>>> you 2.0**16... if I'm using float32 then I presumably don't care that
>>> much about exact representability, so it's surprising that numpy is
>>> working to enforce it, and definitely a separate decision from what to
>>> do about overflow.)
>>>
>>> None of those threads seem to really get into the question of what the
>>> best behaviour here *is*, though.
>>>
>>> Possibly the most defensible choice is to treat ufunc(arr, scalar)
>>> operations as performing an implicit cast of the scalar to arr's
>>> dtype, and using the standard implicit casting rules -- which I think
>>> means, raising an error if !can_cast(scalar, arr.dtype,
>>> casting="safe")
>>
>>
>> I like this suggestion. It may break some existing code, but I think it'd
>> be for the best. The current behavior can be very confusing.
>>
>> -=- Olivier
>>
>
>
> "break some existing code"
>
> I really should set up an email filter for this phrase and have it send
> back an email automatically: "Are you nuts?!"
>
> We just resolved an issue where the "safe" casting rule unexpectedly broke
> existing code with regards to unplaced operations.  The solution was to
> warn about the change in the upcoming release and to throw errors in a
> later release.  Playing around with fundamental things like this needs to be
> done methodically and carefully.
>
> Cheers!
> Ben Root
>


Stupid autocorrect:  unplaced --> inplace
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment?

2012-11-12 Thread Benjamin Root
On Monday, November 12, 2012, Matthew Brett wrote:

> Hi,
>
> On Mon, Nov 12, 2012 at 8:15 PM, Benjamin Root  wrote:
> >
> >
> > On Monday, November 12, 2012, Olivier Delalleau wrote:
> >>
> >> 2012/11/12 Nathaniel Smith 
> >>>
> >>> On Mon, Nov 12, 2012 at 8:54 PM, Matthew Brett <
> matthew.br...@gmail.com>
> >>> wrote:
> >>> > Hi,
> >>> >
> >>> > I wanted to check that everyone knows about and is happy with the
> >>> > scalar casting changes from 1.6.0.
> >>> >
> >>> > Specifically, the rules for (array, scalar) casting have changed such
> >>> > that the resulting dtype depends on the _value_ of the scalar.
> >>> >
> >>> > Mark W has documented these changes here:
> >>> >
> >>> > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules
> >>> >
> >>> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html
> >>> >
> >>> >
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html
> >>> >
> >>> > Specifically, as of 1.6.0:
> >>> >
> >>> > In [19]: arr = np.array([1.], dtype=np.float32)
> >>> >
> >>> > In [20]: (arr + (2**16-1)).dtype
> >>> > Out[20]: dtype('float32')
> >>> >
> >>> > In [21]: (arr + (2**16)).dtype
> >>> > Out[21]: dtype('float64')
> >>> >
> >>> > In [25]: arr = np.array([1.], dtype=np.int8)
> >>> >
> >>> > In [26]: (arr + 127).dtype
> >>> > Out[26]: dtype('int8')
> >>> >
> >>> > In [27]: (arr + 128).dtype
> >>> > Out[27]: dtype('int16')
> >>> >
> >>> > There's discussion about the changes here:
> >>> >
> >>> >
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html
> >>> >
> >>> >
> http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html
> >>> >
> >>> > It seems to me that this change is hard to explain, and does what you
> >>> > want only some of the time, making it a false friend.
> >>>
> >>> The old behaviour was that in these cases, the scalar was always cast
> >>> to the type of the array, right? So
> >>>   np.array([1], dtype=np.int8) + 256
> >>> returned 1? Is that the behaviour you prefer?
> >>>
> >>> I agree that the 1.6 behaviour is surprising and somewhat
> >>> inconsistent. There are many places where you can get an overflow in
> >>> numpy, and in all the other cases we just let the overflow happen. And
> >>> in fact you can still get an overflow with arr + scalar operations, so
> >>> this doesn't really fix anything.
> >>>
> >>> I find the specific handling of unsigned -> signed and float32 ->
> >>> float64 upcasting confusing as well. (Sure, 2**16 isn't exactly
> >>> representable as a float32, but it doesn't *overflow*, it just gives
> >>> you 2.0**16... if I'm using float32 then I presumably don't care that
> >>> much about exact representability, so it's surprising that numpy is
> >>> working to enforce it, and definitely a separate decision from what to
> >>> do about overflow.)
> >>>
> >>> None of those threads seem to really get into the question of what the
> >>> best behaviour here *is*, though.
> >>>
> >>> Possibly the mo[...]
>
> Well, hold on though, I was asking earlier in the
> thread what we
> thought the behavior should be in 2.0 or, maybe better put, sometime in
> the future.
>
> If we know what we think the best answer is, and we think the best
> answer is worth shooting for, then we can try to think of sensible
> ways of getting there.
>
> I guess that's what Nathaniel and Olivier were thinking of but they
> can correct me if I'm wrong...
>
> Cheers,
>
> Matthew


I am fine with migrating to better solutions (I have yet to decide on this
current situation, though), but whatever change is adopted must go through
a deprecation process, which was my point.  Outright breaking of code as a
first step is the wrong choice, and I was merely nipping it in the bud.

Cheers!
Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] the fast way to loop over ndarray elements?

2012-11-17 Thread Benjamin Root
On Saturday, November 17, 2012, Chao YUE wrote:

> Dear all,
>
> I need to make a linear conversion of the 2D numpy array "data" from one
> interval to another; the approach is:
> I have two other lists, "base" & "target"; then I check, for each ndarray
> element "data[i,j]",
> if base[m] <= data[i,j] <= base[m+1], then it will be linearly converted
> to be in the interval of (target[m], target[m+1]),
> using another function called "lintrans".
>
>
> #The way I do is to loop each row and column of the 2D array, and finally
> loop the intervals constituted by base list:
>
> for row in range(data.shape[0]):
>     for col in range(data.shape[1]):
>         for i in range(len(base)-1):
>             if data[row,col] >= base[i] and data[row,col] <= base[i+1]:
>                 data[row,col] = lintrans(data[row,col],
>                                          (base[i], base[i+1]),
>                                          (target[i], target[i+1]))
>                 break  # use break to jump out of the loop, as the data
>                        # must be transferred ONLY ONCE
>
>
> Now the profiling result shows that most of the time has been used in this
> loop over the array ("plot_array_transg"),
> and less time in calling the linear transformation function "lintrans":
>
>    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>     18047    0.110    0.000    0.110    0.000  mathex.py:132(lintrans)
>         1   12.495   12.495   19.061   19.061  mathex.py:196(plot_array_transg)
>
>
> so is there any way I can speed up this loop?  Thanks for any suggestions!!
>
> best,
>
> Chao
>
>
If the values in base are ascending, you can use searchsorted() to find out
where values from data can be placed into base while maintaining order.
 Don't know if it is faster, but it would certainly be easier to read.

Cheers!
Ben Root
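
For concreteness, a vectorized sketch along these lines, assuming base is
ascending and that lintrans is the affine map between matching intervals;
unlike the loop, it clamps values outside [base[0], base[-1]] to the end
intervals (the function name is hypothetical):

    import numpy as np

    def lintrans_all(data, base, target):
        base = np.asarray(base, dtype=float)
        target = np.asarray(target, dtype=float)
        # index i of the interval such that base[i] <= x <= base[i+1]
        i = np.searchsorted(base, data, side="right") - 1
        i = np.clip(i, 0, len(base) - 2)
        frac = (data - base[i]) / (base[i + 1] - base[i])
        return target[i] + frac * (target[i + 1] - target[i])

When the target intervals join up continuously, this collapses to a single
call: np.interp(data, base, target).
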
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] float32 to float64 casting

2012-11-17 Thread Benjamin Root
On Saturday, November 17, 2012, Charles R Harris wrote:

>
>
> On Sat, Nov 17, 2012 at 1:00 PM, Olivier Delalleau wrote:
>
>> 2012/11/17 Gökhan Sever wrote:
>>
>>>
>>>
>>> On Sat, Nov 17, 2012 at 9:47 AM, Nathaniel Smith wrote:
>>>
 On Fri, Nov 16, 2012 at 9:53 PM, Gökhan Sever wrote:
 > Thanks for the explanations.
 >
 > For either case, I was expecting to get float32 as the resulting data
 > type, since float32 is large enough to contain the result. I am
 > wondering if changing the casting rule this way would require a lot of
 > modification in the NumPy code. Maybe as an alternative to the current
 > casting mechanism?
 >
 > I like the way that NumPy can convert to float64, as if these
 > data-types are a continuation of each other. But the conversion might
 > happen too early --at least in my opinion, as demonstrated in my
 > example.
 >
 > For instance, comparing this example to IDL surprises me:
 >
 > I16 np.float32()*5e38
 > O16 2.77749998e+42
 >
 > I17 (np.float32()*5e38).dtype
 > O17 dtype('float64')

 In this case, what's going on is that 5e38 is a Python float object,
 and Python float objects have double-precision, i.e., they're
 equivalent to np.float64's. So you're multiplying a float32 and a
 float64. I think most people will agree that in this situation it's
 better to use float64 for the output?

 -n
 ___
 NumPy-Discussion mailing list
 NumPy-Discussion@scipy.org
 http://mail.scipy.org/mailman/listinfo/numpy-discussion
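
A small illustration of the distinction (these particular results do not
depend on the value-based rules):

    import numpy as np

    x = np.float32(2.0)
    print(type(5e38))                    # <class 'float'>: double precision
    print((x * np.float64(5e38)).dtype)  # float64: mixed precision upcasts
    print(x * np.float32(5e38))          # inf: stays float32 and overflows
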

>>>
>>> OK, I see your point. Mixed operations between Python numeric objects
>>> and NumPy data objects require more attention.
>>>
>>> The following causes a float32 overflow --rather than casting to float64
>>> as in the Python float multiplication case-- and behaves as IDL does.
>>>
>>> I3 (np.float32()*np.float32(5e38))
>>> O3 inf
>>>
>>> However, these two still surprise me:
>>>
>>> I5 (np.float32()*1).dtype
>>> O5 dtype('float64')
>>>
>>> I6 (np.float32()*np.int32(1)).dtype
>>> O6 dtype('float64')
>>>
>>
>> That's because the current way of finding out the result's dtype is based
>> on input dtypes only (not on numeric values), and numpy.can_cast('int32',
>> 'float32') is False, while numpy.can_cast('int32', 'float64') is True (and
>> same for int64).
>> Thus it decides to cast to float64.
>>
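
The specific facts behind that explanation, for illustration:

    import numpy as np

    print(np.can_cast('int32', 'float32'))  # False: float32 has a 24-bit mantissa
    print(np.can_cast('int32', 'float64'))  # True: every int32 is exact in float64
    print(np.can_cast('int16', 'float32'))  # True: int16 fits in 24 bits
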
>
> It might be nice to revisit all the casting rules at some point, but
> current experience suggests that any changes will lead to cries of pain and
> outrage ;)
>
> Chuck
>
>
Can we at least put these examples into the tests?  Also, I think the
bigger issue was that, unlike deprecation of a function, it is much harder
to grep for particular operations, especially in a dynamic language like
python. What were intended as minor bugfixes ended up becoming much larger.

Has the casting table been added to the tests?  I think that will bring
much more confidence and assurances for future changes going forward.

Cheers!
Ben Root
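
A sketch of the kind of table test being asked for, restricted here to
dtype-only promotions that should hold across releases:

    import numpy as np

    def test_promotion_table_examples():
        assert np.result_type(np.float32, np.float64) == np.float64
        assert np.result_type(np.float32, np.int32) == np.float64
        assert (np.float32(1) * np.float32(2)).dtype == np.float32

    test_promotion_table_examples()
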
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Allowing 0-d arrays in np.take

2012-12-04 Thread Benjamin Root
On Tue, Dec 4, 2012 at 8:57 AM, Sebastian Berg
wrote:

> Hey,
>
> Maybe someone has an opinion about this (since it is in fact new
> behavior, it is undefined). `np.take` used to not allow 0-d/scalar
> input but did allow any other dimensions for the indices. I am thinking
> about changing this, so that:
>
> np.take(np.arange(5), 0)
>
> works. I was wondering if anyone has feelings about whether this should
> return a scalar or a 0-d array. Typically numpy prefers scalars for
> these cases (indexing would return a scalar too) for good reasons, so I
> guess that is correct. But since I caught myself wondering whether it
> should return a 0-d array, I thought I would ask here.
>
> Regards,
>
> Sebastian
>
>
At first, I was thinking that the output type should be based on what the
input type is.  So, if a scalar index was used, then a scalar value should
be returned.  But this wouldn't be true if the array had other dimensions.
So, perhaps it should always be an array.  The only other option is to
mimic the behavior of the array indexing, which wouldn't be a bad choice.

Cheers!
Ben Root
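
For concreteness, the two candidate return types differ in visible ways:

    import numpy as np

    s = np.int64(3)    # a scalar
    z = np.array(3)    # a 0-d array
    print(s.ndim, z.ndim)                  # 0 0
    print(np.isscalar(s), np.isscalar(z))  # True False
    z[...] = 7                             # a 0-d array can be written in
                                           # place; a scalar cannot
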
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proposal to drop python 2.4 support in numpy 1.8

2012-12-13 Thread Benjamin Root
As a point of reference, python 2.4 is on RH5/CentOS5.  While RH6 is the
current version, there are still enterprises that are using version 5.  Of
course, at this point, one really should be working on a migration plan and
shouldn't be doing new development on those machines...

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?

2012-12-13 Thread Benjamin Root
On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris <
charlesr.har...@gmail.com> wrote:

> The previous proposal to drop python 2.4 support garnered no opposition.
> How about dropping support for python 2.5 also?
>
> Chuck
>
>
matplotlib 1.2 supports py2.5.  I haven't seen any plan to move off of that
for 1.3.  Is there a compelling reason for dropping 2.5?

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Support for python 2.4 dropped. Should we drop 2.5 also?

2012-12-13 Thread Benjamin Root
My apologies... we support 2.6 and above.  +1 on dropping 2.5 support.

Ben

On Thu, Dec 13, 2012 at 1:12 PM, Benjamin Root  wrote:

> On Thu, Dec 13, 2012 at 12:38 PM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>> The previous proposal to drop python 2.4 support garnered no opposition.
>> How about dropping support for python 2.5 also?
>>
>> Chuck
>>
>>
> matplotlib 1.2 supports py2.5.  I haven't seen any plan to move off of
> that for 1.3.  Is there a compelling reason for dropping 2.5?
>
> Ben Root
>
>
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Insights / lessons learned from NumPy design

2013-01-09 Thread Benjamin Root
On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith  wrote:

> On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac  wrote:
> > I'm just a Python+NumPy user and not a CS type.
> > May I ask a naive question on this thread?
> >
> > Given the work that has (as I understand it) gone into
> > making NumPy usable as a C library, why is the discussion not
> > going in a direction like the following:
> > What changes to the NumPy code base would be required for it
> > to provide useful ndarray functionality in a C extension
> > to Clojure?  Is this simply incompatible with the goal that
> > Clojure compile to JVM byte code?
>
> IIUC that work was done on a fork of numpy which has since been
> abandoned by its authors, so... yeah, numpy itself doesn't have much
> to offer in this area right now. It could in principle with a bunch of
> refactoring (ideally not on a fork, since we saw how well that went),
> but I don't think most happy current numpy users are wishing they
> could switch to writing Lisp on the JVM or vice-versa, so I don't
> think it's surprising that no-one's jumped up to do this work.
>
>
If I could just point out that the attempt to fork numpy for the .NET work
was done back in the subversion days, and there was little-to-no effort to
incrementally merge back changes to master, and vice-versa.  With git as
our repository now, such work may be more feasible.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig wrote:

> Hi,
>
> On 14/01/2013 00:39, Nathaniel Smith wrote:
> > (The nice thing about np.filled() is that it makes np.zeros() and
> > np.ones() feel like clutter, rather than the reverse... not that I'm
> > suggesting ever getting rid of them, but it makes the API conceptually
> > feel smaller, not larger.)
> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in
> numpy for Matlab (and maybe others?) compatibility and are useful for
> that. Now that I've been "enlightened" by Python, I think that those
> functions (especially np.ones) are indeed clutter. Therefore I favor the
> introduction of these two new functions.
>
> However, I think Eric's remark about masked array API compatibility is
> important. I don't know what other names are possible. np.const?
>
> Or maybe np.tile is also useful for that same purpose? In that case
> adding a dtype argument to np.tile would be useful.
>
> best,
> Pierre
>
>
I am also +1 on the idea of having a filled() and filled_like() function (I
learned a long time ago to just do a = np.empty() and a.fill() rather than
the multiplication trick I learned from Matlab).  However, the collision
with the masked array API is a non-starter for me.  np.const() and
np.const_like() probably make the most sense, but I would prefer a verb
over a noun.

Ben Root
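
The two idioms being compared, for reference:

    import numpy as np

    # allocate once and fill in place ...
    a = np.empty((3, 3))
    a.fill(np.nan)
    # ... versus the Matlab-style trick, which builds a temporary ones-array:
    b = np.nan * np.ones((3, 3))
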
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 12:27 PM, Eric Firing  wrote:

> On 2013/01/14 6:15 AM, Olivier Delalleau wrote:
> > - I agree the name collision with np.ma.filled is a problem. I have no
> > better suggestion though at this point.
>
> How about "initialized()"?
>

A verb! +1 from me!

For those wondering, I have a personal rule that because functions *do*
something, they really should have verbs for their names.  I have had to
learn to read functions like "ones" and "empty" as "give me ones" or
"give me an empty array".

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-14 Thread Benjamin Root
On Mon, Jan 14, 2013 at 1:56 PM, David Warde-Farley <
d.warde.far...@gmail.com> wrote:

> On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig
>  wrote:
> > In [8]: tile(nan, (3,3))  # (it's a verb!)
>
> tile, in my opinion, is useful in some cases (for people who think in
> terms of repmat()) but not very NumPy-ish. What I'd like is a function
> that takes
>
> - an initial array_like "a"
> - a shape "s"
> - optionally, a dtype (otherwise inherit from a)
>
> and broadcasts "a" to the shape "s". In the case of scalars this is
> just a fill. In the case of, say, a (5,) vector and a (10, 5) shape,
> this broadcasts across rows, etc.
>
> I don't think it's worth special-casing scalar fills (except perhaps
> as an implementation detail) when you have rich broadcasting semantics
> that are already a fundamental part of NumPy, allowing for a much
> handier primitive.
>
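
A sketch of that primitive using broadcasting assignment (the filled name
is hypothetical here; numpy 1.10 later gained np.broadcast_to, which
covers the read-only-view case):

    import numpy as np

    def filled(a, shape, dtype=None):
        a = np.asarray(a)
        out = np.empty(shape, dtype=dtype if dtype is not None else a.dtype)
        out[...] = a  # broadcasting assignment: scalars fill, vectors repeat
        return out

    print(filled(np.nan, (2, 2)))
    print(filled(np.arange(5), (10, 5)).shape)  # broadcast across rows: (10, 5)
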

I have similar problems with "tile".  I learned it for a particular use in
numpy, and it would be hard for me to see it for another (contextually)
different use.

I do like the way you are thinking in terms of the broadcasting semantics,
but I wonder if that is a bit awkward.  What I mean is, if one were to use
broadcasting semantics for creating an array, wouldn't one have just simply
used broadcasting anyway?  The point of broadcasting is to _avoid_ the
creation of unneeded arrays.  But maybe I can be convinced with some
examples.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Shouldn't all in-place operations simply return self?

2013-01-17 Thread Benjamin Root
On Thu, Jan 17, 2013 at 8:54 AM, Jim Vickroy  wrote:

>  On 1/16/2013 11:41 PM, Nathaniel Smith wrote:
>
> On 16 Jan 2013 17:54,  wrote:
> > >>> a = np.random.random_integers(0, 5, size=5)
> > >>> b = a.sort()
> > >>> b
> > >>> a
> > array([0, 1, 2, 5, 5])
> >
> > >>> b = np.random.shuffle(a)
> > >>> b
> > >>> b = np.random.permutation(a)
> > >>> b
> > array([0, 5, 5, 2, 1])
> >
> > How do I remember if shuffle shuffles or permutes ?
> >
> > Do we have a list of functions that are inplace?
>
> I rather like the convention used elsewhere in Python of naming in-place
> operations with present tense imperative verbs, and out-of-place operations
> with past participles. So you have sort/sorted, reverse/reversed, etc.
>
> Here this would suggest we name these two operations as either shuffle()
> and shuffled(), or permute() and permuted().
>
>
> I like this (tense) suggestion.  It seems easy to remember.  --jv
>
>
>
And another score for functions as verbs!

:-P

Ben Root
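
For what it's worth, numpy already follows this split for sorting:

    import numpy as np

    a = np.array([3, 1, 2])
    b = np.sort(a)   # out-of-place ("sorted" style): returns a new array
    a.sort()         # in-place (imperative verb): returns None
    print(a, b)      # [1 2 3] [1 2 3]
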
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-17 Thread Benjamin Root
On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing  wrote:

> On 2013/01/17 4:13 AM, Pierre Haessig wrote:
> > Hi,
> >
> > On 14/01/2013 20:05, Benjamin Root wrote:
> >> I do like the way you are thinking in terms of the broadcasting
> >> semantics, but I wonder if that is a bit awkward.  What I mean is, if
> >> one were to use broadcasting semantics for creating an array, wouldn't
> >> one have just simply used broadcasting anyway?  The point of
> >> broadcasting is to _avoid_ the creation of unneeded arrays.  But maybe
> >> I can be convinced with some examples.
> >
> > I feel that one of the points of the discussion is: although a new (or
> > not so new...) function to create a filled array would be more elegant
> > than the existing pair of functions "np.zeros" and "np.ones", there are
> > maybe not so many use cases for filled arrays *other than zero values*.
> >
> > I can remember having initialized a non-zero array *some months ago*.
> > For the anecdote it was a vector of discretized vehicule speed values
> > which I wanted to be initialized with a predefined mean speed value
> > prior to some optimization. In that usecase, I really didn't care about
> > the performance of this initialization step.
> >
> > So my overall feeling after this thread is
> >   - *yes* a single dedicated fill/init/someverb function would give a
> > slightly better API,
> >   -  but *no* it's not important because np.empty and np.zeros covers 95
> > % usecases !
>
> I agree with your summary and conclusion.
>
> Eric
>
>
Can we at least have np.nans() and np.infs() functions?  This should
cover an additional 4% of use-cases.

Ben Root

P.S. - I know they aren't verbs...
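
For illustration, the requested helpers are two-liners on top of the
empty()/fill() pattern (hypothetical names, not numpy API):

    import numpy as np

    def nans(shape, dtype=float):
        a = np.empty(shape, dtype)
        a.fill(np.nan)
        return a

    def infs(shape, dtype=float):
        a = np.empty(shape, dtype)
        a.fill(np.inf)
        return a
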
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-18 Thread Benjamin Root
On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi wrote:

> On 17/01/2013 23:27, Mark Wiebe wrote:
> > Would it be too weird or clumsy to extend the empty and empty_like
> > functions to do the filling?
> >
> > np.empty((10, 10), fill=np.nan)
> > np.empty_like(my_arr, fill=np.nan)
>
> Wouldn't it be more natural to extend the ndarray constructor?
>
> np.ndarray((10, 10), fill=np.nan)
>
> It looks more natural to me. In this way it is not possible to have the
> _like extension, but I don't see it as a major drawback.
>
>
> Cheers,
> Daniele
>
>
This isn't a bad idea.  Although, I would wager that most people, like
myself, use np.array() and np.array_like() instead of np.ndarray().  We
should also double-check and see how well that would fit in with the other
constructors like masked arrays and matrix objects.

Ben Root
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] New numpy functions: filled, filled_like

2013-01-18 Thread Benjamin Root
On Fri, Jan 18, 2013 at 11:36 AM, Daniele Nicolodi wrote:

> On 18/01/2013 15:19, Benjamin Root wrote:
> >
> >
> > On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi wrote:
> >
> > On 17/01/2013 23:27, Mark Wiebe wrote:
> > > Would it be too weird or clumsy to extend the empty and empty_like
> > > functions to do the filling?
> > >
> > > np.empty((10, 10), fill=np.nan)
> > > np.empty_like(my_arr, fill=np.nan)
> >
> > Wouldn't it be more natural to extend the ndarray constructor?
> >
> > np.ndarray((10, 10), fill=np.nan)
> >
> > It looks more natural to me. In this way it is not possible to have
> the
> > _like extension, but I don't see it as a major drawback.
> >
> >
> > Cheers,
> > Daniele
> >
> >
> > This isn't a bad idea.  Although, I would wager that most people, like
> > myself, use np.array() and np.array_like() instead of np.ndarray().  We
> > should also double-check and see how well that would fit in with the
> > other constructors like masked arrays and matrix objects.
>
> Hello Ben,
>
> I don't really get what you mean by this. np.array() constructs a numpy
> array from an array-like object, np.ndarray() accepts a dimensions tuple
> as its first parameter, and I don't see any np.array_like in the current
> numpy release.
>
> Cheers,
> Daniele
>
>
My bad, I had a brain-fart and got mixed up.  I was thinking of
np.empty().  In fact, I never use np.ndarray(), I use np.empty().  Besides
np.ndarray() being the actual constructor, what is the difference between
them?

Ben Root
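
For reference: both allocate uninitialized memory, but np.ndarray is the
type's low-level constructor, which additionally accepts buffer, offset,
and strides arguments for wrapping existing memory, while np.empty is the
documented convenience function:

    import numpy as np

    a = np.ndarray((2, 2))  # works, but discouraged for everyday use
    b = np.empty((2, 2))    # the supported spelling
    # the low-level constructor can wrap an existing buffer:
    wrapped = np.ndarray((4,), dtype=np.uint8, buffer=b"\x01\x02\x03\x04")
    print(wrapped)          # [1 2 3 4]
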
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion

