David Cournapeau ar.media.kyoto-u.ac.jp> writes:
> Unfortunately, we can't, because we would lose generality: we need to
> compute median on any axis, not only the last one. The proper solution
> would be to have a sort/max/min/etc... which knows about nan in numpy,
> which is what Chuck and I a
Peter Saffrey wrote:
>
> I've found that if I just cut nans from the list and use regular numpy median,
> it is quicker - 10 times slower than list median, rather than 35 times slower.
> Could you just wire nanmedian to do it this way?
Unfortunately, we can't, because we would lose generality: w
David Cournapeau ar.media.kyoto-u.ac.jp> writes:
> Still, it is indeed really slow for your case; when I fixed nanmean and
> co, I did not know much about numpy, I just wanted them to give the
> right answer :) I think this can be made faster, especially for your case
> (where the axis along which
David Cournapeau wrote:
>
> The isnan thing is surprising, because the whole point of having an isnan
> is that you can do it without branching. I checked, and numpy does use
> the macro of isnan, not the function (glibc has both).
Ok, see my patch #913 for this. The slowdown is actually specific to
On Sun, Sep 21, 2008 at 12:56 AM, David Cournapeau <
[EMAIL PROTECTED]> wrote:
> David Cournapeau wrote:
> > Anne Archibald wrote:
> >> If users are concerned about performance, it's worth noting that on
> >> some machines nans force a fallback to software floating-point
> >> handling, with a corr
David Cournapeau wrote:
> Anne Archibald wrote:
>> If users are concerned about performance, it's worth noting that on
>> some machines nans force a fallback to software floating-point
>> handling, with a corresponding very large performance hit. This
>> includes some but not all x86 (and I think x
Anne Archibald wrote:
>
> If users are concerned about performance, it's worth noting that on
> some machines nans force a fallback to software floating-point
> handling, with a corresponding very large performance hit. This
> includes some but not all x86 (and I think x86-64) CPUs. How this
> comp
On Sat, Sep 20, 2008 at 11:02 AM, Jake Harris <[EMAIL PROTECTED]> wrote:
>
> Because you're always working with probabilities, there is almost always no
> ambiguity...whenever NaN is encountered, 0 is what is desired.
>
...of course, division presents a good counterexample.
> Bad idea?
>
So prob
(sorry for starting a new thread...I wasn't subscribed yet)
Stéfan van der Walt wrote the following on 09/19/2008 02:10 AM:
>
> So am I. In all my use cases, NaNs indicate trouble.
>
I can provide a use case where NaNs do not indicate trouble. In fact, they
need to be treated as 0. For example
Charles R Harris wrote:
>
>
>
> I would be happy to implement nan sorts if someone can provide me with
> a portable and easy way to detect nans for single, double, and long
> double floats. And not have it fail if the architecture doesn't
> support nans. I think getting all the needed nan detection
2008/9/19 Eric Firing <[EMAIL PROTECTED]>:
> Pierre GM wrote:
>
>>> It seems to me that there are pragmatic reasons
>>> why people work with NaNs for missing values,
>>> that perhaps should not be dismissed so quickly.
>>> But maybe I am overlooking a simple solution.
>>
>> nansomething solutions tend
On Sat, Sep 20, 2008 at 01:15, Charles R Harris
<[EMAIL PROTECTED]> wrote:
> I would be happy to implement nan sorts if someone can provide me with a
> portable and easy way to detect nans for single, double, and long double
> floats. And not have it fail if the architecture doesn't support nans.
On Fri, Sep 19, 2008 at 11:41 PM, David Cournapeau <
[EMAIL PROTECTED]> wrote:
> Anne Archibald wrote:
> >
> > I, on the other hand, was making specifically that suggestion: users
> > should not use nans to indicate missing values. Users should use
> > masked arrays to indicate missing values.
>
>
Anne Archibald wrote:
>
> I, on the other hand, was making specifically that suggestion: users
> should not use nans to indicate missing values. Users should use
> masked arrays to indicate missing values.
I agree it is the nicest solution in theory, but I think it is
impractical (as mentioned by
2008/9/19 David Cournapeau <[EMAIL PROTECTED]>:
> I guess my formulation was poor: I never use NaN as missing values
> because I never use missing values, which is why I wanted the opinion of
> people who use NaN in a different manner (because I don't have a good
> idea on how those people would l
Robert Kern wrote:
> On Fri, Sep 19, 2008 at 22:25, David Cournapeau
> <[EMAIL PROTECTED]> wrote:
>
>
> How, exactly? ndarray.min() is where the implementation is.
>
Ah, I keep forgetting those are implemented in the array object, sorry
for that. Now I understand Stéfan's point. Do I under
Alan G Isaac wrote:
> On 9/19/2008 4:35 AM David Cournapeau apparently wrote:
>> I never use NaN as missing value
>
> What do you use?
>
> Recently I needed to fill a 2d array with values
> from computations that could "go wrong".
> I created an array of NaN and then replaced
> the elements where t
On Fri, Sep 19, 2008 at 22:25, David Cournapeau
<[EMAIL PROTECTED]> wrote:
> Stéfan van der Walt wrote:
>>
>> Why shouldn't we have "nanmin"-like behaviour for the C min itself?
>>
>
> Ah, I was not arguing we should not do it in C, but rather we did not
> have to do it in C. The current behavior for
Stéfan van der Walt wrote:
>
> Why shouldn't we have "nanmin"-like behaviour for the C min itself?
>
Ah, I was not arguing we should not do it in C, but rather we did not
have to do it in C. The current behavior for nan with functions relying on
ordering is broken; if someone prefers fixing it in C
On Friday 19 September 2008 17:25:53 Alan G Isaac wrote:
> On 9/19/2008 4:54 PM Pierre GM apparently wrote:
> > Another way is
> > ma.array(np.empty(yourshape,yourdtype), mask=True)
> > which should work with earlier versions.
>
> Seems like ``mask`` would be a natural
> keyword for ``ma.empty``?
On 9/19/2008 4:54 PM Pierre GM apparently wrote:
> Another way is
> ma.array(np.empty(yourshape,yourdtype), mask=True)
> which should work with earlier versions.
Seems like ``mask`` would be a natural
keyword for ``ma.empty``?
Thanks,
Alan Isaac
On Friday 19 September 2008 16:35:23 Alan G Isaac wrote:
> On 9/19/2008 4:54 AM Pierre GM apparently wrote:
> > I know. I was more dreading the time when MaskedArrays would have to be
> > ported to C. In a way, that would probably simplify a few issues. OTOH, I
> > don't really see it happening any
On Friday 19 September 2008 16:28:34 Alan G Isaac wrote:
> On 9/19/2008 11:46 AM Pierre GM apparently wrote:
> a.mask=True
> This is great, but is apparently
> new behavior as of NumPy 1.2?
I'm not sure, sorry. Another way is
ma.array(np.empty(yourshape,yourdtype), mask=True)
which should w
On 9/19/2008 4:54 AM Pierre GM apparently wrote:
> I know. I was more dreading the time when MaskedArrays would have to be
> ported to C. In a way, that would probably simplify a few issues. OTOH, I
> don't really see it happening any time soon.
Is this possibly a GSoC sized project?
Alan Isa
On 9/19/2008 11:46 AM Pierre GM apparently wrote:
a.mask=True
This is great, but is apparently
new behavior as of NumPy 1.2?
Alan Isaac
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-d
On 9/19/2008 1:58 PM Robert Kern apparently wrote:
> there are no objects inside non-object arrays. There is
> nothing with identity inside the arrays to compare against.
Got it.
Thanks.
Alan
On Friday 19 September 2008 14:01:13 Eric Firing wrote:
> Pierre GM wrote:
> 2) convenient interfacing with extension code in C or C++.
>
> The latter is a factor in the present use of nan in matplotlib; using
> nan for missing values in an array passed into extension code saves
> having to pass a
Pierre GM wrote:
>> It seems to me that there are pragmatic reasons
>> why people work with NaNs for missing values,
>> that perhaps should not be dismissed so quickly.
>> But maybe I am overlooking a simple solution.
>
> nansomething solutions tend to be considerably faster, that might be one
> re
On Fri, Sep 19, 2008 at 11:34, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> On 9/19/2008 12:02 PM Peter Saffrey apparently wrote:
>> >>> a = array([1,2,nan])
>> >>> nan in a
>> False
>
> Huh. I'm inclined to call this a bug,
> since normal Python behavior is that
> ``in`` should check for identity::
On 9/19/2008 12:02 PM Peter Saffrey apparently wrote:
> >>> a = array([1,2,nan])
> >>> nan in a
> False
Huh. I'm inclined to call this a bug,
since normal Python behavior is that
``in`` should check for identity::
>>> xl = [1.,np.nan]
>>> np.nan in xl
True
Alan
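For reference, a minimal sketch of the difference, assuming a recent NumPy: a Python list checks identity before equality, whereas `in` on an array compares element values with `==`, which is never true for NaN. `np.isnan(...).any()` is one way to ask whether an array holds a NaN. The variable names are only illustrative.

```python
import numpy as np

xl = [1.0, np.nan]
a = np.array([1.0, 2.0, np.nan])

# List membership tries identity first, so the very same nan object is found.
in_list = np.nan in xl

# Array membership compares values with ==, and nan == nan is False.
in_array = np.nan in a

# Explicit test for the presence of any NaN in the array.
has_nan = bool(np.isnan(a).any())
```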
On Fri, Sep 19, 2008 at 1:11 AM, David Cournapeau <
[EMAIL PROTECTED]> wrote:
> Anne Archibald wrote:
> >
> > Well, for example, you might ask that all the non-nan elements be in
> > order, even if you don't specify where the nan goes.
>
>
> Ah, there are two problems, then:
>- sort
>- how
On 9/19/2008 11:46 AM Pierre GM apparently wrote:
> No, but you may do the opposite: just start with an array completely masked,
> and unmask it as you need:
Very useful example.
I did not understand this possibility.
Alan
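A rough sketch of that possibility, assuming a current numpy.ma: build the array fully masked with the `ma.array(np.empty(...), mask=True)` form mentioned in this thread, then assign results as they succeed. `risky` is a made-up stand-in for a computation that can "go wrong".

```python
import math
import numpy as np
import numpy.ma as ma

def risky(i, j):
    # Hypothetical computation; raises ValueError when i < j.
    return math.sqrt(i - j)

shape = (2, 2)
result = ma.array(np.empty(shape, float), mask=True)  # every element masked

for i, j in np.ndindex(*shape):
    try:
        result[i, j] = risky(i, j)  # assignment unmasks this element
    except ValueError:
        pass                        # failed computation stays masked

largest = result.max()    # maximum over the unmasked elements only
n_valid = result.count()  # number of unmasked elements
```

Because the failures stay masked rather than being stored as NaN, a NaN that later shows up in the data can only have come from the computation itself.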
On 9/19/2008 11:46 AM Pierre GM apparently wrote:
> You can't compare NaNs to anything. How do you know this np.miss is a masked
> value, when np.sqrt(-1.) is NaN ?
I thought you could use ``is``.
E.g.,
>>> np.nan == np.nan
False
>>> np.nan is np.nan
True
Alan
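A small sketch of why `is` does not go very far here, assuming current NumPy behaviour: a NaN produced by a computation, or read back out of an array, is a different object from `np.nan`, so only `np.isnan` recognises it.

```python
import numpy as np

a = np.array([1.0, np.nan])
with np.errstate(invalid="ignore"):
    computed = np.sqrt(np.float64(-1.0))  # NaN from an invalid operation

same_object = np.nan is np.nan   # one module-level float object
from_array = a[1] is np.nan      # indexing builds a new scalar object
from_calc = computed is np.nan   # a fresh NaN value, not the same object
detected = bool(np.isnan(a[1]) and np.isnan(computed))
```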
On Friday 19 September 2008 12:02:08 Peter Saffrey wrote:
> Alan G Isaac american.edu> writes:
> > Recently I needed to fill a 2d array with values
> > from computations that could "go wrong".
> Should I take the earlier advice and switch to masked arrays?
>
> Peter
Yes. As you've noticed, you c
Alan G Isaac american.edu> writes:
> Recently I needed to fill a 2d array with values
> from computations that could "go wrong".
> I created an array of NaN and then replaced
> the elements where the computation produced
> a useful value. I then applied ``nanmax``,
> to get the maximum of the us
On Friday 19 September 2008 11:36:17 Alan G Isaac wrote:
> On 9/19/2008 11:09 AM Stefan Van der Walt apparently wrote:
> > Masked arrays. Using NaN's for missing values is dangerous. You may
> > do some operation, which generates invalid results, and then you have
> > a mixed bag of missing and i
On 9/19/2008 11:09 AM Stefan Van der Walt apparently wrote:
> Masked arrays. Using NaN's for missing values is dangerous. You may
> do some operation, which generates invalid results, and then you have
> a mixed bag of missing and invalid values.
That rather evades my full question, I think?
On 19 Sep 2008, at 16:07 , Alan G Isaac wrote:
> On 9/19/2008 4:35 AM David Cournapeau apparently wrote:
>> I never use NaN as missing value
>
> What do you use?
Masked arrays. Using NaN's for missing values is dangerous. You may
do some operation, which generates invalid results, and then you
On 9/19/2008 4:35 AM David Cournapeau apparently wrote:
> I never use NaN as missing value
What do you use?
Recently I needed to fill a 2d array with values
from computations that could "go wrong".
I created an array of NaN and then replaced
the elements where the computation produced
a useful va
2008/9/19 David Cournapeau <[EMAIL PROTECTED]>:
> But cannot this be fixed at the python level of the max function ? I
Why shouldn't we have "nanmin"-like behaviour for the C min itself?
I'd rather have a specialised function to deal with the rare kinds of
datasets where NaNs are guaranteed never
Peter Saffrey wrote:
>
> I've posted my test code below, which gives me the results:
>
> $ ./arrayspeed3.py
> list build time: 0.01
> list median time: 0.01
> array nanmedian time: 0.36
>
> I must have done something wrong to hobble nanmedian in this way... I'm quite
> new to numpy, so feel free to
Peter Saffrey wrote:
> Pierre GM gmail.com> writes:
>
>> I think there were some changes on the C side of numpy between 1.0 and 1.1,
>> you may have to recompile scipy and matplotlib from sources. What versions
>> are you using for those 2 packages ?
>>
>
> $ dpkg -l | grep scipy
> ii python-sc
Pierre GM gmail.com> writes:
> I think there were some changes on the C side of numpy between 1.0 and 1.1,
> you may have to recompile scipy and matplotlib from sources. What versions
> are you using for those 2 packages ?
>
$ dpkg -l | grep scipy
ii python-scipy
David Cournapeau ar.media.kyoto-u.ac.jp> writes:
> It may be that nanmedian is slow. But I would sincerely be surprised if
> it were slower than python list, except for some pathological cases, or
> maybe a bug in nanmedian. What do your data look like ? (size, number of
> nan, etc...)
>
I've po
Peter Saffrey wrote:
>
> I rejoiced when I saw this answer, because it looks like a function I can just
> drop in and it works. Unfortunately, nanmedian seems to be quite a bit slower
> than just using lists (ignoring nan values from my experiments) and a
> home-brew implementation of median. I
On Friday 19 September 2008 05:51:55 Peter Saffrey wrote:
> I would like to try the masked array approach, but the Ubuntu packages for
> scipy and matplotlib depend on numpy. Does anybody know whether I can
> naively do "sudo python setup.py install" on a more modern numpy without
> disturbing sci
David Cournapeau ar.media.kyoto-u.ac.jp> writes:
> You can use nanmean (from scipy.stats):
>
I rejoiced when I saw this answer, because it looks like a function I can just
drop in and it works. Unfortunately, nanmedian seems to be quite a bit slower
than just using lists (ignoring nan values fr
Stéfan van der Walt wrote:
>
> So am I. In all my use cases, NaNs indicate trouble.
Yes, so I would like to have the opinion of people with other usage
than ours.
>
> Because we have x.max() silently ignoring NaNs, which causes a lot of
> head-scratching, swearing and failed experiments.
But ca
2008/9/19 David Cournapeau <[EMAIL PROTECTED]>:
> Stéfan van der Walt wrote:
>>
>> I agree completely.
>
> Me too, but I am extremely biased toward nan is always bogus by my own
> usage of numpy/scipy (I never use NaN as missing value, and nan is
> always caused by divide by 0 and co).
So am I. I
On Friday 19 September 2008 04:31:38 David Cournapeau wrote:
> Pierre GM wrote:
> > That said, numpy.nanmin, numpy.nansum... don't come with the heavy
> > machinery of numpy.ma, and are therefore faster.
> > I'm really going to have to learn C.
>
> FWIW, nanmean/nanmean/etc... are written in python
Stéfan van der Walt wrote:
>
> I agree completely.
Me too, but I am extremely biased toward nan is always bogus by my own
usage of numpy/scipy (I never use NaN as missing value, and nan is
always caused by divide by 0 and co).
I like that sort raise an exception by default with NaN: it breaks the
Pierre GM wrote:
> That said, numpy.nanmin, numpy.nansum... don't come with the heavy machinery
> of numpy.ma, and are therefore faster.
> I'm really going to have to learn C.
>
FWIW, nanmean/nanmean/etc... are written in python,
cheers,
David
On Friday 19 September 2008 04:10:24 Anne Archibald wrote:
> (is there a convenience
> function that makes a masked array with a mask everywhere the data is
> nan?).
numpy.ma.fix_invalid, that masks your Nans and Infs and sets the underlying
data to some filling value. That way, you don't carry
2008/9/19 Anne Archibald <[EMAIL PROTECTED]>:
> I think the numpy attitude to nans should be that they are unexpected
> bogus values that signify that something went wrong with the
> calculation somewhere. They can be left in place for most operations,
> but any operation that depends on the value
2008/9/19 Pierre GM <[EMAIL PROTECTED]>:
> On Friday 19 September 2008 03:11:05 David Cournapeau wrote:
>
>> Hm, I am always puzzled when I think about nan handling :) It always
>> seems there is no good answer.
>
> Which is why we have masked arrays, of course ;)
I think the numpy attitude to nan
On Friday 19 September 2008 03:11:05 David Cournapeau wrote:
> Hm, I am always puzzled when I think about nan handling :) It always
> seems there is no good answer.
Which is why we have masked arrays, of course ;)
Anne Archibald wrote:
>
> Well, for example, you might ask that all the non-nan elements be in
> order, even if you don't specify where the nan goes.
Ah, there are two problems, then:
- sort
- how median uses sort.
For sort, I don't know how sort speed would be influenced by treating
nan.
2008/9/19 David Cournapeau <[EMAIL PROTECTED]>:
> Anne Archibald wrote:
>>
>> That was in amax/amin. Pretty much every other function that does
>> comparisons needs to be fixed to work with nans. In some cases it's
>> not even clear how: where should a sort put the nans in an array?
>
> The problem
Anne Archibald wrote:
>
> That was in amax/amin. Pretty much every other function that does
> comparisons needs to be fixed to work with nans. In some cases it's
> not even clear how: where should a sort put the nans in an array?
The problem is more on how the functions use sort than sort itself i
2008/9/18 David Cournapeau <[EMAIL PROTECTED]>:
> Anne Archibald wrote:
>>
>> I don't think I agree:
>>
>> In [4]: np.median([1,3,nan])
>> Out[4]: 3.0
>>
>> In [5]: np.median([1,nan,3])
>> Out[5]: nan
>>
>> In [6]: np.median([nan,1,3])
>> Out[6]: 1.0
>>
>
> I was referring to the fact that if you h
Anne Archibald wrote:
>
> I don't think I agree:
>
> In [4]: np.median([1,3,nan])
> Out[4]: 3.0
>
> In [5]: np.median([1,nan,3])
> Out[5]: nan
>
> In [6]: np.median([nan,1,3])
> Out[6]: 1.0
>
I was referring to the fact that if you have nan in your array, you
should use nanmean if you want to i
2008/9/18 David Cournapeau <[EMAIL PROTECTED]>:
> Peter Saffrey wrote:
>>
>> Is this the correct behavior for median with nan?
>
> That's the expected behavior, at least :) (this is also the expected
> behavior of most math packages I know, including matlab and R, so this
> should not be too surpri
Peter Saffrey wrote:
>
> Is this the correct behavior for median with nan?
That's the expected behavior, at least :) (this is also the expected
behavior of most math packages I know, including matlab and R, so this
should not be too surprising if you have used those).
> Is there a fix for
> thi
On Thu, Sep 18, 2008 at 12:23 PM, Pierre GM <[EMAIL PROTECTED]> wrote:
> On Thursday 18 September 2008 13:31:18 Peter Saffrey wrote:
> > The version in the Ubuntu package repository. It says 1:1.0.4-6ubuntu3.
>
> So it's 1.0 ? It's fairly old, that would explain.
>
> > > if you don't give an axis
On Thursday 18 September 2008 13:31:18 Peter Saffrey wrote:
> The version in the Ubuntu package repository. It says 1:1.0.4-6ubuntu3.
So it's 1.0 ? It's fairly old, that would explain.
> > if you don't give an axis
> > parameter, you should get the median of the flattened array, therefore a
> > s
On Thu, Sep 18, 2008 at 11:31 AM, Peter Saffrey <[EMAIL PROTECTED]> wrote:
> Pierre GM gmail.com> writes:
>
> > Mmh, typo?
> >
>
> Yes, apologies. I was aiming for thorough, but ended up just careless. It's
> been a long day.
>
> > Ohoh. What version of numpy are you using ?
>
> The version in
Pierre GM gmail.com> writes:
> Mmh, typo?
>
Yes, apologies. I was aiming for thorough, but ended up just careless. It's been
a long day.
> Ohoh. What version of numpy are you using ?
The version in the Ubuntu package repository. It says 1:1.0.4-6ubuntu3.
> if you don't give an axis
> param
On Thursday 18 September 2008 10:59:12 Peter Saffrey wrote:
> I had looked at masked arrays, but couldn't quite get them to work.
That's unfortunate.
> >>> from numeric import *
Mmh, typo?
> >>> from pylab import rand
> >>> a = rand(10,3)
> >>> a[a > 0.8] = nan
> >>> m = ma.masked_array(a,
Thu, 9/18/08, Peter Saffrey <[EMAIL PROTECTED]> wrote:
> From: Peter Saffrey <[EMAIL PROTECTED]>
> Subject: Re: [Numpy-discussion] Medians that ignore values
> To: numpy-discussion@scipy.org
> Date: Thursday, September 18, 2008, 10:59 AM
> physics.ucf.edu> writes:
>
physics.ucf.edu> writes:
> Currently the only way you can handle NaNs is by using masked arrays.
> Create a mask by doing isfinite(a), then call the masked array
> median(). There's an example here:
>
> http://sd-2116.dedibox.fr/pydocweb/doc/numpy.ma/
>
I had looked at masked arrays, bu
> You might want to try isfinite() to first remove nan, +/- infinity
> before doing that.
> numpy.median(a[numpy.isfinite(a)])
We just had this discussion a month or two ago, I think even on this
list, and continued it at the SciPy conference.
The problem with
numpy.median(a[numpy.isfinite(a)])
Nadav Horesh wrote:
> I think you need to use masked arrays.
>
> Nadav
>
>
> -Original Message-
> From: [EMAIL PROTECTED] on behalf of Peter Saffrey
> Sent: Thu 18-September-08 14:27
> To: numpy-discussion@scipy.org
> Subject: [Numpy-discussion] Medians that ignore values
>
I think you need to use masked arrays.
Nadav
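A minimal sketch of the masked-array route for the triples case, assuming a current numpy.ma (`ma.masked_invalid` and `ma.median` may not exist in this form in the 1.0-era releases discussed elsewhere in the thread). The data are made up for illustration.

```python
import numpy as np
import numpy.ma as ma

# Made-up triples; nan marks a value to be ignored.
triples = np.array([[1.0, 2.0, 3.0],
                    [4.0, np.nan, 6.0],
                    [np.nan, np.nan, 9.0]])

masked = ma.masked_invalid(triples)   # masks NaN and +/-inf entries
medians = ma.median(masked, axis=1)   # one median per triple, masked values skipped
```

Working along `axis=1` keeps the one-median-per-triple shape, which the flattened `numpy.median(a[numpy.isfinite(a)])` form does not.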
-Original Message-
From: [EMAIL PROTECTED] on behalf of Peter Saffrey
Sent: Thu 18-September-08 14:27
To: numpy-discussion@scipy.org
Subject: [Numpy-discussion] Medians that ignore values
I have data from biological experiments that is represented as a list of
I have data from biological experiments that is represented as a list of
about 5000 triples. I would like to convert this to a list of the median
of each triple. I did some profiling and found that numpy was about
12 times faster for this application than using regular Python lists and
a l