On Wed, Jan 25, 2012 at 12:03 AM, Charles R Harris
wrote:
>
>
> On Tue, Jan 24, 2012 at 4:21 PM, Kathleen M Tacina
> wrote:
>>
>> I found something similar, with a very simple example.
>>
>> On 64-bit linux, python 2.7.2, numpy development version:
>>
>> In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
On Tue, Jan 24, 2012 at 4:21 PM, Kathleen M Tacina <
kathleen.m.tac...@nasa.gov> wrote:
> I found something similar, with a very simple example.
>
> On 64-bit linux, python 2.7.2, numpy development version:
>
> In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
>
> In [23]: a.mean()
> Out[23]: 4034.16357421875
On Tue, Jan 24, 2012 at 7:21 PM, eat wrote:
> Hi
>
> On Wed, Jan 25, 2012 at 1:21 AM, Kathleen M Tacina <
> kathleen.m.tac...@nasa.gov> wrote:
>
>> I found something similar, with a very simple example.
>>
>> On 64-bit linux, python 2.7.2, numpy development version:
>>
>> In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
Note that if you are OK with an approximate solution, and you can assume
your data is somewhat shuffled, a simple online algorithm that uses no
memory consists of:
- choosing a small step size delta
- initializing your percentile p to a more or less random value (a
meaningful guess is better, though); one way the update step could look
is sketched below
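As a rough illustration of that kind of online update (the message above is
cut off), here is a minimal stochastic-approximation sketch; the function
name, the update rule, and the parameter values are assumptions for
illustration, not taken from the original post:

import numpy as np

def online_percentile(stream, q=0.95, delta=0.01, p0=0.0):
    """Rough online estimate of the q-th quantile of a data stream.

    Assumed update rule: nudge the estimate up by delta*q when a sample
    falls above it, and down by delta*(1 - q) when it falls below, so the
    estimate drifts toward the value exceeded by a fraction (1 - q) of the
    data.  Works best on shuffled data, as noted above.
    """
    p = p0
    for x in stream:
        if x > p:
            p += delta * q
        elif x < p:
            p -= delta * (1.0 - q)
    return p

# Toy check against the exact percentile on shuffled synthetic data.
data = np.random.permutation(np.random.normal(size=100000))
print(online_percentile(data), np.percentile(data, 95))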
Thanks for your responses.
Because of the size of the dataset, I will still end up with the memory
error if I calculate the median for each file; additionally, the files are
not all the same size. I believe this memory problem will still arise with
the cumulative distribution calculation, and I'm not sure
On Tue, Jan 24, 2012 at 6:22 PM, questions anon
wrote:
> I need some help understanding how to loop through many arrays to calculate
> the 95th percentile.
> I can easily do this by using numpy.concatenate to make one big array and
> then finding the 95th percentile using numpy.percentile but this causes a
> memory error when I want to run this on 100's of netcdf files.
On Mon, Jan 16, 2012 at 8:14 AM, Charles R Harris wrote:
>
>
> On Mon, Jan 16, 2012 at 8:52 AM, Charles R Harris <
> charlesr.har...@gmail.com> wrote:
>
>>
>>
>> On Mon, Jan 16, 2012 at 8:37 AM, Bruce Southey wrote:
>>
>>> On 01/14/2012 04:31 PM, Charles R Harris wrote:
>>>
>>> I've put up
On Tue, Jan 24, 2012 at 7:29 AM, Kathleen M Tacina <
kathleen.m.tac...@nasa.gov> wrote:
> I was experimenting with np.min_scalar_type to make sure it worked as
> expected, and found some unexpected results for integers between 2**63 and
> 2**64-1. I would have expected np.min_scalar_type(2**64-1) to return
> uint64. Instead, I get object.
This is probably not the best way to do it, but I think it would work:
You could take two passes through your data, first calculating and storing
the median for each file and the number of elements in each file. From
those data, you can get a lower bound on the 95th percentile of the
combined data.
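A minimal sketch of the first pass described here (per-file median and
element count); the file names and the np.load call are placeholders, since
the poster's data actually live in netCDF files, and the combining step is
cut off in the message above:

import numpy as np

# Placeholder file list; the real data are netCDF files, so reading would
# go through e.g. the netCDF4 or scipy.io.netcdf interfaces instead.
filenames = ["chunk_%03d.npy" % i for i in range(10)]

medians = []
counts = []
for fname in filenames:
    data = np.load(fname)          # only one file in memory at a time
    medians.append(np.median(data))
    counts.append(data.size)

medians = np.asarray(medians)
counts = np.asarray(counts)
# The second step, turning these per-file summaries into a bound on the
# combined 95th percentile, is cut off in the quoted message.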
2012/1/21 Ondřej Čertík
>
>
> Let me know if you figure out something. I think the "mask" thing is
> quite slow, but the problem is that it needs to be there, to catch
> overflows (and it is there in Fortran as well, see the
> "where" statement, which does the same thing). Maybe there is some
>
Hi
On Wed, Jan 25, 2012 at 1:21 AM, Kathleen M Tacina <
kathleen.m.tac...@nasa.gov> wrote:
> I found something similar, with a very simple example.
>
> On 64-bit linux, python 2.7.2, numpy development version:
>
> In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
>
> In [23]: a.mean()
I need some help understanding how to loop through many arrays to calculate
the 95th percentile.
I can easily do this by using numpy.concatenate to make one big array and
then finding the 95th percentile using numpy.percentile but this causes a
memory error when I want to run this on 100's of netcdf files.
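For reference, the approach the question describes looks roughly like this
(the list of arrays is a stand-in for the data read from the netCDF files);
it is the concatenation step that runs out of memory for hundreds of files:

import numpy as np

# Stand-in for the per-file arrays read from the netCDF files.
arrays = [np.random.rand(1024, 1024) for _ in range(5)]

# Concatenate everything into one big array, then take the percentile.
combined = np.concatenate([a.ravel() for a in arrays])
print(np.percentile(combined, 95))
# With hundreds of files the concatenated array no longer fits in memory,
# which is where the MemoryError comes from.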
On Tue, Jan 24, 2012 at 6:32 AM, Søren Gammelmark wrote:
> Dear all,
>
> I was just looking into numpy.einsum and encountered an issue which might
> be worth pointing out in the documentation.
>
> Let us say you wish to evaluate something like this (repeated indices are
> summed)
>
> D[alpha, alphap
On Wed, Jan 25, 2012 at 01:12:06AM +0200, eat wrote:
> Or do the results of calculations depend more on the platform?
Floating point operations often do, sadly (not saying that this is the case
here, but you'd need to try both versions on the same machine [or at least
architecture/bit-width]/sa
I found something similar, with a very simple example.
On 64-bit linux, python 2.7.2, numpy development version:
In [22]: a = 4000*np.ones((1024,1024),dtype=np.float32)
In [23]: a.mean()
Out[23]: 4034.16357421875
In [24]: np.version.full_version
Out[24]: '2.0.0.dev-55472ca'
But, a Windows XP
Hi,
Oddly, numpy 1.6 seems to behave in a more consistent manner:
In []: sys.version
Out[]: '2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]'
In []: np.version.version
Out[]: '1.6.0'
In []: d= np.load('data.npy')
In []: d.dtype
Out[]: dtype('float32')
In []: d.mean()
Out[]: 30
Sorry for the late answer, but at least for the record:
If you are using Eclipse, I assume you have also installed the Eclipse plugin
[pydev](http://pydev.org/). I use it myself; it's good.
Then you have to go to Preferences -> PyDev -> Python Interpreter and select
the Python version you want
I know you wrote that you want "TEXT" files, but nevertheless I'd like to
point to http://code.google.com/p/h5py/ .
There are viewers for HDF5, and it is stable and widely used.
Samuel
On 24.01.2012, at 00:26, Emmanuel Mayssat wrote:
> After having saved data, I need to know/remember the da
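A minimal h5py round trip along the lines Samuel suggests; the file name,
dataset name, and attribute are made up for the example:

import numpy as np
import h5py

data = np.arange(12.0).reshape(3, 4)

# Write: an HDF5 file can hold many named datasets plus metadata.
with h5py.File("example.h5", "w") as f:
    dset = f.create_dataset("measurements", data=data)
    dset.attrs["units"] = "counts"

# Read back.
with h5py.File("example.h5", "r") as f:
    restored = f["measurements"][...]
    print(f["measurements"].attrs["units"], restored.shape)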
I get the same results as you, Kathy.
*surprised*
(On OS X (Lion), 64-bit, numpy 2.0.0.dev-55472ca, Python 2.7.2.)
On 24.01.2012, at 16:29, Kathleen M Tacina wrote:
> I was experimenting with np.min_scalar_type to make sure it worked as
> expected, and found some unexpected results for integers between 2**63 and
> 2**64-1.
On 23.01.2012, at 11:23, David Warde-Farley wrote:
>> a = numpy.array(numpy.random.randint(256,size=(500,972)),dtype='uint8')
>> b = numpy.random.randint(500,size=(4993210,))
>> c = a[b]
>> In [14]: c[100:].sum()
>> Out[14]: 0
Same here.
Python 2.7.2, 64bit, Mac OS X (Lion), 8GB RAM,
On Tue, Jan 24, 2012 at 01:02:44PM -0500, David Warde-Farley wrote:
> On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote:
>
> > Yes - I get exactly the same numbers in 64 bit windows with 1.6.1.
>
> Alright, so that rules out platform specific effects.
>
> I'll try and hunt the bug down when
Course "Python for Scientists and Engineers" in Chicago
===
There will be a comprehensive Python course for scientists and engineers
in Chicago at the end of February / beginning of March 2012. It consists of
a 3-day intro and a 2-day advanced section.
Thank you Bruce and all,
I knew I was doing something wrong (I should have read the mean method
doc more closely). I am of course glad that it's so easily understandable.
But: if the error can get so big, wouldn't it be a better idea for the
accumulator to always be of type 'float64' and then convert later
> You have a million 32-bit floating point numbers that are in the
> thousands. Thus you are exceeding the 32-bit float precision and, if you
> can, you need to increase the precision of the accumulator in np.mean() or
> change the input dtype:
a.mean(dtype=np.float32)  # default, and lacks precision
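Using the simple array from earlier in the thread, the accumulator fix Bruce
describes would look like this; the exact float32 result depends on the
numpy version, which is what the thread is comparing:

import numpy as np

a = 4000 * np.ones((1024, 1024), dtype=np.float32)

print(a.mean())                  # float32 accumulation; reported as ~4034.16 in the thread
print(a.mean(dtype=np.float64))  # float64 accumulator: exactly 4000.0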
Just what Bruce said.
You can run the following to confirm:
np.mean(data - data.mean())
If for some reason you do not want to convert to float64 you can add the
result of the previous line to the "bad" mean:
bad_mean = data.mean()
good_mean = bad_mean + np.mean(data - bad_mean)
Val
On Tue, Jan
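Val's two lines, made runnable on synthetic data (a stand-in for the
data.npy file from the thread, whose values sit around 3000):

import numpy as np

# Synthetic stand-in for data.npy: a million float32 values near 3000.
data = (3000.0 + np.random.rand(1000, 1000)).astype(np.float32)

bad_mean = data.mean()                           # plain float32 accumulation
good_mean = bad_mean + np.mean(data - bad_mean)  # the correction above
print(bad_mean, good_mean, data.astype(np.float64).mean())
# The correction works because the residuals data - bad_mean are small,
# so their float32 sum loses far less precision than the raw sum.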
On Jan 24, 2012, at 1:33 PM, K.-Michael Aye wrote:
> I know I know, that's pretty outrageous to even suggest, but please
> bear with me, I am stumped as you may be:
>
> 2-D data file here:
> http://dl.dropbox.com/u/139035/data.npy
>
> Then:
> In [3]: data.mean()
> Out[3]: 3067.024383998
>
I have confirmed this on a 64-bit linux machine running python 2.7.2
with the development version of numpy. It seems to be related to using
float32 instead of float64. If the array is first converted to a
64-bit float (via astype), mean gives an answer that agrees with your
looped-calculation value.
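The astype route described here, shown on the simple example from the
thread; both lines agree once the reduction is carried out in double
precision:

import numpy as np

a = 4000 * np.ones((1024, 1024), dtype=np.float32)

print(a.astype(np.float64).mean())  # convert first, then reduce: 4000.0
print(a.mean(dtype=np.float64))     # equivalent: only the accumulator is float64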
On 01/24/2012 12:33 PM, K.-Michael Aye wrote:
> I know I know, that's pretty outrageous to even suggest, but please
> bear with me, I am stumped as you may be:
>
> 2-D data file here:
> http://dl.dropbox.com/u/139035/data.npy
>
> Then:
> In [3]: data.mean()
> Out[3]: 3067.024383998
>
> In [4]:
I know I know, that's pretty outrageous to even suggest, but please
bear with me, I am stumped as you may be:
2-D data file here:
http://dl.dropbox.com/u/139035/data.npy
Then:
In [3]: data.mean()
Out[3]: 3067.024383998
In [4]: data.max()
Out[4]: 3052.4343
In [5]: data.shape
Out[5]: (1000,
On Tue, Jan 24, 2012 at 06:37:12PM +0100, Robin wrote:
> Yes - I get exactly the same numbers in 64 bit windows with 1.6.1.
Alright, so that rules out platform specific effects.
I'll try and hunt the bug down when I have some time, if someone more
familiar with the indexing code doesn't beat me
On Tue, Jan 24, 2012 at 6:24 PM, David Warde-Farley
wrote:
> On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote:
>> On 23.01.2012 22:08, Christoph Gohlke wrote:
>> >
>> > Maybe this explains the win-amd64 behavior: There are a couple of places
>> > in mtrand where array indices and siz
On Tue, Jan 24, 2012 at 06:00:05AM +0100, Sturla Molden wrote:
> On 23.01.2012 22:08, Christoph Gohlke wrote:
> >
> > Maybe this explains the win-amd64 behavior: There are a couple of places
> > in mtrand where array indices and sizes are C long instead of npy_intp,
> > for example in the randint
On Tue, Jan 24, 2012 at 09:15:01AM +, Robert Kern wrote:
> On Tue, Jan 24, 2012 at 08:37, Sturla Molden wrote:
> > On 24.01.2012 09:21, Sturla Molden wrote:
> >
> >> randomkit.c handles C long correctly, I think. There are different codes
>> for 32 and 64 bit C long, and buffer sizes are size_t.
I filed a ticket (#1590).
Thank you for the verification.
Nadav.
From: numpy-discussion-boun...@scipy.org [numpy-discussion-boun...@scipy.org]
On Behalf Of Pierre Haessig [pierre.haes...@crans.org]
Sent: 24 January 2012 16:01
To: numpy-discussion@scip
I was experimenting with np.min_scalar_type to make sure it worked as
expected, and found some unexpected results for integers between 2**63
and 2**64-1. I would have expected np.min_scalar_type(2**64-1) to
return uint64. Instead, I get object. Further experimenting showed
that the largest integ
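A short script exercising the boundary values discussed in the post; the
expected-versus-observed behaviour described above is for the development
version at the time, so the output may differ on other versions:

import numpy as np

# Boundary values around the unsigned 64-bit range.
for value in (2**63 - 1, 2**63, 2**64 - 1, 2**64):
    print(value, np.min_scalar_type(value))

# The post expects uint64 for values in [2**63, 2**64 - 1] and object
# only from 2**64 upward; the reported behaviour was object already
# below 2**64.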
Dear all,
I was just looking into numpy.einsum and encountered an issue which might
be worth pointing out in the documentation.
Let us say you wish to evaluate something like this (repeated indices are
summed)
D[alpha, alphaprime] = A[alpha, beta, sigma] * B[alphaprime, betaprime,
sigma] * C[beta,
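A runnable version of that contraction, checked against explicit loops; the
shape of C is an assumption (the quoted expression is cut off after
"C[beta,"), taken here as C[beta, betaprime]:

import numpy as np

na, nb, ns, nap, nbp = 3, 4, 5, 3, 4      # arbitrary index sizes
A = np.random.rand(na, nb, ns)            # A[alpha, beta, sigma]
B = np.random.rand(nap, nbp, ns)          # B[alphaprime, betaprime, sigma]
C = np.random.rand(nb, nbp)               # assumed: C[beta, betaprime]

# D[alpha, alphaprime], summing over beta, betaprime and sigma.
D = np.einsum('abs,cds,bd->ac', A, B, C)

# Brute-force check of the same contraction.
D_ref = np.zeros((na, nap))
for a in range(na):
    for c in range(nap):
        for b in range(nb):
            for d in range(nbp):
                for s in range(ns):
                    D_ref[a, c] += A[a, b, s] * B[c, d, s] * C[b, d]
print(np.allclose(D, D_ref))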
On 22/01/2012 11:28, Nadav Horesh wrote:
> >>> special.erf(26.5)
> 1.0
> >>> special.erf(26.6)
> Traceback (most recent call last):
> File "", line 1, in
> special.erf(26.6)
> FloatingPointError: underflow encountered in erf
> >>> special.erf(26.7)
> 1.0
>
I can confirm this same behavior.
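The FloatingPointError in the quoted session suggests floating-point errors
have been set to raise; if one only wants to evaluate erf over that range,
the underflow check can be relaxed locally. This is a workaround sketch
under that assumption, not the resolution discussed in the thread:

import numpy as np
from scipy import special

np.seterr(under='raise')            # reproduce the reported error setting (assumed)

x = np.array([26.5, 26.6, 26.7])
with np.errstate(under='ignore'):   # relax only the underflow check, only here
    print(special.erf(x))           # all three values are 1.0 to double precision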
On 24.01.2012 10:15, Robert Kern wrote:
> There are two different uses of long that you need to distinguish. One
> is for sizes, and one is for parameters and values. The sizes should
> definitely be upgraded to npy_intp. The latter shouldn't; these should
> remain as the default integer type of P
On Tue, Jan 24, 2012 at 09:19, Sturla Molden wrote:
> On 24.01.2012 10:16, Robert Kern wrote:
>
>> I'm sorry, what are you demonstrating there?
>
> Both npy_intp and C long are used for sizes and indexing.
Ah, yes. I think Travis added the multiiter code to cont1_array(),
which does broadcasting,
On 24.01.2012 10:16, Robert Kern wrote:
> I'm sorry, what are you demonstrating there?
Both npy_intp and C long are used for sizes and indexing.
Sturla
On Tue, Jan 24, 2012 at 08:47, Sturla Molden wrote:
> The coding is also inconsistent, compare for example:
>
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L180
>
> https://github.com/numpy/numpy/blob/master/numpy/random/mtrand/mtrand.pyx#L201
I'm sorry, what are you demonstrating there?
On Tue, Jan 24, 2012 at 08:37, Sturla Molden wrote:
> On 24.01.2012 09:21, Sturla Molden wrote:
>
>> randomkit.c handles C long correctly, I think. There are different codes
>> for 32 and 64 bit C long, and buffer sizes are size_t.
>
>> distributions.c takes C longs as parameters, e.g. for the binomial
>> distribution.
On 24.01.2012 06:32, Sturla Molden wrote:
> On 24.01.2012 06:00, Sturla Molden wrote:
>> Both i and length could overflow here. It should overflow on
>> allocation of more than 2 GB. There are also a lot of C longs in the
>> internal state (lines 55-105), as well as in the other functions.
>
> The use
On 24.01.2012 09:21, Sturla Molden wrote:
> randomkit.c handles C long correctly, I think. There are different codes
> for 32 and 64 bit C long, and buffer sizes are size_t.
distributions.c takes C longs as parameters, e.g. for the binomial
distribution. mtrand.pyx correctly handles this, but it c
On 24.01.2012 06:32, Sturla Molden wrote:
> The use of C long affects all the C and Pyrex source code in mtrand
> module, not just mtrand.pyx. All of it is fubar on Win64.
randomkit.c handles C long correctly, I think. There are different codes
for 32 and 64 bit C long, and buffer sizes are size_t.