[Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Hi,

I have encountered a worrying problem during the migration of software from
numarray to numpy; perhaps someone could help me determine how it could be
addressed.

I have a large array of values 1 long, 12 items per line. The matrix
contains floats, dtype=float32 in numpy and type=Float32 in numarray.

When I take the mean of one of the columns, we observe a discrepancy in
the output values.

numarray:
>>> port_result.agg_matrix._array[::,2].mean()
193955925.49500328

numpy:

>>> port_result.agg_matrix._array[::,2].mean()
193954536.48896

If we examine a specific line in the matrix the arrays appear identical:

numarray:
>>> port_result.agg_matrix[0]
array([  2.11549568e+08,   4.03735232e+08,   8.47466400e+07,
 3.99625600e+07,   7.99853550e+06,   6.68247100e+06,
 0.e+00,   1.00018790e+07,   3.43065200e+07,
 1.75261240e+07,   4.89246450e+06,   2.06562788e+06], type=Float32)

numpy:
>>> port_result.agg_matrix[0]
array([  2.11549568e+08,   4.03735232e+08,   8.47466400e+07,
 3.99625600e+07,   7.99853550e+06,   6.68247100e+06,
 0.e+00,   1.00018790e+07,   3.43065200e+07,
 1.75261240e+07,   4.89246450e+06,   2.06562788e+06], dtype=float32)

However, when examining a specific item, numpy appears to print the value to
8 significant figures regardless of the true value, whereas numarray printed
the full value. If I cast the item to a float the full value is present, so
it is stored but not displayed. Could this explain the difference in the
mean values? How can I get numpy to always display the exact value in the
array, so that it behaves in the same manner as numarray?

numarray:
>>> port_result.agg_matrix[0][4]
7998535.5
>>> port_result.agg_matrix[0][11]
2065627.875

numpy:
>>> port_result.agg_matrix[0][4]
7998535.5
>>> port_result.agg_matrix[0][11]
2065627.9
>>> float(port_result.agg_matrix[0][4])
7998535.5
>>> float(port_result.agg_matrix[0][11])
2065627.875

I appreciate any help anyone can give, thank you.

Hanni
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Matthieu Brucher
Hi,

I can't help you with the first issue, but the display has nothing to
do with the quality of the computation. Numpy only prints part of a
float value, but for the computations it obviously uses the full
value. All this can be configured with set_printoptions().
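
To illustrate the point about display versus storage, here is a minimal
sketch. It assumes a reasonably recent NumPy (the np.printoptions context
manager is not available in the 2008-era release discussed here) and reuses
two values from the original post; the variable names are only illustrative.

```python
import numpy as np

# Two of the float32 values shown in the original post.
row = np.array([7998535.5, 2065627.875], dtype=np.float32)

# Printing with fewer digits changes only the text that is displayed.
with np.printoptions(precision=3):
    shown = str(row)

# The stored float32 values are untouched and convert exactly to Python floats.
stored = [float(v) for v in row]

print(shown)
print(stored)
```

Whatever precision is chosen for printing, the values used in computations
stay the same.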

Matthieu


-- 
French PhD student
Information System Engineer
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher


Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Hi Matthieu,

I thought as much, regarding the computations, but was just presenting the
information.

Thanks for the set_printoptions but it doesn't seem to apply when accessing
a specific item:

>>> numpy.set_printoptions(precision=12)
>>> port_result.agg_matrix[0]
array([  2.11549568e+08,   4.03735232e+08,   8.47466400e+07,
 3.99625600e+07,   7.99853550e+06,   6.68247100e+06,
 0.e+00,   1.00018790e+07,   3.43065200e+07,
 1.75261240e+07,   4.89246450e+06,   2.065627875000e+06],
dtype=float32)
>>> port_result.agg_matrix[0][11]
2065627.9

No change in the value output for a specific item in the matrix. Am I
missing something?
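
As far as I know, set_printoptions() governs how arrays are printed, while a
single item is a NumPy scalar with its own string conversion. A minimal
sketch of two ways to see every digit of one item, assuming only standard
NumPy and Python behaviour (the variable names are illustrative):

```python
import numpy as np

# Same value as agg_matrix[0][11] in the session above.
item = np.float32(2065627.875)

# Converting to a Python float widens float32 to float64 without changing the value.
as_float = float(item)

# Explicit formatting gives full control over the digits that are shown.
fixed = format(as_float, ".3f")

print(as_float)
print(fixed)
```

Either route only changes the text that is shown; the number in the array is
the same float32 value throughout.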

Hanni




Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread David Cournapeau
Hanni Ali wrote:
> Hi Matthieu,
>
> I thought as much, regarding the computations, but was just presenting
> the information.

Is your matrix available somewhere so that we can reproduce the problem?
Offhand, I can't see any explanation, but I am not familiar with
numarray, so maybe I am missing something obvious.

cheers,

David


Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Sebastian Stephan Berg
Hi,

just guessing here, but numarray seems to calculate the result in a
bigger datatype, while numpy uses float32, which is the input array's type
(at least I thought so, trying it confused me right now ...). In any
case, maybe the difference will be gone if you
use .mean(dtype='float64') (or whatever dtype numarray actually uses,
which seems to be "numarray.MaximumType(a.type())" where a is the array
whose mean is taken).
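
The effect of the accumulator type can be sketched as follows. This is an
illustration under stated assumptions, not a reproduction of the numpy 1.x
internals of the time: cumsum is used here only because it adds one element
at a time in float32, and the array size n is an arbitrary choice.

```python
import numpy as np

n = 1_000_000
value = np.float32(0.41582015156745911)
a = np.full(n, value, dtype=np.float32)

# Strictly sequential accumulation in float32: each partial sum is rounded
# to float32 before the next element is added.
naive32 = float(np.cumsum(a, dtype=np.float32)[-1]) / n

# Same data, but accumulated in float64.
mean64 = float(a.mean(dtype=np.float64))

err32 = abs(naive32 - float(value))
err64 = abs(mean64 - float(value))
print(err32, err64)
```

Once the running sum is large, each small addend is rounded to the coarse
spacing of float32 at that magnitude, so the error builds up. As far as I
know, recent NumPy releases reduce this for plain a.mean() by summing
pairwise, but passing dtype='float64' remains the direct way to get a double
accumulator.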

Sebastian



Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
I'm afraid the matrix is not available anywhere and I would not be able to
make it available.

However I can demonstrate by simply generating a random number and filling a
10x10 matrix with it.

I generated a random number in numpy and used that to do the same exercise
in numarray.

In numpy:

>>> matrix = numpy.zeros((10,10),'f')
>>> matrix
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], dtype=float32)
>>> number = numpy.random.random_sample()
>>> number
0.41582016134572475
>>> matrix.setfield(number,'f',0)
>>> matrix
array([[ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567],
   [ 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567,  0.415820151567,  0.415820151567,
 0.415820151567,  0.415820151567]], dtype=float32)
>>> matrix.mean()
0.41582069396972654

In numarray:
>>> matrix = numarray.zeros((10,10),'f')
>>> matrix
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]], type=Float32)
>>> number = 0.41582015156745911
>>> for i, item in enumerate(matrix):
... matrix[i] = number
...
>>> matrix
array([[ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015,  0.41582015,  0.41582015,
 0.41582015,  0.41582015],
   [ 0.41582015,  0.41582015,  0.41582015,  0.41582015

Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Sebastian you legend, that seems to be it.

Thank you very much.

>>> matrix.mean(dtype='float64')
0.41582015156745911

What seems odd is that numpy doesn't do this on its own...





Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Matthieu Brucher
It should never do black magic without telling you.
People are concerned about memory consumption, so if you use more memory
than you expect, you can encounter bugs. Least surprise is always
better ;)

Matthieu



Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Also, can you think of a way to make the dtype always float64? I have a lot of
functions, and adding dtype='float64' to each would require *loads* of testing,
whereas if I could set it centrally on the matrix or in the environment that
would be so much easier.
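
I am not aware of a global NumPy setting that changes the dtype used by
reductions, so one option is to convert the data once, at the point where
the matrix is created or loaded. A minimal sketch; the helper name
as_float64 and the 1000x12 test array are hypothetical and only for
illustration:

```python
import numpy as np

def as_float64(array_like):
    """Hypothetical helper: return the data as a float64 array."""
    return np.asarray(array_like, dtype=np.float64)

# Stand-in for a float32 matrix with 12 items per line.
stored = np.full((1000, 12), 0.41582015156745911, dtype=np.float32)

# Convert once; every later reduction then works in float64 by default.
working = as_float64(stored)
col_mean = working[:, 2].mean()
print(col_mean)
```

The cost is twice the memory for the converted array, in exchange for not
touching each individual call site.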

Hanni




Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Understood, but I would generally be more concerned with accuracy than
memory?




Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread David Cournapeau
Hanni Ali wrote:
> I'm afraid the matrix is not available anywhere and I would not be
> able to make it available.
>

Forget it, Sebastian is right. I was confused by the size of the error,
but the relative spacing between two adjacent floating point numbers is
indeed about 1e-7 for float on most runtimes (FLT_EPSILON in C, which is
the gap between 1. and the next representable float).
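
The same quantity can be inspected from NumPy. A small sketch, assuming IEEE
single precision with the default round-to-nearest mode (the variable names
are illustrative):

```python
import numpy as np

eps32 = np.finfo(np.float32).eps  # machine epsilon of float32, 2**-23
one = np.float32(1.0)

# Adding eps to 1 gives the next representable float32.
changes_one = (one + eps32) != one

# Half of eps is rounded away again, so the sum is still exactly 1.
half_is_lost = (one + np.float32(eps32 / 2)) == one

print(float(eps32), bool(changes_one), bool(half_is_lost))
```

A relative spacing of roughly 1.2e-7 means float32 carries about 7
significant decimal digits, which matches the size of the discrepancy seen
in the 10x10 example.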

I am not sure whether we should make the accumulator a double in
the float case by default; generally, you use float to save memory
and computing time, and you lose quite a bit speed-wise by using a
double accumulator. FWIW, matlab behaves the same way (not that it is a
justification by itself, but at least it should not surprise matlab users).

cheers,

David


Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Matthieu Brucher
By default, numpy uses float64, but you told it to use float32 ;)

Matthieu



Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread David Cournapeau
Hanni Ali wrote:
> Understood, but I would generally be more concerned with accuracy than
> memory?

It is a trade-off: you can choose accuracy if you want, but by using
float32, you are already kind of hinting that you care about memory and
speed (otherwise, why not use double, which is the default dtype for
almost everything in numpy?)
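
For reference, a short sketch of the default versus the explicitly requested
single-precision dtype, using the same 'f' type code that appears in the
10x10 example earlier in the thread (variable names are illustrative):

```python
import numpy as np

default = np.zeros((10, 10))      # no dtype given: float64
single = np.zeros((10, 10), "f")  # 'f' requests float32

print(default.dtype, default.nbytes)
print(single.dtype, single.nbytes)
```

The float32 array takes half the memory, which is the saving being traded
against accuracy here.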

cheers,

David


Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
Oh ok, I shall have to find where I did that then. Thanks



Re: [Numpy-discussion] Floating Point Difference between numpy and numarray

2008-09-03 Thread Hanni Ali
We used to care about memory when we were running on 32-bit platforms, but
with the move to 64-bit, enabled by the current work, the issue is removed
and I will probably be changing everything for more accuracy.

Thanks

Hanni




Re: [Numpy-discussion] distance matrix and (weighted) p-norm

2008-09-03 Thread Emanuele Olivetti
David Cournapeau wrote:
> Emanuele Olivetti wrote:
>> Hi,
>>
>> I'm trying to compute the distance matrix (weighted p-norm [*])
>> between two sets of vectors (data1 and data2). Example:
>>   
>
> You may want to look at scipy.cluster.distance, which has a bunch of
> distance matrix implementation. I believe most of them have optional
> compiled version, for fast execution.

Thanks for the pointer but the distance subpackage in cluster is about
the distance matrix of vectors in one set of vectors. So the resulting
matrix is symmetric. I need to compute distances between two
different sets of vectors (i.e. a non-symmetric distance matrix).
It is not clear to me how to use it in my case.

Then cluster.distance offers:
1) slow python double for loop for computing each entry of the matrix
2) or fast C implementation (numpy/cluster/distance/src/distance.c).

I guess I need to extend distance.c, then work on the wrapper and then
on distance.py. But after that it would be meaningless to have those
distances under 'cluster', since clustering doesn't need distances between
two sets of vectors.

In my original post I was looking for a fast python/numpy implementation
for my code. In special cases (like p==2, i.e. standard weighted euclidean
distance) there is a superfast implementation (e.g., see "Fastest distance
matrix calc" 2007 thread). But I'm not able to find something similar
for the general case.
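
For the general case, a broadcasting-based version can be sketched as
follows. The definition of the weighted p-norm is not quoted in this
message, so the formula d(x, y) = (sum_i w_i * |x_i - y_i|**p)**(1/p) is an
assumption, and the function name and sample data are only illustrative.

```python
import numpy as np

def weighted_pnorm_distances(data1, data2, weights, p):
    """Distances between every row of data1 and every row of data2.

    Assumed definition: d(x, y) = (sum_i w_i * |x_i - y_i|**p) ** (1/p).
    Returns an array of shape (len(data1), len(data2)).
    """
    data1 = np.asarray(data1, dtype=np.float64)
    data2 = np.asarray(data2, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    # Shape (n1, 1, d) minus shape (1, n2, d) broadcasts to (n1, n2, d).
    diff = np.abs(data1[:, np.newaxis, :] - data2[np.newaxis, :, :])
    return np.sum(weights * diff ** p, axis=-1) ** (1.0 / p)

data1 = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
data2 = np.array([[3.0, 4.0], [1.0, 1.0]])
dist = weighted_pnorm_distances(data1, data2, weights=[1.0, 1.0], p=2)
print(dist)
```

This avoids the Python double loop at the price of an intermediate array of
shape (n1, n2, d), so for large sets it may need to be applied in chunks.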

Any other suggestions on how to speed up my example?

Thanks,

Emanuele



Re: [Numpy-discussion] distance matrix and (weighted) p-norm

2008-09-03 Thread David Cournapeau
Emanuele Olivetti wrote:
>
> Thanks for the pointer but the distance subpackage in cluster is about
> the distance matrix of vectors in one set of vectors. So the resulting
> matrix is symmetric. I need to compute distances between two
> different sets of vectors (i.e. a non-symmetric distance matrix).
> It is not clear to me how to use it in my case.
>   

You may need to extend the code, indeed (although I am more or less
responsible for scipy.cluster these days, I have not looked carefully at
all the code in distance yet).

> Then cluster.distance offers:
> 1) slow python double for loop for computing each entry of the matrix
> 2) or fast C implementation (numpy/cluster/distance/src/distance.c).
>
> I guess I need to extend distance.c, then work on the wrapper and then
> on distance.py. But after that it would be meaningless to have those
> distances under 'cluster', since clustering doesn't need distances between
> two sets of vectors.
>   

FWIW, distance is slated to move to a separate package, because distance
computation is useful in other contexts than clustering.

cheers,

David


Re: [Numpy-discussion] distance matrix and (weighted) p-norm

2008-09-03 Thread Emanuele Olivetti
David Cournapeau wrote:
> FWIW, distance is slated to move to a separate package, because distance
> computation is useful in other contexts than clustering.
>
>   

Excellent. I was thinking about something similar. I'll have a look
at the separate package. Please drop an email to this list when
distance is moved.

Thanks,

Emanuele



Re: [Numpy-discussion] numpy 1.1.1 fails because of missing md5

2008-09-03 Thread Charles Doutriaux

Hi Robert,

The first email got intercepted because the attachment was too big
(it awaits moderator approval), so I compressed the log and am resending
this email.

I'm attaching my Python build log; can you spot anything? It "seems"
like md5 is built: I get a very similar log on my machine, and I have a
working import md5.

I'm not sure what's going on. Usually the build is fine.

C.


Robert Kern wrote:
> On Tue, Sep 2, 2008 at 16:40, Charles Doutriaux <[EMAIL PROTECTED]> wrote:
>> Joseph,
>>
>> Ok all failed because numpy couldn't build... It's looking for md5
>
> That's part of the standard library. Please check your Python installation.

Python.LOG.bz2
Description: application/bzip


Re: [Numpy-discussion] 1.2.0rc1 tagged!

2008-09-03 Thread Alan G Isaac
So the two formatting tests fail, as David warned.
But they are known to fail on Windows, and there
is no msg to that effect.  Might one be added?
Alan Isaac

Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
 >>> import numpy
 >>> numpy.test()
Running unit tests for numpy
NumPy version 1.2.0rc1
NumPy is installed in C:\Python25\lib\site-packages\numpy
Python version 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]
nose version 0.10.1



...FF...

...S
Ignoring "Python was built with Visual Studio 2003;
extensions must be built with a compiler than can generate compatible binaries.
Visual Studio 2003 was not found on this system. If you have Cygwin installed,
you can try compiling with MingW32, by passing "-c mingw32" to setup.py." (one should fix me in fcompiler/compaq.py)

.
==
FAIL: Check formatting.
--
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\core\tests\test_print.py", line 28, in test_complex_types
    assert_equal(str(t(x)), str(complex(x)))
  File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
  ACTUAL: '(0+5.9287877500949585e-323j)'
  DESIRED: '(1+0j)'

==
FAIL: Check formatting.
--
Traceback (most recent call last):
  File "C:\Python25\Lib\site-packages\numpy\core\tests\test_print.py", line 16, in test_float_types
    assert_equal(str(t(x)), str(float(x)))
  File "C:\Python25\Lib\site-packages\numpy\testing\utils.py", line 183, in assert_equal
    raise AssertionError(msg)
AssertionError:
Items are not equal:
  ACTUAL: '0.0'
  DESIRED: '1.0'

==
SKIP: test_umath.TestComplexFunctions.test_branch_cuts_failing
--
Traceback (most recent call last):
  File "C:\Python25\lib\site-packages\nose\case.py", line 203, in runTest
    self.test(*self.arg)
  File "C:\Python25\Lib\site-packages\numpy\testing\decorators.py", line 93, in skipper
    raise nose.SkipTest, 'This test is known to fail'
SkipTest: This test is known to fail

--
Ran 1573 tests in 18.286s

FAILED (failures=2)

 >>>










Re: [Numpy-discussion] 1.2.0rc1 tagged!

2008-09-03 Thread Jarrod Millman
On Wed, Sep 3, 2008 at 10:06 AM, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> So the two formatting tests fail, as David warned.
> But they are known to fail on Windows, and there
> is no msg to that effect.  Might one be added?

Absolutely, we will make sure to add a message to that effect and
possibly point to a bug ticket.  Thanks for the suggestion.
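
One way such a message could be attached is sketched below. It uses the standard library's unittest.skipIf rather than the decorators numpy's own test suite uses, and both the message text and the test body are illustrative stand-ins (a plain Python float takes the place of the numpy scalar type t that the real test loops over).

```python
import sys
import unittest

# Hypothetical wording; a real message might also cite a bug ticket.
KNOWN_WINDOWS_FAILURE = "formatting test is known to fail on Windows"


class TestFloatFormatting(unittest.TestCase):
    @unittest.skipIf(sys.platform == "win32", KNOWN_WINDOWS_FAILURE)
    def test_float_str(self):
        # Stand-in for assert_equal(str(t(x)), str(float(x))).
        x = 1
        self.assertEqual(str(float(x)), "1.0")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestFloatFormatting)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

On Windows the test is then reported as skipped, with the reason shown, instead of as a bare failure; elsewhere it runs normally.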

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/


Re: [Numpy-discussion] numpy 1.1.1 fails because of missing md5

2008-09-03 Thread Robert Kern
On Wed, Sep 3, 2008 at 10:39, Charles Doutriaux <[EMAIL PROTECTED]> wrote:
> Hi Robert,
>
> The first email got intercepted because the attachment was too big (awaits
> moderator), so i compressed the log and I resend this email.
>
> I'm attaching my Python build log, can you spot anything? It "seems" like
> md5 is built, i get a very similar log on my machine and i have a working
> import md5.

md5.py gets installed, but it just (eventually) imports from one of
the extension modules _md5 or _hashlib, neither of which is getting
built. The errors following the line

  building '_hashlib' extension

are relevant. OpenSSL gets used for its hash function implementations
if it is available. The configuration thinks you want it to use
OpenSSL, so it tries to build _hashlib, which fails. If the
configuration thought you didn't want to use OpenSSL, it would try to
build _md5.
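
A quick way to see which of the two extension modules a given interpreter ended up with is to try importing each one directly. This is a minimal sketch assuming a reasonably current Python; probe_hash_backends is a hypothetical helper name, not part of any library.

```python
def probe_hash_backends(names=("_hashlib", "_md5")):
    """Try to import each named extension module and report the outcome."""
    status = {}
    for name in names:
        try:
            __import__(name)
            status[name] = "ok"
        except ImportError as exc:
            status[name] = "missing (%s)" % exc
    return status


status = probe_hash_backends()
for name in sorted(status):
    print("%-9s %s" % (name, status[name]))
```

If both lines report "missing", the situation matches the one described above: md5.py is installed but has nothing to import from.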

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


Re: [Numpy-discussion] at my wits end over an error message...

2008-09-03 Thread Robert Kern
On Sat, Aug 30, 2008 at 22:10, Zachary Pincus <[EMAIL PROTECTED]> wrote:
> Hi Alan,
>
>> Traceback (most recent call last):
>>   File "/usr/local/lib/python2.5/site-packages/enthought.traits-2.0.4-py2.5-linux-i686.egg/enthought/traits/trait_notifiers.py", line 325, in call_1
>>     self.handler( object )
>>   File "TrimMapl_1.py", line 98, in _Run_fired
>>     outdata = np.array(outdata, dtype=dtypes)
>> TypeError: expected a readable buffer object
>
> This would make it appear that the problem is not with numpy per se,
> but with the traits library, or how you're using it... I'm not too
> familiar with traits, so I can't really provide any advice there.

Probably not. While Traits triggers the execution of the failing code,
the last line is pure numpy, and the error message comes from
numpy.array(). The next step is what one normally does to debug a
TypeError raised by a function: find out the type and value of the inputs
being given to the function, either with print statements or a debugger.
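
The print-statement route might look like the sketch below. describe_inputs is a hypothetical helper, and the sample outdata and dtypes are made-up stand-ins for the variables named in the traceback; the thread does not show their real contents, so nothing here identifies the actual cause.

```python
def describe_inputs(outdata, dtypes, max_rows=3):
    """Return lines describing the container, its first rows, and the dtype spec."""
    lines = ["outdata: %s with %d rows" % (type(outdata).__name__, len(outdata))]
    for i, row in enumerate(outdata[:max_rows]):
        fields = ", ".join(type(item).__name__ for item in row)
        lines.append("  row %d: %s of (%s)" % (i, type(row).__name__, fields))
    lines.append("dtypes: %r" % (dtypes,))
    return lines


# Made-up stand-ins for the variables in the failing line.
outdata = [[1, 2.5, "a"], [2, 3.5, "b"]]
dtypes = [("id", "i4"), ("value", "f8"), ("label", "S1")]

for line in describe_inputs(outdata, dtypes):
    print(line)
# The failing call would follow here:
#     outdata = np.array(outdata, dtype=dtypes)
```

Printing this immediately before the np.array() call shows whether the rows are lists or tuples and what type each field has, which is the information a TypeError from that call usually turns on.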

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


Re: [Numpy-discussion] numpy 1.1.1 fails because of missing md5

2008-09-03 Thread Charles Doutriaux
Thanks for spotting the origin. I'll pass this along to our user; maybe
they'll be able to figure out how to build Python without OpenSSL.

C.

Robert Kern wrote:
> On Wed, Sep 3, 2008 at 10:39, Charles Doutriaux <[EMAIL PROTECTED]> wrote:
>   
>> Hi Robert,
>>
>> The first email got intercepted because the attachment was too big (awaits
>> moderator), so i compressed the log and I resend this email.
>>
>> I'm attaching my Python build log, can you spot anything? It "seems" like
>> md5 is built, i get a very similar log on my machine and i have a working
>> import md5.
>> 
>
> md5.py gets installed, but it just (eventually) imports from one of
> the extension modules _md5 or _hashlib, neither of which is getting
> built. The errors following the line
>
>   building '_hashlib' extension
>
> are relevant. OpenSSL gets used for its hash function implementations
> if it is available. The configuration thinks you want it to use
> OpenSSL, so it tries to build _hashlib, which fails. If the
> configuration thought you didn't want to use OpenSSL, it would try to
> build _md5.
>
>   
