On 8/4/11 4:46 AM, Kiko wrote:
Hi, all.

Thank you very much for your replies.

I am running into some issues when I use the netcdf4-python or scipy.io.netcdf libraries:

In [4]: import netCDF4 as n4
In [5]: from scipy.io import netcdf as nS
In [6]: import numpy as np
In [7]: gebco4 = n4.Dataset('GridOne.grd', 'r')
In [8]: gebcoS = nS.netcdf_file('GridOne.grd', 'r')

Now, if I do:

In [9]: z4 = gebco4.variables['z']

I get no problems, and I have:

In [14]: type(z4); z4.shape; z4.size
Out[14]: <type 'netCDF4.Variable'>
Out[14]: (233312401,)
Out[14]: 233312401

But if I do:

In [15]: z4 = gebco4.variables['z'][:]
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
  File "netCDF4.pyx", line 2466, in netCDF4.Variable.__getitem__ (netCDF4.c:22943)
  File "C:\Python26\lib\site-packages\netCDF4_utils.py", line 278, in _StartCountStride
    n = len(range(beg,end,inc))
MemoryError

I got a memory error.


Kiko: I think the difference may be that when you read the data with netcdf4-python, it tries to unpack the short integers to a float32 array, thereby using much more memory (more than you have available). scipy.io.netcdf is just returning you a numpy array of short integers. I bet if you do

gebco4.variables['z'].set_auto_maskandscale(False)

before reading the data from the gebco4 variable, it will work, since this turns off the auto conversion to float32.
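As a rough check on the memory argument above, the sizes can be worked out from the length of z reported by ncdump -h (xysize = 233312401). This is only back-of-the-envelope arithmetic; the helper name array_bytes is made up for illustration, and the real peak usage during unpacking may be higher because of temporaries.

```python
# Rough memory estimate for the 'z' variable, using xysize from the ncdump header.
XYSIZE = 233312401

# Bytes per element for the dtypes under discussion.
ITEMSIZE = {"int16 (short)": 2, "float32": 4, "float64": 8}


def array_bytes(n_elements, itemsize):
    """Bytes needed for a contiguous array of n_elements items (illustrative helper)."""
    return n_elements * itemsize


for name, size in ITEMSIZE.items():
    nbytes = array_bytes(XYSIZE, size)
    print("%-14s %12d bytes  (about %.0f MiB)" % (name, nbytes, nbytes / 2.0 ** 20))
```

The float32 copy alone needs twice the space of the packed shorts, and both may have to exist at the same time while the conversion runs.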

You'll have to do the conversion manually then, at which point you may run out of memory anyway.
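One way to do the manual conversion without holding a full float copy is to process the variable in blocks. The following is a minimal sketch under stated assumptions: unpack_in_chunks is a hypothetical helper (not part of netCDF4 or scipy), the small int16 array stands in for gebco4.variables['z'], and the scale and offset values are the ones shown in the ncdump header.

```python
import numpy as np


def unpack_in_chunks(var, scale_factor, add_offset, chunk=1_000_000,
                     out_dtype=np.float32):
    """Yield (start, block) pairs, where block is one scaled slice of var.

    var only needs a .shape attribute and 1-D slicing, so a file-backed
    variable or a plain ndarray both work. Hypothetical helper.
    """
    n = var.shape[0]
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        raw = np.asarray(var[start:stop])  # only this slice is read
        yield start, raw.astype(out_dtype) * scale_factor + add_offset


# Stand-in for the packed short data; the real variable would come from the file.
packed = np.arange(-5, 5, dtype=np.int16)

# Example reduction: running min/max without building the full float array.
zmin, zmax = np.inf, -np.inf
for start, block in unpack_in_chunks(packed, 1.0, 0.0, chunk=4):
    zmin = min(zmin, float(block.min()))
    zmax = max(zmax, float(block.max()))

print(zmin, zmax)
```

Only one block of floats is alive at a time, so the peak memory is governed by the chunk size rather than by xysize.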

But if I select a smaller array, I get:

In [16]: z4 = gebco4.variables['z'][:10000000]
In [17]: type(z4); z4.shape; z4.size
Out[17]: <type 'numpy.ndarray'>
Out[17]: (10000000,)
Out[17]: 10000000

What's the difference between z4 as a netCDF4.Variable and as a numpy.ndarray?

The netcdf variable object just refers to the data in the file - only when you slice the object is the data read in and converted to a numpy array.
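The distinction can be illustrated with a toy file-backed object. This is only a sketch: LazyVariable is a hypothetical class, not how netCDF4 is implemented, and it reads raw int16 values from a flat binary file rather than from a netCDF file.

```python
import os
import tempfile

import numpy as np


class LazyVariable:
    """Toy stand-in for a file-backed variable (hypothetical, for illustration)."""

    def __init__(self, path, dtype, size):
        self.path = path
        self.dtype = np.dtype(dtype)
        self.shape = (size,)  # known without reading any data

    def __getitem__(self, key):
        # This sketch only supports contiguous 1-D slices such as var[:10].
        start, stop, step = key.indices(self.shape[0])
        if step != 1:
            raise ValueError("this sketch only handles step == 1")
        count = max(stop - start, 0)
        with open(self.path, "rb") as f:
            f.seek(start * self.dtype.itemsize)
            buf = f.read(count * self.dtype.itemsize)
        # The data only becomes an in-memory ndarray at this point.
        return np.frombuffer(buf, dtype=self.dtype).copy()


# Write 1000 shorts to a temporary file to play the role of the data on disk.
fd, path = tempfile.mkstemp(suffix=".bin")
os.close(fd)
np.arange(1000, dtype=np.int16).tofile(path)

var = LazyVariable(path, np.int16, 1000)  # cheap: nothing read yet
part = var[:10]                           # reads 10 values into an ndarray
print(type(var).__name__, var.shape)
print(type(part).__name__, part.shape)

os.remove(path)
```

Asking for var.shape costs nothing, while var[:] would try to allocate the whole array, which is where a MemoryError can appear.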

-Jeff

Now, if I use scipy.io.netcdf:

In [18]: zS = gebcoS.variables['z']
In [20]: type(zS); zS.shape
Out[20]: <class 'scipy.io.netcdf.netcdf_variable'>
Out[20]: (233312401,)

In [21]: zS = gebcoS.variables['z'][:]
In [22]: type(zS); zS.shape
Out[22]: <type 'numpy.ndarray'>
Out[22]: (233312401,)

What's the difference between zS as a scipy.io.netcdf.netcdf_variable and as a numpy.ndarray?
Why do I not get a MemoryError with scipy.io.netcdf?

Finally, if I do the following (maybe it's a silly thing to do), using Eric's suggestions to clear the cache:

In [32]: zS = gebcoS.variables['z']
In [38]: timeit -n1 -r1 zSS = np.array(zS[:100000000]) # 100.000.000 out of 233.312.401 because I've got a MemoryError
1 loops, best of 1: 73.1 s per loop

(If I use a copy, timeit -n1 -r1 zSS = np.array(zS[:100000000], copy=True), I get a MemoryError and I have to set the size to 50.000.000 but it's quite fast).
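For reference, np.array copies its input by default, so passing copy=True explicitly should not change what is allocated; np.asarray is the call that avoids a copy when the input is already an ndarray of the right dtype. A small sketch with a stand-in array (not the GEBCO data) shows this:

```python
import numpy as np

raw = np.arange(10, dtype=np.int16)  # stand-in for a slice such as zS[:n]

copied = np.array(raw)               # copy=True is the default
explicit = np.array(raw, copy=True)  # same behaviour, spelled out
viewed = np.asarray(raw)             # no copy: dtype already matches

print(np.shares_memory(raw, copied))
print(np.shares_memory(raw, explicit))
print(np.shares_memory(raw, viewed))
```

If the two timeit runs behave differently, the cause is therefore more likely to be something else (for example, what was already cached or allocated in the session) than the copy flag itself; that is a guess, not something the output above establishes.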

Thank you very much for your replies, and excuse me if some questions are very basic.

Best regards.

***********************************************************************
The results of ncdump -h
netcdf GridOne {
dimensions:
        side = 2 ;
        xysize = 233312401 ;
variables:
        double x_range(side) ;
                x_range:units = "user_x_unit" ;
        double y_range(side) ;
                y_range:units = "user_y_unit" ;
        short z_range(side) ;
                z_range:units = "user_z_unit" ;
        double spacing(side) ;
        short dimension(side) ;
        short z(xysize) ;
                z:scale_factor = 1. ;
                z:add_offset = 0. ;
                z:node_offset = 0 ;

// global attributes:
                :title = "GEBCO One Minute Grid" ;
                :source = "1.02" ;
}

The file is publicly available from: http://www.gebco.net/data_and_products/gridded_bathymetry_data/



_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
